ftpcopy - create and maintain an FTP mirror.

SYNOPSIS


ftpcopy [options] host[:port] remotedir [localdir]
   or: ftpcopy [options] ftp://host[:port]/remotedir [localdir]

DESCRIPTION


ftpcopy copies an FTP site recursively. Afterwards it deletes all files in the local directory tree that were not found on the remote site.

localdir defaults to `.' (the current working directory) if the --no-delete option is used. localdir is not needed if the --interactive option is used. Otherwise you must provide a localdir argument.

OPTIONS


Connect / login / username / password options:

-u, --user=NAME
Use NAME to log in to the FTP server. The default is `anonymous'. Use an empty name to keep the program from logging in at all.
-p, --pass=PASSWORD
Use PASSWORD as the password to log in to the FTP server. The default is `anonymous@invalid.example'. If an empty password is given, the program will not send a password to the server.
--account=ACCOUNT
Send ACCOUNT as the account name during the login phase. Note: this is _not_ the user name, but the name of what could be called a subaccount, implemented by a few servers. If you don't understand what it means, you will most likely never need this option anyway. If you think you need it, please try the --user option first.
--tries=ARG
Number of tries to connect and log in. The default is 1, meaning that the program will give up after the first error. This option was added in version 0.3.0.
--data-connect-retries=ARG
Number of tries to connect to the data port. The program will try to reach the data port (for retrieval of listings or data) that many times and will give up after that many errors in a row. The default is 5, meaning that the program will give up after the fifth error. This option was added in version 0.6.6. The old behaviour was to give up after the first error.
--login-sleep=ARG
Seconds to sleep after a failed login. More precisely: the program will sleep for this many seconds after an attempt to connect or log in has failed. The default is 5. A 0 is treated as 1, and abuse, especially together with --tries, is likely to annoy the server's administrators. This option was added in version 0.4.5.
-4, --v4
Only use IPv4, even if v6 is available. This option effectively disallows the use of IPv6, except for DNS queries. It was added in version 0.6.0.
-6, --v6
Only use IPv6, even if v4 is available. This option effectively disallows the use of IPv4, except for DNS queries. It was added in version 0.6.0.

Verbosity options:

-l, --loglevel=ARG
Controls the amount of logging done.
  0: nothing except warnings and error messages.
  1: downloads and deletes (this is the default).
  2: links/symlinks created, files we already got.
  3: useless stuff.
--bps
Log transfer rates. This option causes ftpcopy to log byte / kilobyte / megabyte per second information after successful transfers. This option was added in version 0.3.9.
--progress
Report progress to stderr. This will print a report of the download every second: a short form of the file name, the bytes got and expected and the percentage received. This option was added in version 0.6.0.

File selection options:

-m, --max-days=DAYS
Download only files modified in the last DAYS days. Local copies of files that are not downloaded will be kept. The default is not to restrict the age of files.
--max-size=MAXBYTES
Download only files up to MAXBYTES length. Locally existing copies of overlong files will be deleted during the clean-up step. The default is not to restrict the file size. This option was added in version 0.5.1.
-x, --exclude=WILDCARD
Exclude paths matching WILDCARD. If WILDCARD matches the full path of the remote file then the file will not be downloaded. WILDCARD is a shell style wildcard expression, not a regular expression like those of grep. You can repeat this option as often as you want, and you can intermix it with the --include option. If both includes and excludes are used then the last matching one will be honored. The list starts with an implicit '--include *'. If the --tolower option is used together with --exclude or --include then the in/exclude patterns have to be written in lower case. This option was added in version 0.3.0.
-i, --include=WILDCARD
Include paths matching WILDCARD. This is the opposite of the --exclude option. It was added in version 0.3.0.
-X, --in-exclude-file=FILE
Read in/exclude patterns from FILE. If the first character of a line is a '+', the remainder of the line is treated as the argument of an --include option; if it is a '-', the remainder is treated as the argument of an --exclude option. Lines starting with a '#' are ignored. FILE will be read after any --include and --exclude options given on the command line have been read. This option was added in version 0.6.6.
--ignore-size
Ignore file size. Do not compare file sizes when checking whether the remote file has to be downloaded. This option was added in version 0.4.4.
--ignore-time
Ignore modification times. Do not compare file modification times when checking whether the remote file has to be downloaded. This option may be combined with --ignore-size, in which case a file that already exists locally will never be downloaded again, regardless of changes in file size or modification time. In other words: ftpcopy will not download any updates. This option was added in version 0.4.4.
--max-depth=ARG
Descend at most ARG directory levels.
  0 means `do not enter subdirectories at all',
  1 means `enter subdirectories, but not their subdirectories'.
The default is 2^32-1, meaning `enter all'.
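
As an illustration of the file format read by --in-exclude-file, the following sketch translates such a file into the equivalent --include and --exclude options. It only restates the rules given above and is not ftpcopy's own parser. The function name patterns_to_options and the sample patterns are made up for this example, and skipping lines that start with neither '+', '-' nor '#' is an assumption.

```shell
#!/bin/sh
# Sketch only: print the --include/--exclude options that an
# in/exclude file stands for, one option per line.
patterns_to_options() {
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            '#'*) ;;                                      # comment: ignored
            '+'*) printf '%s\n' "--include=${line#+}" ;;  # rest of line is the wildcard
            '-'*) printf '%s\n' "--exclude=${line#-}" ;;
            *) ;;   # assumption: anything else (e.g. a blank line) is skipped
        esac
    done
}

# Hypothetical pattern file: drop all .cdb files except one.
patterns_to_options <<'EOF'
# no .cdb files, with one exception
-*.cdb
+/keep/this.cdb
EOF
```

Because the file is read after the command line options, its patterns come last in the list and therefore take precedence under the last-match rule.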

Deletion options:

-n, --no-delete
Do not delete files. This influences the cleanup step when getting rid of things the server doesn't have anymore. It does not stop ftpcopy from deleting files when it detects something in its way during a download.
-M, --max-deletes=COUNT
Do not delete more than COUNT files. This option may be useful to limit the impact of a temporary loss of files on the server. It only influences the cleanup step and does not stop ftpcopy from deleting files in its way during a download. The default is 0, meaning unlimited. This option was added in version 0.4.5.

Operational options:

-d, --directories-only
Only create the directory hierarchy. Do not download files. Any file in the tree will be deleted unless the -n option is also given. This option will be removed in future versions, unless someone objects.
--dry-run
Don't do anything. ftpcopy will only show what would be done. This option was added in version 0.3.6.
-T, --timeout=SECONDS
Timeout to use for network read/write. The default is 30 seconds and is usually sufficient. This option was added in version 0.3.8.
--rate-limit=BYTES_PER_SECOND
Limit file download speed. Limit the transfer rate of file downloads to about that many bytes per second. The implementation is crude and simple: it sleeps up to one second between network reads, and therefore does not even try to limit the rate exactly to that number. On the other hand it usually works and is unlikely to break things by causing timeouts. The default is unlimited. This option was added in version 0.4.7.
--interactive
Read directories from stdin. This option tells ftpcopy to ignore any directories given on the command line, and to read commands from the standard input. Each command consists of two lines, the first being a directory on the remote server, and the second a local directory. ftpcopy will print an END-OF-COPY line after each operation. This option was added in version 0.3.6 and will be removed in future versions, unless someone objects.

Workaround options:

--ascii-listings
Do directory listings in ASCII mode. Use this option if the FTP server is unable to correctly list directories in binary mode, for example, if you see a message like this (usually on one line): `fatal: received unwanted answer to LIST: 426 Data connection: Illegal seek.' This option was added in version 0.5.2.
-L, --list-options=OPTS
Add OPTS to the LIST command. This allows you to pass arbitrary options to the FTP server's LIST command. Note that ftpcopy does not cope well with recursive directory listings. This option was added in version 0.3.0.
-s, --symlink-hack
Deal with symbolic links. This is only useful to mirror sites which create listings through /bin/ls, and will fail if a file name in a link contains a ` -> ' sequence.
--force-select
Use select, not poll. Do not use the poll() system call even if it's available, but use select() instead. This allows the program to be used together with runsocks from the socks5 reference implementation. Please note that you will need a directly reachable name server anyway, as the DNS library in use does not support SOCKS (you can always use IP addresses). This option was added in version 0.3.8.
--mdtm
Use the MDTM command to get the remote time. The default is to take the times from the directory listings. This doesn't work if the server implements an inferior listing format (most do) and doesn't send time stamps in Coordinated Universal Time (UTC). The damage caused by this is limited to file time stamps being wrong by a few hours. This option makes ftpcopy send an MDTM command for every file it might want to download. The drawback is that this costs performance: ftpcopy usually sends just one command for a complete directory it traverses. With the --mdtm option it has to send an additional command for every file. This option was added in version 0.3.10.
--allow-pasv-ip=IP4
Allow data connections to the address IP4. Normally ftpcopy only accepts data connections to the IP addresses it received as an answer to the DNS request, or the IP address in the URL. Sometimes this is not enough, especially when NAT or masquerading is active. ftpcopy then prints the error message `illegal redirect by FTP server'. With this option, which may be given more than once, you can add addresses to the internal list of allowed data connection targets. IP4 has to be an IPv4 address or a list of IPv4 addresses, separated by commas. The environment variable FTPCOPY_ALLOW_PASV_IP serves the same purpose. Note: Do not use this option without thinking: FTP redirects may be used to launch denial of service attacks against innocent targets. This option was added in version 0.6.1.
--no-resume
Do not try to resume downloads. The REST command, needed to resume a failed download, is badly specified and likely to be misinterpreted and misimplemented. Use this option in case of trouble. This option was added in version 0.6.0.
--tolower
Change all local file names to lowercase. Use this only if you are absolutely sure that the remote side does not contain any files or directories whose lower cased names collide with each other. Otherwise this option will waste bandwidth. If this option is used together with the --exclude or --include options then the in/exclude patterns have to be written in lower case. This option was added in version 0.3.8.

Help options:

--include-exclude-help
How --include and --exclude work.
--examples
Show usage examples.
--see-also
Where to find related information.
--version
Show version: ftpcopy (ftpcopy) 0.6.7.
--help
Show a list of options or the long help on one. With an argument, it shows the long help text of that option; without an argument, it lists all options.
--longhelp
Show longer help texts for all or one option.

EXAMPLES


mirror cr.yp.to:
  ftpcopy \
  --exclude '*.cdb' \
  --exclude '*software/precompiled*' \
  cr.yp.to / /private/file/0/mirror/cr.yp.to
This means:
  * I'm not interested in .cdb files.
  * precompiled stuff is also not downloaded.
  * the host to connect to is cr.yp.to.
  * the remote directory is /, and
  * /private/file/0/mirror/cr.yp.to is the local directory.

IN/EXCLUDE


In- and exclude lists are internally mixed together, keeping the order in which they were given. The list starts with an implicit `include *'. ftpcopy honors the last match.

The wildcard matching is done against the full remote path of the file. The `/' character has no special meaning for the matching and is treated like any other.

Note: you have to include every parent directory of the files or directories you want to include. Something like this will NOT work:

    --exclude '*' --include '/w/h/e/r/e/file.c'
You need to include /w, /w/h and so on.
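
The last-match rule can be tried out with the following sketch. It uses the shell's own case pattern matching as a stand-in for ftpcopy's wildcard matcher, which may differ in details, but it shares the property that `/' is matched by `*' like any other character. The function name decide and its `+WILDCARD' / `-WILDCARD' rule notation are invented for this example. That an excluded directory is never entered is an assumption drawn from the note above.

```shell
#!/bin/sh
# Sketch only: decide TARGET RULE...
# Each RULE is "+WILDCARD" (include) or "-WILDCARD" (exclude), given in
# command line order. Prints "include" or "exclude" for TARGET.
decide() {
    target=$1
    shift
    verdict=include            # the implicit leading `include *'
    for rule in "$@"; do
        pattern=${rule#?}      # strip the leading + or -
        # Unquoted $pattern acts as a shell wildcard here; in a case
        # statement `*' also matches `/'.
        case "$target" in
            $pattern)
                case "$rule" in
                    +*) verdict=include ;;
                    -*) verdict=exclude ;;
                esac
                ;;             # later rules may still override this one
        esac
    done
    printf '%s\n' "$verdict"
}

# The non-working example: the file itself is included, but its
# top-level directory /w is not.
decide /w/h/e/r/e/file.c '-*' '+/w/h/e/r/e/file.c'
decide /w '-*' '+/w/h/e/r/e/file.c'

# With every parent directory included as well, /w is included too.
decide /w '-*' '+/w' '+/w/h' '+/w/h/e' '+/w/h/e/r' '+/w/h/e/r/e' '+/w/h/e/r/e/file.c'
```

The first call reports the file as included and the second reports /w as excluded, which is why the file is never reached. The third call shows /w included once the whole chain of parent directories is listed.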

COPYRIGHT


Copyright (C) 2003 Uwe Ohse.

The software comes with NO WARRANTY, to the extent permitted by law.

This package is published under the terms of the GNU General Public License version 2. Later versions of the GPL may or may not apply, see http://www.ohse.de/uwe/licenses/

AUTHOR


Uwe Ohse, uwe@ohse.de.

MORE INFORMATION


Please report bugs to ftpcopy@lists.ohse.de

SEE ALSO


ftpls(1) lists ftp directories. ftpcp(1) is a frontend for ftpcopy.

The ftpcopy package has a mailing list. Send an empty mail to ftpcopy-subscribe@lists.ohse.de to subscribe to it.

The ftpcopy homepage is at http://www.ohse.de/uwe/ftpcopy.html