NAME
checklink - check the validity of links in an HTML or XHTML document
SYNOPSIS
checklink [ options ] uri ...
DESCRIPTION
This manual page briefly documents the checklink command, a.k.a. the W3C® Link Checker.
checklink is a program that reads an HTML or XHTML document, extracts a list of anchors and links, and checks that no anchor is defined twice and that all the links are dereferenceable, including their fragments. It warns about HTTP redirects, including directory redirects, and can recursively check part of a web site.
The program can be used either as a command line tool or as a CGI script.
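The duplicate-anchor check described above can be illustrated with a short Python sketch. This is not how checklink itself (a Perl program) is implemented; the class and function names are hypothetical, and treating every id attribute plus name on an a element as an anchor is an assumption made for the illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect anchor names and link targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.anchors = Counter()  # anchor name -> number of definitions
        self.links = []           # href values, in document order

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # Assumption: an anchor is any id attribute, or name on an <a> element.
        # id and name with the same value on one element count only once.
        names = {attr.get("id")}
        if tag == "a":
            names.add(attr.get("name"))
            if attr.get("href"):
                self.links.append(attr["href"])
        for name in names:
            if name:
                self.anchors[name] += 1


def collect(html):
    """Parse the markup and return the populated collector."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector


def duplicate_anchors(html):
    """Return the sorted anchor names that are defined more than once."""
    return sorted(n for n, count in collect(html).anchors.items() if count > 1)


if __name__ == "__main__":
    sample = (
        '<h1 id="top">Title</h1>'
        '<a name="top" href="#top">duplicate of the heading anchor</a>'
        '<a id="x" name="x" href="other.html#y">fine</a>'
    )
    print(duplicate_anchors(sample))
    print(collect(sample).links)
```

A real checker would go on to dereference each collected link and confirm that any fragment exists in the target document; that network step is left out here.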
OPTIONS
This program follows the usual GNU command line syntax, with long options starting with two dashes (`--'). A summary of options is included below.
- -?, -h, --help
- Show summary of options.
- -V, --version
- Output version information.
- -s, --summary
- Show result summary only.
- -b, --broken
- Show only the broken links, not the redirects.
- -e, --directory
- Hide directory redirects - e.g. <http://www.w3.org/TR> -> <http://www.w3.org/TR/>.
- -r, --recursive
- Check the documents linked from the first one.
- -D, --depth n
- Check the documents linked from the first one to depth n (implies --recursive).
- -l, --location uri
- Scope of the documents checked in recursive mode. By default, for <http://www.w3.org/TR/html4/Overview.html> for example, it would be <http://www.w3.org/TR/html4/>.
- --exclude-docs regexp
- In recursive mode, do not check links in documents whose URIs match regexp.
- -L, --languages accept-language
- The Accept-Language HTTP header to send. In command line mode, this header is not sent by default. The special value auto causes a value to be detected from the LANG environment variable, and sent if found. In CGI mode, the default is to send the value received from the client as is.
- -q, --quiet
- No output if no errors are found.
- -v, --verbose
- Verbose mode.
- -i, --indicator
- Show progress while parsing.
- -u, --user username
- Specify a username for authentication.
- -p, --password password
- Specify a password for authentication.
- --hide-same-realm
- Hide 401's that are in the same realm as the document checked.
- -S, --sleep secs
- Sleep the specified number of seconds between requests to each server. Defaults to 1 second, which is also the minimum allowed.
- -t, --timeout secs
- Timeout for requests, in seconds.
- -d, --domain domain
- Perl regular expression describing the domain to which the authentication information (if present) will be sent. The default value can be specified in the configuration file. See the Trusted entry in the configuration file description below for more information.
- --masquerade "local remote"
- Masquerade a local directory as a remote URI. For example, the following results in /my/local/dir/ being mapped to http://some/remote/uri/:
    --masquerade "/my/local/dir http://some/remote/uri/"
As of revision 3.6.2.19 of checklink, --masquerade takes a single argument consisting of two URIs, separated by whitespace. One usual way of providing a value with embedded whitespace is to enclose it in quotes.
- -y, --proxy proxy
- Specify an HTTP proxy server.
- -H, --html
- HTML output.
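Two of the options above are easy to get wrong and may be clearer with a sketch in Python. The first function derives the default --location scope from a start URI, matching the single example given for that option; the rule used (drop the last path segment) is an assumption inferred from that example, not taken from checklink's source. The second builds the --masquerade argument so that both parts travel as one argument. All function names are hypothetical.

```python
import shlex
from urllib.parse import urlsplit, urlunsplit


def default_scope(uri):
    """Return the 'directory' of a URI: everything up to the last slash.

    Assumption: this mirrors the documented example only; the real tool
    may treat queries, fragments, or unusual paths differently.
    """
    parts = urlsplit(uri)
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


def masquerade_args(local_dir, remote_uri):
    """Return the option and its value as exactly two argv elements."""
    # The local directory and the remote URI form ONE whitespace-separated value.
    return ["--masquerade", f"{local_dir} {remote_uri}"]


if __name__ == "__main__":
    print(default_scope("http://www.w3.org/TR/html4/Overview.html"))
    argv = ["checklink", *masquerade_args("/my/local/dir", "http://some/remote/uri/")]
    # shlex.join shows how the command would need to be quoted in a POSIX shell.
    print(shlex.join(argv))
```

Passing the list form to a process launcher avoids shell quoting altogether; the quoted string is only needed when the command is typed into a shell. A complete invocation would also need at least one uri argument, as shown in the SYNOPSIS.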
FILES
- /etc/w3c/checklink.conf
- The main configuration file. You can use the W3C_CHECKLINK_CFG environment variable to override the default location.
Trusted specifies a regular expression for matching trusted domains (i.e. domains where HTTP basic authentication, if any, will be sent). The regular expression is matched case-insensitively against host names. The default behavior (that is, when unset) is to send the authentication information only to the host that requests it; usually you don't want to change this. For example, the following configures only the w3.org domain as trusted:
    Trusted = \.w3\.org$
Allow_Private_IPs is a boolean flag indicating whether checking links on non-public IP addresses is allowed. The default is true in command line mode and false when run as a CGI script. For example, to disallow checking non-public IP addresses, regardless of the mode, use:
    Allow_Private_IPs = 0
Markup_Validator_URI and CSS_Validator_URI are URI format strings for the respective validators. The %s in these is replaced with the full, URI-encoded URI of the document being checked, and the result is shown in the link checker results view in the online/CGI version. The defaults are:
    Markup_Validator_URI = http://validator.w3.org/check?uri=%s
    CSS_Validator_URI = http://jigsaw.w3.org/css-validator/validator?uri=%s
Doc_URI and Style_URI are URIs used for linking to the documentation and style sheet from the dynamically generated content of the link checker. The defaults are:
    Doc_URI = http://validator.w3.org/docs/checklink.html
    Style_URI = http://validator.w3.org/docs/linkchecker.css
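To see what the Trusted pattern and the %s placeholder amount to in practice, here is a small Python sketch. It is an illustration under stated assumptions: checklink uses Perl regular expressions (Python's re module behaves the same for this simple pattern), and the exact URI-encoding rules checklink applies are not documented here, so percent-encoding every reserved character is an assumption. The function names are hypothetical.

```python
import re
from urllib.parse import quote

# The example value from the configuration file, matched case-insensitively.
TRUSTED = re.compile(r"\.w3\.org$", re.IGNORECASE)

# Default value of Markup_Validator_URI.
MARKUP_VALIDATOR_URI = "http://validator.w3.org/check?uri=%s"


def is_trusted(host):
    """Return True if the host name matches the Trusted pattern."""
    return TRUSTED.search(host) is not None


def validator_link(template, document_uri):
    """Substitute the URI-encoded document URI for %s in the template."""
    # Assumption: every reserved character is percent-encoded (safe="").
    return template.replace("%s", quote(document_uri, safe=""))


if __name__ == "__main__":
    for host in ("www.w3.org", "VALIDATOR.W3.ORG", "www.w3.org.example.com", "w3.org"):
        print(host, is_trusted(host))
    print(validator_link(MARKUP_VALIDATOR_URI, "http://www.w3.org/TR/html4/"))
```

Note that, as a property of the pattern itself, the leading `\.` means the bare host name w3.org does not match, while any host ending in .w3.org does; the `$` anchor keeps look-alike names such as www.w3.org.example.com from matching.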
ENVIRONMENT
checklink uses the libwww-perl library which has a number of environment variables affecting its behaviour. See SEE ALSO for some pointers.
- W3C_CHECKLINK_CFG
- If set, overrides the path to the configuration file.
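The lookup order implied above (environment variable first, default path otherwise) can be sketched in Python as follows. The function name is hypothetical, and how checklink treats a variable that is set but empty is not stated in this page, so the sketch simply returns whatever value is set.

```python
import os

DEFAULT_CONFIG = "/etc/w3c/checklink.conf"


def config_path(environ=None):
    """Return the configuration file path, honouring W3C_CHECKLINK_CFG."""
    if environ is None:
        environ = os.environ
    # Assumption: any set value, even an empty one, overrides the default.
    return environ.get("W3C_CHECKLINK_CFG", DEFAULT_CONFIG)


if __name__ == "__main__":
    print(config_path())
```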
SEE ALSO
The documentation for this program is available on the web at <http://validator.w3.org/docs/checklink.html>.
LWP, Net::FTP, Net::NNTP, Net::IP, perlre.
AUTHOR
This program was originally written by Hugo Haas <hugo@w3.org>, based on Renaud Bruyeron's checklink.pl. It has been enhanced by Ville Skyttä and many other volunteers since. Use the <www-validator@w3.org> mailing list for feedback, and see <http://validator.w3.org/docs/checklink.html#csb> for more information.
This manual page was written by Frédéric Schütz <schutz@mathgen.ch> for the Debian GNU/Linux system (but may be used by others).
COPYRIGHT
This program is licensed under the W3C® Software License, <http://www.w3.org/Consortium/Legal/copyright-software>.