NAME

sitemap - make a site map from meta tags in an HTML tree

SYNOPSIS

sitemap [start-dir | config-file]

DESCRIPTION

sitemap indexes all pages under the start directory and writes an HTML map page to standard output. It looks for a description of each page in a META DESCRIPTION header; if it doesn't find one, the page is omitted from the index. That is, each HTML page to be indexed should have a meta tag with its name attribute set to description and its content attribute set to a brief description of the contents. For example,

    <head>
    <title>Sitemap documentation</title>
    <meta name="description" content="Documentation for sitemap program to index HTML pages.">
    </head>
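As a rough illustration of the lookup described above, the following sketch pulls the description out of a page with Python's standard html.parser module. The names DescriptionParser and find_description are hypothetical and are not taken from the sitemap source; the real script may scan pages differently.

```python
from html.parser import HTMLParser


class DescriptionParser(HTMLParser):
    """Collect the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        # Tag and attribute names arrive lowercased; values keep their case.
        if (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")


def find_description(html_text):
    """Return the page description, or None if the page has none."""
    parser = DescriptionParser()
    parser.feed(html_text)
    return parser.description
```

A page for which find_description returns None corresponds to a page that would be left out of the index.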

The output of sitemap is an HTML page that contains a list of descriptions and links to the indexed pages. This output can be configured via an rc file (see below).

ARGUMENTS

If no argument is supplied, the start directory is the directory indicated by the DOCUMENT_ROOT or HOME environment variables, in that order. If neither variable is set on a UNIX system, the effective user's home directory (as indicated in the passwd file) will be used. If a start-dir directory is supplied as an argument, then sitemap will look inside that directory for a .sitemaprc. (The effective start directory can still be overridden with the Startdir directive inside the configuration file.) If the configuration file does not exist, sitemap will run with a set of default parameters, which is usually not what you want.
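The fallback order for the start directory can be sketched as follows. This is only an illustration of the rule stated above, assuming a UNIX system; default_start_dir is a hypothetical name and the environ parameter exists only to make the sketch easy to try out.

```python
import os
import pwd  # available on UNIX systems only


def default_start_dir(environ=None):
    """Pick a start directory when no argument was given."""
    environ = os.environ if environ is None else environ
    # DOCUMENT_ROOT takes precedence over HOME.
    for name in ("DOCUMENT_ROOT", "HOME"):
        value = environ.get(name)
        if value:
            return value
    # Neither variable is set: use the effective user's passwd entry.
    return pwd.getpwuid(os.geteuid()).pw_dir
```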

If a config-file configuration file is specified, then the configuration for sitemap will be read from that file.

CONFIGURATION FILE

sitemap is a Python script. To configure the strings used in the index page header and footer, you can create a configuration file in your home directory called .sitemaprc (or as indicated by the command-line parameter). A skeleton of a configuration file is provided with the program. The file should start with the text [sitemap] on a line by itself. Subsequent lines should be name=value pairs. Lines beginning with the # character are treated as comments and are ignored. The possible field names in the configuration file are listed below:

Hometitle=title
The title of your homepage. The generated site map will contain a link with this text.
Homepage=url
The URL of your homepage. The generated site map will contain a link back to this page.
Indextitle=title
The title for the generated site map page.
Headinfo=any HTML text
Any additional HTML you want to include in the <head> section of the site map. Use with care - only certain tags are legal in the <head> of a page.
Encoding=encoding
The HTML encoding, such as iso-8859-1 or utf-8. If it is not specified, iso-8859-1 is used for all languages but Czech, where iso-8859-2 is used.
Startdir=directory
The root directory of the site to index. If it is not specified, the directory of the .sitemaprc configuration file is used.
Body=attributes
Any additional attributes to be included in the <body> tag.
Prefix=url
An optional URL prefix to put before each pathname (sitemap outputs each filename as a site-relative path beginning with a ``/''). If it is not provided, sitemap tries to compute it as follows. If the environment variable DOCUMENT_ROOT is set, and the start directory is a subdirectory of the document root, the prefix is the relative path from the document root to the start directory. Otherwise, sitemap assumes that the start directory can be accessed with the URL ``/''. (That is, the start directory would be the directory indicated by the web server's DOCUMENT_ROOT.) If this is incorrect (e.g. you are indexing a user's home page whose URL begins with ``/~username'') you can supply the alternative URL prefix here.
Dirtitle=title
The title string to use for directories. Directories are listed and linked in the generated site map page with this text.
Fullname=name
Your full name. This name will be included in one corner of the generated site map page. You may want to list a company name or a copyright statement instead, for example.
Mailaddr=address
E-mail address of a contact person. Since the e-mail address will be linked on the generated site map page, you may want to set this parameter to the e-mail address of a contact person or a webmaster.
Language=language
The language for the boilerplate text included in the output (Czech, English, French, German, Italian, Norwegian, Spanish, or Swedish).
Icondirs=icon path
The path (relative to the start directory or a URL) of the icon for directories. The icon must be 33 pixels wide (or scalable to that size). If omitted, no icon will be displayed next to site map entries for directories.
Icontext=icon path
The path (relative to the start directory or a URL) of the icon for HTML files. The icon must be 33 pixels wide (or scalable to that size). If omitted, no icon will be displayed next to site map entries for HTML pages.
Indexfiles=file1 file2 file3
A space-separated list of files to treat as index or main pages for a directory. Any file with a filename exactly equal to one of the indicated filenames will be treated as an index page. Index pages sort to the top of the list of files in a directory. For example, index.html or default.htm might be good candidates for this parameter.
Exclude=word1 word2
A space-separated list of words to ignore when scanning files and directories. sitemap will skip any file or entire subdirectory that contains any of the words in its path. For example, Test or CVS may be good candidates for this parameter.
Debug=y
Set this parameter to view the computed configuration file name, start directory, document root, and prefix in the generated site map page. You'll need to view the source of the generated HTML file because these values will be listed within an HTML comment. Search for the word Debugging in the generated HTML page.
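To make the file format concrete, here is a small sketch that reads a configuration in the layout described above using Python's standard configparser module. Whether sitemap itself uses configparser is an assumption, and every value in SAMPLE_RC is made up for illustration. load_settings and is_excluded are hypothetical helpers; is_excluded further assumes that Exclude words are matched as case-sensitive substrings of the path.

```python
import configparser

# Hypothetical example configuration; all values are placeholders.
SAMPLE_RC = """\
[sitemap]
# Lines starting with # are comments.
Hometitle=Example home page
Homepage=/index.html
Indextitle=Site map
Language=English
Indexfiles=index.html default.htm
Exclude=Test CVS
"""


def load_settings(text):
    """Parse the [sitemap] section into a plain dictionary."""
    # interpolation=None keeps a literal % in a value from being special.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["sitemap"]
    settings = dict(section)  # configparser lowercases the field names
    # Space-separated fields become lists of words.
    for key in ("indexfiles", "exclude"):
        settings[key] = section.get(key, "").split()
    return settings


def is_excluded(path, words):
    """True if any Exclude word occurs anywhere in the path."""
    return any(word in path for word in words)


settings = load_settings(SAMPLE_RC)
```

With this sample, settings["indexfiles"] holds the two index file names and a path such as docs/CVS/Entries would be skipped.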

USE UNDER CGI

You can use sitemap to generate site maps on the fly. Any command-line argument can be passed as the query string (i.e. a string immediately following the URL of the CGI script and a '?' character).

sitemap deduces that it is running under CGI from the fact that the REMOTE_ADDR environment variable is defined. If so, it outputs a content-type header (text/html) ahead of the HTML page.
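The detection rule can be illustrated with a short sketch. The function names are hypothetical, and the exact header text that sitemap writes is an assumption; only the use of REMOTE_ADDR and the text/html content type come from the description above.

```python
import os


def running_under_cgi(environ=None):
    """Treat the presence of REMOTE_ADDR as the sign of a CGI request."""
    environ = os.environ if environ is None else environ
    return "REMOTE_ADDR" in environ


def cgi_response(page, environ=None):
    """Return the page, preceded by a content-type header under CGI."""
    if running_under_cgi(environ):
        # A blank line separates the CGI header from the document body.
        return "Content-Type: text/html\n\n" + page
    return page
```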

When running as a CGI script, sitemap does not assume that the document root is necessarily identical with the start directory. It inspects the DOCUMENT_ROOT environment variable and constructs a prefix in an attempt to get from the server document root to the start directory. This will fail if the start directory is not a subdirectory of the document root, in which case the Prefix directive in the configuration file should be used.
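The prefix computation can be sketched roughly as below, assuming POSIX-style paths. compute_prefix is a hypothetical name, and the exact form of the result (an empty string when the start directory is served as ``/'', otherwise a path with a leading slash) is an assumption chosen so that the prefix can be placed directly before a site-relative filename.

```python
import os


def compute_prefix(start_dir, document_root=None):
    """Guess the URL prefix that leads from the site root to start_dir."""
    if document_root:
        root = os.path.abspath(document_root)
        start = os.path.abspath(start_dir)
        # Only a start directory strictly below the root yields a prefix.
        if start != root and os.path.commonpath([root, start]) == root:
            relative = os.path.relpath(start, root)
            return "/" + relative.replace(os.sep, "/")
    # No document root, or the start directory lies outside it:
    # assume the start directory itself is served as "/".
    return ""
```

For example, with a document root of /var/www and a start directory of /var/www/docs, a page listed as /page.html would be linked as /docs/page.html. A start directory outside the document root gets no prefix, which is the case where the Prefix directive is needed.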

AUTHORS

Eric S. Raymond <esr@thyrsus.com>.

Immo Huneke <HunekeI@Logica.Com>.

Tom Bryan <tbryan@python.net>.

Modified for Debian by Aaron Isotton <aaron@isotton.com>.