man domesday.project (Formats) - Files which determine how Domesday should create an index.

NAME

Domesday PROJECT FILES - Files which determine how Domesday should create an index.

DESCRIPTION

The Domesday program uses project files to determine how it should generate indices. Once a project file has been set up, Domesday can run autonomously, perhaps as part of a weekly cron script. There are two ways to set up a project file: using the graphical indexgengui program, or by editing project file templates directly. The GUI is highly recommended, as it includes far more documentation than the template files.

This document describes the project settings files.

FILE FORMAT

Project settings files are simple text files as created by your favourite text editor (vim, emacs or Windows Notepad). They can have any name you choose, but this file name must be supplied to the Domesday program when it is run.

Settings are all key-value pairs, and the values vary widely in length. In some cases, the value of a setting has to be one of a number of predefined values; these are listed below. Shorter settings can be written on a single line in the form

setting_name = setting_value

Longer setting values, and values requiring multiple lines, may be written in the following form:

setting_name)

value line 1

value line 2

...

;;

Domesday will accept either form (long or short) for every setting, so the choice of format is left entirely to the user.

Comments

The settings file includes comments, used in the template to give a short reminder of what each setting does. Comments begin with the % character and continue to the end of the line.
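As an illustration of the format described above, the following Python sketch reads both the short and the long form and strips comments. It is not Domesday's own parser: the function name parse_project is hypothetical, the sample values are invented, and details the page does not specify (such as whether a % inside a long value starts a comment) are assumptions marked in the code.

```python
def parse_project(text):
    """Parse project-file text into a dict mapping setting name to value."""
    settings = {}
    long_name = None   # name of the long-form setting being read, if any
    long_lines = []
    for raw in text.splitlines():
        # Assumption: '%' always starts a comment, even inside a long value.
        line = raw.split("%", 1)[0].rstrip()
        if long_name is not None:
            if line.strip() == ";;":
                # ';;' closes the long form; keep the lines in between.
                settings[long_name] = "\n".join(long_lines).strip("\n")
                long_name, long_lines = None, []
            else:
                long_lines.append(line)
        elif not line.strip():
            continue  # blank line or comment-only line
        elif line.strip().endswith(")") and "=" not in line:
            # Long form opener: "setting_name)"
            long_name = line.strip()[:-1].strip()
        elif "=" in line:
            # Short form: "setting_name = setting_value"
            name, value = line.split("=", 1)
            settings[name.strip()] = value.strip()
        else:
            raise ValueError(f"unrecognised line: {raw!r}")
    if long_name is not None:
        raise ValueError(f"setting {long_name!r} is missing its closing ';;'")
    return settings


# Hypothetical sample; the values are for illustration only.
sample = """\
% a comment line
locale = en
smStartTxt)
<html>
<body>
;;
"""
print(parse_project(sample))
```

Since either form is accepted for every setting, the sketch returns the same kind of entry for both, so later code does not need to know which form was used.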

PLACEHOLDERS

The following section explains what settings must appear in the project file. Many of the settings hold code which is copied directly to the output file. To give you as much control as possible over the final index, these may be in any format. Details extracted by Index Generator are inserted into your text in place of placeholders, which take the following form:



<IG Field="placeholder name" attr1="attrib 1" attr2="attrib 2" />

The placeholder name determines what data is inserted in place of this tag. It has to be one of the predefined values, although the search is case insensitive, so upper and lower case need not match exactly. If a placeholder name is not recognised by Index Generator (e.g. if it was mistyped), an error is emitted on the console, but index generation continues, with the placeholder replaced by an HTML comment explaining that it was not understood.

The key/value attribute pairs in the placeholder are entirely optional and depend on the placeholder being used. Typically, they give additional formatting information. A number of placeholders are valid only in a particular place in the index; these are detailed in the section for the relevant setting. Many placeholders, however, are valid at any point in the settings file, and others are valid in a number of places. These are detailed below.
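The substitution behaviour described above can be sketched in Python as follows. This is an assumption-laden illustration rather than the program's implementation: the function expand, the values dictionary and the wording of the generated comment are all hypothetical, and the sketch does not print the console error the page mentions.

```python
import re

# Matches <IG Field="name" attr="value" ... />, ignoring case.
TAG = re.compile(r'<IG\s+((?:[\w-]+\s*=\s*"[^"]*"\s*)+)/>', re.IGNORECASE)
ATTR = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def expand(text, values):
    """Replace each <IG .../> tag in text.

    values maps lower-case placeholder names either to a plain value or to a
    function that receives the remaining attributes as a dict.
    """
    def replace(match):
        attrs = {k.lower(): v for k, v in ATTR.findall(match.group(1))}
        name = attrs.pop("field", "").strip()
        handler = values.get(name.lower())  # case-insensitive lookup
        if handler is None:
            # Unknown names become an HTML comment; this wording is made up.
            return f"<!-- unrecognised placeholder: {name} -->"
        return handler(attrs) if callable(handler) else str(handler)

    return TAG.sub(replace, text)


# Hypothetical data standing in for details extracted from the indexed files.
values = {
    "filecount": 42,
    "keywords": lambda a: a.get("separator", ", ").join(["maps", "history"]),
}
print(expand('<p><IG Field="FileCount" /> files, <ig field="Oops" /></p>', values))
```

Passing the attributes to a function keeps the optional formatting information (such as a separator) next to the placeholder that understands it.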

GenDate placeholder

This adds the date or time when the index was generated. This is useful so that visitors can see whether the index is likely to be correct and it also gives an idea of how much care goes into the website. If you have Domesday run automatically at regular intervals, e.g. weekly, this will probably impress visitors.

With no attributes, this prints a system standard date. Alternatively, you can specify a format attribute with a string determining exactly how the date should be displayed. The attribute takes the form of a simple string with fields for each part of the date to be printed. The possible fields are:



Field                   Full Form             Short Form
Year                    yyyy (4 digits)       yy (2 digits)
Month                   MMMM (name)           MM (2 digits), M (1 or 2 digits)
Day of Week             EEEE                  EE
Day of Month            dd (2 digits)         d (1 or 2 digits)
Hour (1-12)             hh                    h
Hour (1-24)             kk                    k
Minute                  mm
Second                  ss
Millisecond             SSS
AM/PM                   a
Time Zone               zzzz                  zz
Day of Week in Month    F (e.g. 3rd Thursday)
Day in Year             DDD                   D (1, 2 or 3 digits)
Week in Year            ww
Era                     G (BC/AD)
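To make the format string concrete, here is a rough Python sketch that interprets a subset of the fields in the table above. It is an approximation for illustration only: the function format_gendate is hypothetical, the treatment of midnight as hour 24 follows from the stated 1-24 range, and any letters not listed in FIELDS are copied through unchanged, which may differ from what Domesday does. A format such as the one in the example would be given in the placeholder's format attribute.

```python
import re
from datetime import datetime

# Subset of the documented fields; each maps to a function of a datetime.
FIELDS = {
    "yyyy": lambda d: f"{d.year:04d}",
    "yy": lambda d: f"{d.year % 100:02d}",
    "MMMM": lambda d: d.strftime("%B"),   # month name in the current locale
    "MM": lambda d: f"{d.month:02d}",
    "M": lambda d: str(d.month),
    "EEEE": lambda d: d.strftime("%A"),   # weekday name in the current locale
    "dd": lambda d: f"{d.day:02d}",
    "d": lambda d: str(d.day),
    "hh": lambda d: f"{(d.hour - 1) % 12 + 1:02d}",
    "h": lambda d: str((d.hour - 1) % 12 + 1),
    "kk": lambda d: f"{(d.hour or 24):02d}",  # midnight is shown as 24
    "k": lambda d: str(d.hour or 24),
    "mm": lambda d: f"{d.minute:02d}",
    "ss": lambda d: f"{d.second:02d}",
    "a": lambda d: "AM" if d.hour < 12 else "PM",
}


def format_gendate(fmt, when):
    """Expand each run of identical letters in fmt that names a known field."""
    def repl(match):
        field = FIELDS.get(match.group(0))
        return field(when) if field else match.group(0)

    return re.sub(r"([A-Za-z])\1*", repl, fmt)


print(format_gendate("dd MMMM yyyy, kk:mm", datetime(2004, 3, 7, 9, 5)))
# prints "07 March 2004, 09:05" in the default (C) locale
```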

IndexGenURL placeholder

This is replaced by the URL to the program's website. Please consider including a link to the site so that other people may find out about the program. This placeholder takes no options.

IndexGenVersion placeholder

This is replaced by the version string of the program used to generate the index. It takes no options.

OutputName placeholder

This is replaced by the name of the output file, not including the path.

OutputPath placeholder

This is replaced with the path of the output file on the local filesystem.

FileCount placeholder

This is replaced by the number of files included in the index.

FILE PLACEHOLDERS

The following are valid in settings which deal specifically with a single file being indexed, e.g. smLinkTxt, olLinkTxt. Note that they may not be present in some file types. In files with badly defined fields, the program will try to replace the placeholders with something useful. For example, if an html file doesn't contain a meta description field, text from the first paragraph will be used as the description.
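The description fallback mentioned above might look roughly like the following Python sketch, which prefers a meta description and otherwise takes the text of the first paragraph. The class and function names are hypothetical, and the real parser may choose its fallback text differently (for example, this sketch ignores paragraphs that are never closed).

```python
from html.parser import HTMLParser


class DescriptionFinder(HTMLParser):
    """Collects the meta description and the text of the first <p> element."""

    def __init__(self):
        super().__init__()
        self.meta = None
        self.first_para = None
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta = attrs.get("content")
        elif tag == "p" and self.first_para is None:
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._in_p = False
            # Collapse runs of whitespace so the text fits on one line.
            self.first_para = " ".join("".join(self._buf).split())

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)


def description(html):
    """Return the meta description, else the first paragraph, else ''."""
    finder = DescriptionFinder()
    finder.feed(html)
    finder.close()
    return finder.meta or finder.first_para or ""


print(description("<html><body><p>First <b>para</b>.</p><p>Second</p></body></html>"))
```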

Author placeholder

Title placeholder

Description placeholder

Keywords placeholder

This has the option separator, which determines what string is placed between keywords. The default is currently `, ' (a comma followed by a space), although that is not guaranteed to stay the same.

FileSize placeholder

This is replaced with the size of the file, in Bytes by default. There is a single option for this, format, which determines how the field should be formatted. It takes the following values:

bytes - Prints the size of the file in Bytes

kilobytes - Prints the size in Kilobytes (1024 Bytes)

megabytes - Prints the size in Megabytes (1024 KB)

HumanReadable - Prints the size of the file in a form which is easily readable, including a suffix (e.g. 978 B, 12.9 MB).
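The four format values can be illustrated with a short Python sketch. The function format_size is hypothetical, and several details are assumptions not stated on this page: that option values are matched case-insensitively, that kilobytes and megabytes print one decimal place without a suffix, and that HumanReadable stops at GB.

```python
def format_size(size, fmt="bytes"):
    """Format a size given in bytes according to the named format value."""
    fmt = fmt.lower()  # assumption: option values are case insensitive
    if fmt == "bytes":
        return str(size)
    if fmt == "kilobytes":
        return f"{size / 1024:.1f}"        # 1 Kilobyte = 1024 Bytes
    if fmt == "megabytes":
        return f"{size / 1024 ** 2:.1f}"   # 1 Megabyte = 1024 KB
    if fmt == "humanreadable":
        value = float(size)
        for suffix in ("B", "KB", "MB", "GB"):
            if value < 1024 or suffix == "GB":
                break
            value /= 1024
        # Whole bytes are shown without a decimal place, as in "978 B".
        return f"{size} B" if suffix == "B" else f"{value:.1f} {suffix}"
    raise ValueError(f"unknown format value: {fmt!r}")


print(format_size(978, "HumanReadable"))       # prints "978 B"
print(format_size(13526630, "HumanReadable"))  # prints "12.9 MB"
```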

FileType placeholder

Currently, this is replaced by a string representing the type of file (e.g. html). In the future, this will probably be changed to include more information about the file, e.g. baseType html, version 4.0.1 transitional, charset... If you particularly want this feature, please get in contact with us.

Parser placeholder

This is replaced with a string description of the Domesday parser which was used to extract the details from the file. It is probably only useful for debugging.

FileName placeholder

This is replaced by the name of the file, not including the path. In the case of URLs which don't include a file name, it will be the last part of the path.
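One way to picture this rule is the Python sketch below, which takes the last path component of a URL or local path. The function file_name and the example URLs are hypothetical; how Domesday handles a URL with no path at all is not documented here, and the sketch simply returns an empty string in that case.

```python
import posixpath
from urllib.parse import urlsplit


def file_name(location):
    """Return the last component of a URL or path, ignoring a trailing '/'."""
    path = urlsplit(location).path
    return posixpath.basename(path.rstrip("/"))


print(file_name("http://example.com/docs/guide.html"))  # prints "guide.html"
print(file_name("http://example.com/docs/"))            # prints "docs"
```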

RealLocation placeholder

This is replaced by the location of the file as Domesday found it.

RelLocation placeholder

This is replaced by the location of the file relative to the location of the output file. If that cannot be determined, the absolute location will be given instead.
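For local files, the relative location with its absolute fallback can be sketched in Python as follows. The function rel_location and the example paths are hypothetical, and the sketch covers only filesystem paths, not files fetched over HTTP.

```python
import os.path


def rel_location(file_path, output_path):
    """Return file_path relative to the directory holding the output file."""
    out_dir = os.path.dirname(os.path.abspath(output_path))
    target = os.path.abspath(file_path)
    try:
        return os.path.relpath(target, start=out_dir)
    except ValueError:
        # No relative path exists (e.g. different drives on Windows),
        # so fall back to the absolute location.
        return target


print(rel_location("/site/docs/a.html", "/site/index.html"))
```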

PROJECT FILE SETTINGS

locale

The first setting in the file should be for the Locale. This determines what language the setting file has been created in. The names of the settings are different for each Locale. If you do not wish to create a settings file in the English locale, please look at the appropriate Domesday.project.language manual page.

settingsVersion

This is the version of the settings being used. When we change the settings file format, we will increase this version and also document the changes between the various versions so that you can easily upgrade your files.

INDEX FILE SETTINGS

These determine how to find files to include in the index and also where the output should be placed.

getMethod

todo

fileSystemSearch

todo

scanRoot

todo

httpHeaders

todo

scanIncludeFilters

todo

scanExcludeFilters

todo

indexIncludeFilters

todo

indexExcludeFilters

todo

outputFileName

todo

PROJECT DETAILS

These settings ... todo

indexType

todo

filesToCopy

todo

fileCopyTarget

todo

SITEMAP SETTINGS

These settings apply only if the sitemap index type was chosen. If it was not, this section can be ignored.

smStartTxt

todo

Sitemap Folder Settings

The following settings are used to create the hierarchical format for the site map. They are allowed to use the additional placeholders FullRelFolder and TopFolder todo

smEnterFolder

todo

smLeaveFolder

todo

smLinksTxt

todo

smEndTxt

todo

ORDERED LIST SETTINGS

todo

olSortKey

todo

olArticles

todo

olStartTxt

todo

olNavStart

todo

olNavLinks

todo

olNavEnd

todo

olSecStart

todo

olSecEnd

todo

olLink

todo

olEndTxt

todo

GUI SETTINGS

These settings are used by the GUI, and so are probably not useful when editing the files manually.

projectName

todo

created

todo

loadAsTheme

todo

fileCount

todo

genCount

todo

VIM SYNTAX FILE

Users of the Vim text editor might like to have syntax highlighting while editing project files. To do this, they should copy the included indexgen.vim syntax file to ~/.vim/syntax/. If the settings file template is used, this will be loaded automatically. If not, then type the vim command :se syntax=indexgen.

EXAMPLES

The GUI allows users to select template files to determine the initial look of the index before going on to customise them. On most systems, these can be found in /usr/share/Domesday/templates/ todo

SEE

AUTHOR

This manual page was written by Mark Howard <mh@tildemh.com>