NAME
dirvish.conf - dirvish configuration file.
DESCRIPTION
The dirvish.conf file provides configuration information and default values for dirvish.
The file format is fairly simple.
Each option requires either a single value
or a list of values
and, unless otherwise indicated,
must be specified according to its expected type.
Single-value options are specified by lines of the form
option: value
Options expecting a list must be specified in a multi-line
format as shown here,
where the lines specifying values are indented by any
kind of whitespace, even if only one value is being specified.
option:
    value1
    value2
    .
    .
    .
    valueN
Each value must be provided on its own line.
Any leading and trailing whitespace is discarded.
Options whose names begin with an initial capital (e.g. Foo)
are ignored by dirvish itself but may be used by support utilities.
Blank lines are ignored.
While this simplistic format may allow configuration errors to go unnoticed, it allows arbitrary options to be declared for use by custom support scripts.
A # introduces a comment to the end of the line.
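As an illustration, a small fragment in this format might look like the following sketch. The option names index:, expire-default: and expire-rule: are described later in this page, and Foo is the placeholder name used above. The values are arbitrary and are not recommendations.

```
# Single-value options: one "option: value" line each
index: gzip
expire-default: +5 weeks

# List option: every value is indented, one per line
expire-rule:
    * * * * 1    +3 months

# Initial capital: ignored by dirvish itself, visible to support scripts
Foo: some value
```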
On startup the dirvish utilities will first load a master dirvish.conf file. /etc/dirvish.conf will be tried first but if not present /etc/dirvish/master.conf will be tried.
During installation dirvish may have been configured to expect the system-wide configuration files in some location other than /etc/dirvish.
Multiple configuration files may be loaded by command-line options as well as by the config: and client: configuration parameters. To prevent looping, each configuration file can only be loaded once.
DIRVISH OPTIONS
As on the command line, each option may be specified any number of times. Options that expect lists will accumulate all of their arguments; for single-value options each specification will override the ones before.
Boolean values need to be specified as 1 or 0, or may be specified using SET or UNSET. Some Boolean values are set by default and must be explicitly unset if unwanted.
Each option is marked here with one of (B) for Boolean, (S) single value, (L) list or (O) other.
- Set or unset one or more boolean options.
NOTE: The SET and UNSET directives do not use colons <:>.
- Reset a list option so that it contains no values.
This may be used to start specification of the option.
NOTE: The RESET directive does not use a colon <:>.
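A sketch of how these directives might appear in a configuration file follows. It assumes that each directive keyword is followed by the option name or names separated by whitespace, which the notes above imply but do not spell out. whole-file and expire-rule are options described elsewhere in this page, and the rule shown is arbitrary.

```
# Two ways of turning on the same Boolean option
whole-file: 1
SET whole-file

# Discard the values accumulated from earlier files, then start the list again
RESET expire-rule
expire-rule:
    * * * * 1    +3 months
```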
- Specify paths to directories containing vaults.
A [bank] is a directory containing one or more [vault]s. The system supports multiple [bank]s so that filesystem mount-points can be managed more effectively.
When a [vault] is specified the [bank]s will be searched in list order until the [vault] is found. This way [vault]s can be moved between [bank]s or added without having to update a master index.
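As a sketch, assuming the option is written bank: after the term used above, a server with two backup filesystems might list them as follows. The directory names are placeholders.

```
bank:
    /backup
    /backup2
```

With this list a [vault] named home would be looked for under /backup first and then under /backup2.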
- Specify a [branch] to use.
A [branch] is a sequence of [image]s.
This also specifies a default value for reference:.
- Specify a default [branch] to use.
- Specify a client to back up.
Setting this to the same value as hostname will cause dirvish to do a local copy and stay off the network. This automatically invokes whole-file.
The first time this parameter is set /etc/dirvish/client_name or /etc/dirvish/client_name.conf will be loaded.
- Force the checksum comparison of file contents even when the inode fails to indicate a change has occurred.
- Load configuration file.
Similar to #include, filename or filename.conf will be loaded immediately.
If filename is a relative path it will be looked for in the vault and then the system-wide configuration directories.
- If this is unset, device special files will be excluded from backups.
This may need to be unset when backing up clients whose OS uses device numbers or types unsupported by the server OS, or where the presence of device nodes in the vault presents a security issue.
- Specify filename patterns to exclude.
Patterns are based on shell glob with some enhancements.
- Load a set of patterns from a file.
If filename is a relative path it will be looked for in the vault and then the system-wide configuration directories.
- Specify a time for the [image] to expire.
This does not actually expire anything. What it does do is add an Expire: option to the [image] summary file with the absolute time appended so that dirvish-expire can automate old [image] removal.
- Specify a default expiration time.
This value will only be used if expire is not set and expire-rule doesn't have a match.
- Specify rules for expiration.
Rules are specified in a form similar to crontab or in Time::Period format.
- Specify a name for the [image].
image_name is passed through POSIX::strftime.
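For example, assuming this option is written image: as it is referred to elsewhere in this page, a date-based name could be requested with standard strftime(3) conversions:

```
# %Y = four-digit year, %m = two-digit month, %d = two-digit day of month
image: %Y%m%d
```

For an image time of 31 January 2024 this pattern expands to 20240131.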
- Set the default image_name.
This value will only be used if image: is not set.
- Set the permissions for the [image].
While the [image] is being created the [image] directory permissions will be 0700. After completion it will be changed to octal_mode or 0755.
- Time to use when creating the [image] name.
If an absolute time without a date is provided it will be forced into the past.
If this isn't set the current time will be used.
- Temporary directory name to use for new [image]. This allows you to have [image]s created with the same directory name each run so that automatic processes can access them.
The next time an image is made on the [branch] this option will cause the directory to be renamed to its official name.
- index: none|text|gzip|bzip2 (S)
- Create an index file listing all files in the [image].
The index file will be created using find -ls, so the list will be in the same format as ls -dils, with paths converted to reflect the source location.
If index is set to bzip2 or gzip or a path to one the index file will be compressed accordingly.
This index will be used by dirvish-locate to locate versions of files.
- Create an initial [image].
Turning this on will prevent backups from being incremental.
- log: text|gzip|bzip2 (S)
- Specify format for the image log file.
If log is set to bzip2 or gzip or a path to one the log file will be compressed accordingly.
- Set the permissions for the [image] meta-data files.
If this value is set the permissions of the meta-data files in the [image] will be changed after the [image] is created. Otherwise the active umask will prevail.
SECURITY NOTE: The log, index, and error files contain lists of files. Filenames themselves may be, or may contain, confidential information, so uncontrolled access may constitute a security weakness.
- Use numeric uid/gid values instead of looking up user/group names for setting permissions.
- Specify file containing password for connection to an rsync daemon on backup client.
This is not useful for remote shell passwords.
- Preserve file permissions. If this is unset permissions will not be checked or preserved.
With rsync version 2.5.6, not preserving permissions will break the linking. Only unset this if you are running a later version of rsync.
- Execute shell_command on client or server before or after making backup.
The client commands are run on the client system using the remote shell command (see the rsh: parameter).
The order of execution is pre-server, pre-client, rsync, post-client, post-server. The shell_command will be passed through strftime(3) to allow date strings to be expanded.
Each pre or post shell_command will be run with the environment variables DIRVISH_SERVER, DIRVISH_CLIENT, DIRVISH_SRC, DIRVISH_DEST and DIRVISH_IMAGE set. The current directory will be DIRVISH_SRC on the client and DIRVISH_DEST on the server. If there are any exclude patterns defined, the pre-server shell command will also have the exclude file's path in DIRVISH_EXCLUDE so it may read or modify the exclude list.
STDOUT from each shell_command will be written to the [image] log file.
The exit status of each script will be checked. Non-zero values will be recognised as failure and logged. Failure of the pre-server command will halt all further action. Failure of the pre-client command will prevent the rsync from running and the post-server command, if any, will be run.
Post shell_commands will also have DIRVISH_STATUS set to success, warning, error, or fatal error.
This is useful for multiple things. The client shell_commands can be used to stop and start services so their files can be backed up safely. You might use post-server: to schedule replication or a tape backup of the new [image]. Use your imagination.
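The following sketch shows one way these hooks might be combined. It assumes the options are written pre-client:, post-client: and post-server:, matching the names used above. Every command path is hypothetical and would need to exist on the system where it runs.

```
# Run on the client through the remote shell, before and after rsync
pre-client: /usr/local/sbin/stop-database
post-client: /usr/local/sbin/start-database

# Run on the server last; the script can read DIRVISH_DEST, DIRVISH_IMAGE
# and DIRVISH_STATUS from its environment
post-server: /usr/local/sbin/replicate-image
```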
- Specify an existing [image] or a [branch] from which to create the new [image].
If a branch_name is specified, the last existing [image] from its history file will be used. A [branch] will take precedence over an [image] of the same name.
If this isn't specified the [branch] name will be used as a default value.
- Remote shell utility.
This can be used to specify the location of ssh or rsh and/or to provide additional options for said utility, such as telling ssh to use an alternate port number.
If not specified ssh will be used.
This remote shell command will be used not only as the default rsync transport but also for any pre-client and post-client commands.
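For instance, to reach clients whose ssh daemon listens on a non-standard port, a setting along these lines could be used. The port number and the path to ssh are examples only; -p is the standard ssh(1) option for selecting a port.

```
rsh: /usr/bin/ssh -p 2222
```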
- Path to rsync executable on the server.
- Path to rsync executable on the client.
- Specify additional options for the rsync command.
Only one option per list item is supported.
This allows you to use rsync features that are not directly supported by dirvish. Where dirvish does support an rsync feature it is probably better to use the dirvish-supplied mechanism for setting it.
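A sketch of such a list follows, assuming the option is written rsync-option: (the option name itself is not shown in this section). --timeout and --modify-window are standard rsync(1) options, the values are arbitrary, and each list item carries exactly one option.

```
rsync-option:
    --timeout=600
    --modify-window=1
```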
- Try to handle sparse files efficiently so they take up less space in the [vault].
NOTE: Some filesystem types may have problems seeking over null regions.
- Specify a maximum transfer rate.
This allows you to limit the network bandwidth consumed. The value is specified in approximate Mega-bits per second which correlates to network transport specifications. An adaptive algorithm is used so the actual bandwidth usage may exceed Mbps occasionally.
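As a worked example, using the speed-limit name referred to elsewhere in this page, the following sketch asks for roughly 8 megabits per second. Since there are 8 bits to the byte, that is about 1 megabyte per second of file data, and brief excursions above the figure are to be expected.

```
# 8 megabits per second is approximately 1 megabyte per second
speed-limit: 8
```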
- Have rsync report transfer statistics.
- Specify summary format.
A short summary will only include final used values. A long summary will include all configuration values.
With long format, your custom options in the configuration files will appear in the summary.
The default is short.
- Specify a directory path on the client to backup.
If path is prefixed with a colon the transfer will be done from an rsync daemon on the client otherwise the transfer will be done through a remote shell process.
The optional alias specifies the path that should appear in the index so dirvish-locate will report paths consistent with common usage. This can help reduce confusion when dealing with users unfamiliar with the physical topology of their network-provided files.
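A sketch of the alias in use follows. It assumes the option is written tree: (the name is not shown in this section) and that the alias follows the path after whitespace. Both paths are placeholders.

```
# Back up /export/home on the client; index entries are reported under /home
tree: /export/home /home
```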
- Don't actually do anything.
Process all configuration files, options and tests then produce a summary/configuration file on standard output and exit.
I can't think why you would do this in a configuration file but if you want to shoot yourself in the foot, be my guest.
- Specify the [vault] to store the [image] in.
Although multiple [vault]s may share a filesystem a given [vault] cannot span filesystems. For filesystem purposes the [vault] is the level of atomicity.
This will seldom be specified in a configuration file.
- Transfer whole files instead of just the parts that have changed.
This may be slightly faster for files that have more changed than left the same such as compressed or encrypted files. In most cases this will be slower when transferring over the network but will use less CPU resources. This will be faster if the transfers are not over the network or when the network is faster than the destination disk subsystem.
- Do not cross mount-points when traversing the tree on the client.
- Enable compression on data-transfer.
SCHEDULING OPTIONS
- Location of dirvish executable.
If not set defaults to dirvish.
- How often this backup is allowed to run.
If the time the last [image] of this [branch] was created is more than parsedate_expression old and we are within a time Window it may commence a backup.
- Set a relative load value for a job.
The load that a job places on the server will vary depending on the frequency of file changes, end-to-end network bandwidth (and speed-limit), and the processing speed of the client.
- Set the default value for Load.
This option can only be set in the master configuration file and if left unset will default to 100.
- Set the maximum number of simultaneous jobs permitted.
When set in the master configuration file this applies to the server. When set in the client config file this will limit only the number of simultaneous jobs on that client.
- Set the maximum load permitted.
The total load_units of all jobs running will not exceed this value. If not set no load limiting will be done.
When set in the master configuration file this applies to the server. When set in the client config file this will limit only the load of simultaneous jobs on that client.
- Set a priority value for a job.
Relative priorities will be used in scheduling jobs.
- Specify [branch]es to be scheduled for automated backups.
Each value is specified in the form
vault:branch [image_time]
If image_time is set here it will be used.
This option can only be set in the master configuration file and multiple values will accumulate.
- Specify [branch]es to be scheduled for automated backups.
This option can only be set in the master configuration file and multiple [branch]es will accumulate.
- Time pattern expression for scheduling backups.
The time_patterns will be tested and if any one matches the current time and the last [image] is old enough it may commence a backup.
See EXPIRE RULES for details of time_pattern expressions.
Multiple patterns will accumulate, so if a client or [branch] requires more restrictive windows use RESET.
EXPIRE RULES
The expire rules are a list of rules used to determine an expiration time for an [image].
The last rule that matches will apply, so list order is significant. This allows rules to be set in client, [vault] and [branch] configuration files to override rules set in the master configuration file without having to use RESET. In most cases it is better to use an expire-default: value than to define a rule that matches all possible times.
Each rule has a pattern expression against which the current time is compared, followed by a date specifier in Time::ParseDate format.
A matching rule with an empty/missing date specifier or specifying never will result in no expiration.
The time pattern expression may be in either crontab or in Time::Period format.
The crontab-formatted patterns are converted to Time::Period format, so the limitations and extensions for the specification of option values of Time::Period apply to the crontab format as well. Most notable is that the days of the week are numbered 1-7 for sun-sat, so 0 is not a valid wday but sat is.
Here are two equivalent examples of an expire-rule list.
expire-default: +5 weeks
expire-rule:
    #MIN HR    DOM MON      DOW EXPIRE
    *    *     *   *        1   +3 months
    *    *     1-7 *        su  +1 year
    *    *     1-7 1,4,7,10 1   never
    *    10-20 *   *        *   +10 days
or:
    wd { sun } +3 months
    wd { sun } md { 1-7 } +1 year
    wd { 1 } md { 1-7 } mo { 1,4,7,10 } never
    hr { 10-20 } +10 days
This describes an aggressive retention schedule. If the nightly backup is dated the 1st Sunday of a quarter it is kept forever, the 1st Sunday of any other month is kept for 1 year, all other Sundays are kept for 3 months, and the remaining nightlies are kept for 5 weeks. In addition, if the backup is made between 10AM and 8PM it will expire after 10 days. This would be appropriate for someone with a huge backup server who is so paranoid he makes two backups per day. The other possibility for the hour spec would be for ad-hoc special backups to have a default that differs from the normal dailies.
It should be noted that all expiration rules will do is to cause dirvish to put an Expire: option in the summary file. The dirvish-expire utility will have to be run to actually delete any expired [image]s.
FILES
- /etc/dirvish/master.conf
- alternate master configuration file.
- /etc/dirvish.conf
- master configuration file.
- /etc/dirvish/client[.conf]
- client configuration file.
- bank/vault/dirvish/default[.conf]
- default vault configuration file.
- bank/vault/dirvish/branch[.conf]
- branch configuration file.
- bank/vault/dirvish/branch.hist
- branch history file.
- bank/vault/image/summary
- image creation summary.
- bank/vault/image/log
- image creation log.
- bank/vault/image/tree
- actual image of source directory tree.
- bank/vault/image/rsync_error
- Error output from rsync if errors or warnings were detected.
SEE ALSO
dirvish(8), dirvish-expire(8), dirvish-runall(8), dirvish-locate(8), ssh(1), rsync(1), Time::ParseDate(3), strftime(3)
AUTHOR
Dirvish was created by J.W. Schultz of Pegasystems Technologies.
BUGS
Rsync version 2.5.6 has a bug so that unsetting the perms option will not disable testing for permissions. Disabling perms will break image linking.
Options set in configuration files will override command line options that have been set before the file is read. This behaviour, while consistent, may confuse users. For this reason the more frequently used command line options are paired with a default option so the order of specification will be more forgiving. It is recommended that, where such default options exist, they be preferred over the primary option in configuration files.
It is possible to specify almost any command line option as an option. Some of them just don't make sense to use here.