man bindrules (Conventions) - ShapeTools version bind rules

NAME

BindRules - ShapeTools version bind rules

DESCRIPTION

The ShapeTools version binding subsystem (see vbind(1)) provides a mechanism for expressing general version bind rules. These rules describe version properties on an abstract level; during the version bind procedure they are matched against the properties of concrete versions. The goal is to select one or more versions from a named history in order to provide access to these version(s). A version bind operation is always performed for exactly one history at a time. Version bind rules express something like

	Select the most recent saved version.
	If there is no saved version, select the busy version.

ShapeTools, however, needs rules in a more formal notation to be able to interpret them. Let's see how the rule above is translated into the formal notation.

Version bind rules consist of a list of attribute expressions evaluated one after another until one of the expressions leads to a unique version identification. The expressions are separated by semicolons, the last expression ends with a period. The rule from above will now read:

	Select the most recent saved version;
	Select the busy version.

Each attribute expression consists of a list of predicates, separated by commas. The predicates are evaluated from left to right, resulting in a hit set, a set of versions fulfilling all predicates evaluated so far. The initial hit set for an attribute expression contains all versions of the name to be bound. Each predicate potentially narrows the hit set. The predicates in our rule are:

	all saved versions, most recent version;
	busy version.

Remember that each predicate bases its selection on the hit set left by the predicate before. Hence, exchanging the two predicates in the first attribute expression may lead to different results. We will give more information on this topic in the section about the evaluation algorithm below. We now reach the final form of ShapeTools version bind rules. The predicates must be taken from a list of predefined names and be equipped with arguments:

	ge (status, saved), max (stime);
	eq (status, busy).

That's it so far. This is a rule in the form ShapeTools understands. It does, however, illustrate just a small piece of the world of version bind rules. The rest of this manual page gives a detailed description of version bind rules, divided into the following sections:

RULE HEAD
Description of the structure of rule heads.
EVALUATION ALGORITHM
The algorithm by which version bind rules are evaluated.
NAME PATTERNS
Name patterns as first predicate in attribute expressions.
PREDICATES
List of valid predicates.
ATTRIBUTES
A list of predefined attribute names and some words about the ordering relationship between attribute values.
EXPANSION
Description of the various types of expansion such as parameter substitution, attribute and macro expansion, and command substitution.
LEXICAL STRUCTURE
Lexical constraints for names and strings in version bind rules.
TIPS, TRICKS, AND TRAPS
Some common problems.
GRAMMAR
A complete grammar for version bind rules.

RULE HEAD

A version bind rule consists of a rule head and a rule body. The example above shows only the rule body. The rule head defines a name for the rule and optionally a parameter list. The name is a string consisting of any printable non-whitespace characters except colon and parentheses. It is followed by an optional parameter list in parentheses and a colon, delimiting the rule head. Multiple parameters in the list are separated by commas. Examples are

	most_recently_released:

	from_release (release_name):

	last_released (library_path, include_path):

EVALUATION ALGORITHM

The basic idea of the rule evaluation algorithm is that a hit set exists in every state of processing: a set of versions reflecting the current rule evaluation result. The hit set is initialized with all versions of the given name at the beginning of each attribute expression. The predicates of an attribute expression are processed from left to right in the order they occur. Each predicate imposes requirements on the versions in the hit set and eliminates all versions not fulfilling these requirements. So, the hit set becomes smaller and smaller during attribute expression evaluation. The following figure illustrates this process using the rule most_recently_released defined above and the file foo existing as busy version and as versions 1.0 through 1.2.

	Initial hit set:	( foo[busy], foo[1.0], foo[1.1], foo[1.2] )

	Evaluate Predicate:	ge (status, saved),

	New hit set:	( foo[1.0], foo[1.1], foo[1.2] )

	Evaluate Predicate:	max (stime);

	Final hit set:	( foo[1.2] )

When the hit set becomes empty, that is, when no version meets all the predicates evaluated so far, the attribute expression fails and its processing stops immediately. None of the remaining predicates is evaluated, not even predicates without influence on the hit set (for example message predicates). Processing continues with the next attribute expression. If all attribute expressions finish prematurely, the whole version binding fails. In the following example, the first attribute expression fails and the second alternative leads to success.

	Initial hit set:	( bar[busy] )

	Evaluate Predicate:	ge (status, saved),

	New hit set (empty):	( )

	Evaluate next attribute expression
	starting with initial hit set again:	( bar[busy] )

	Evaluate Predicate:	eq (status, busy).

	Final hit set:	( bar[busy] )

When evaluation reaches the end of an attribute expression without the hit set being empty, two cases are possible. First, the hit set contains exactly one version and everything is fine. This is usually the desired state and rule evaluation returns the remaining version as bind result. Second, the hit set may contain more than one version. In this case, the outcome depends on the expected result. When a unique version binding is expected, this is treated as failure and evaluation goes on with the next attribute expression. When a non-unique version binding is acceptable, this is regarded as success and the whole hit set is returned.
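
The evaluation loop described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the ShapeTools implementation: versions are modelled as plain dictionaries, and the names bind, ge_status, eq_status and max_attr are hypothetical helpers invented for this sketch.

```python
from typing import Any, Callable, Dict, List, Optional

Version = Dict[str, Any]
Predicate = Callable[[List[Version]], List[Version]]

# Status values in ascending order, as listed in the ATTRIBUTES section.
STATUS_ORDER = ["busy", "saved", "proposed", "published", "accessed", "frozen"]

def ge_status(minimum: str) -> Predicate:
    """Model of ge (status, minimum)."""
    floor = STATUS_ORDER.index(minimum)
    return lambda hits: [v for v in hits if STATUS_ORDER.index(v["status"]) >= floor]

def eq_status(wanted: str) -> Predicate:
    """Model of eq (status, wanted)."""
    return lambda hits: [v for v in hits if v["status"] == wanted]

def max_attr(name: str) -> Predicate:
    """Model of max (name): versions without a value are eliminated."""
    def apply(hits: List[Version]) -> List[Version]:
        valued = [v for v in hits if v.get(name) is not None]
        if not valued:
            return []
        top = max(v[name] for v in valued)
        return [v for v in valued if v[name] == top]
    return apply

def bind(history: List[Version], rule: List[List[Predicate]],
         unique: bool = True) -> Optional[List[Version]]:
    """Return the hit set of the first succeeding attribute expression, or None."""
    for expression in rule:
        hits = list(history)          # each expression starts with the full history
        for predicate in expression:
            hits = predicate(hits)
            if not hits:
                break                 # expression failed, skip remaining predicates
        if hits and (len(hits) == 1 or not unique):
            return hits
    return None                       # all attribute expressions failed

# ge (status, saved), max (stime); eq (status, busy).
rule = [[ge_status("saved"), max_attr("stime")], [eq_status("busy")]]

foo = [
    {"id": "foo[busy]", "status": "busy", "stime": None},
    {"id": "foo[1.0]", "status": "saved", "stime": 100},
    {"id": "foo[1.1]", "status": "saved", "stime": 200},
    {"id": "foo[1.2]", "status": "saved", "stime": 300},
]
bar = [{"id": "bar[busy]", "status": "busy", "stime": None}]

print([v["id"] for v in bind(foo, rule)])
print([v["id"] for v in bind(bar, rule)])
```

The save dates used here are arbitrary integers; only their order matters for the sketch.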

Extending the hit set during evaluation of an attribute expression is not possible. This would be against the nature of the version bind rules and would make them much more difficult to understand. The hit set can only grow again by letting the current attribute expression fail and beginning a new one, which starts with the maximum hit set.

Failure of an attribute expression need not be caused by an empty hit set. It may also be caused by user interaction or by external constraints. The following rules exemplify user interaction:

	eq (state, busy), confirm (select busy version ?, y);
	ge (state, busy), max (version).

where the user will be asked for confirmation to select the busy version, and external constraints:

	exists (otto, 1.0), eq (state, busy);
	ge (state, busy), max (version).

where selection of the busy version happens only when version 1.0 of otto exists (this example is somewhat silly). Predicates like confirm and exists don't care about the hit set. They provide the possibility to impose external control on the evaluation of version bind rules: an attribute expression may be finished prematurely, and control then switches to the next one.

There is another operator, the cut operator, that forces the whole bind operation to finish (and fail). Typically the cut operator stands at the end of an attribute expression that should never succeed. The following is a typical use: version binding fails if there is an update lock set on the most recent version.

	max (version), hasattr (locker), cut (history is locked !);
	max (version).

The cut operator accepts a string argument that will be written to the standard output.
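
The difference between an ordinary failing attribute expression and a cut can be modelled as follows. This is a minimal sketch under assumptions: the exception BindCut and the helpers max_version, has_locker, cut and bind are hypothetical names, version numbers are modelled as (generation, revision) tuples, and an unlocked version carries an empty locker value.

```python
class BindCut(Exception):
    """Signals that the whole bind operation must fail."""

def max_version(hits):
    """Model of max (version)."""
    if not hits:
        return []
    top = max(v["version"] for v in hits)
    return [v for v in hits if v["version"] == top]

def has_locker(hits):
    """Model of hasattr (locker): a standard attribute needs a non-empty value."""
    return [v for v in hits if v.get("locker")]

def cut(message):
    """Model of cut (message): print the message, then abort the bind."""
    def apply(hits):
        if message:
            print(message)
        raise BindCut(message)
    return apply

def bind(history, rule):
    try:
        for expression in rule:
            hits = list(history)
            for predicate in expression:
                hits = predicate(hits)
                if not hits:
                    break             # ordinary failure: try the next expression
            if len(hits) == 1:
                return hits
    except BindCut:
        return []                     # cut: no further expression is tried
    return []

# max (version), hasattr (locker), cut (history is locked !); max (version).
rule = [[max_version, has_locker, cut("history is locked !")], [max_version]]

unlocked = [{"version": (1, 0), "locker": ""}, {"version": (1, 1), "locker": ""}]
locked = [{"version": (1, 0), "locker": ""},
          {"version": (1, 1), "locker": "user@example.org"}]
```

With the unlocked history the first expression fails at hasattr and the second one selects version 1.1; with the locked history the cut is reached and the returned hit set is empty.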

NAME PATTERNS

Each attribute expression may start with a pattern, against which the name to be bound is matched. Only when the name matches the pattern, the corresponding attribute expression will be evaluated. If not, the attribute expression will be skipped. When the pattern is omitted in the attribute expression (as in our example above), the expression is evaluated for each name.

The patterns are the same as those recognized by sh(1) for filename generation on the command line. Magic cookies are:

*
matching any string, including the empty string,
?
matching any single character,
[c...]
matching any one of the characters enclosed in the square brackets,
[l-r]
matching any character lexically between the left (l) and the right (r) character, inclusive, and
[!c...]
[!l-r]
matching any character not recognized by their counterparts above.

A rule with name patterns for example looks like:

	xyyz.h,	eq (version, 1.3);
	*.c, 	eq (generation, 2), max (revision);
	*.h, 	eq (generation, 3), max (revision).

In this example, version binding for C source files (most recent version from generation 2) is different from version binding for header files (most recent version from generation 3). Additionally, the name xyyz.h will always be bound to version 1.3.
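
As a rough illustration of how name patterns select the attribute expressions to evaluate, the following Python sketch uses fnmatchcase from the standard fnmatch module, which understands the same *, ?, [c...] and [!c...] forms. It is an approximation only: the RULE table and the function applicable are hypothetical, and fnmatchcase lets * match slash characters as well, which may differ from the matcher ShapeTools uses.

```python
from fnmatch import fnmatchcase

# Each entry pairs a name pattern with the text of its predicates.
RULE = [
    ("xyyz.h", "eq (version, 1.3)"),
    ("*.c", "eq (generation, 2), max (revision)"),
    ("*.h", "eq (generation, 3), max (revision)"),
]

def applicable(name, rule=RULE):
    """Return the predicate lists whose pattern matches the name to be bound."""
    return [predicates for pattern, predicates in rule
            if fnmatchcase(name, pattern)]

for target in ("main.c", "xyyz.h", "README"):
    print(target, applicable(target))
```

Note that xyyz.h matches both the first and the third pattern; the first expression is tried first, and the third one only comes into play if the first fails.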

If the name to be bound is given together with an (absolute or relative) pathname, the path is not cut off. The match is always performed lexically for the whole name given. Hence, the name pattern may also contain path names, like

	variant1/*,	eq (alias, var1-1.4);
	variant2/*, 	eq (alias, var2-1.2);
	/usr/sample/include/*.h,	max (revision).

Usually, the version bind subsystem does not check whether different path prefixes in the name pattern and the given name to be bound lead to the same location. The match is done lexically and must fit exactly. An exception is a name pattern given as a network path name as in atnetwork(3). A network pathname consists of the name of the host controlling the device where a version is stored, the canonical pathname to the version, and a version binding (e.g. version number, version alias, or date) either in brackets or separated from the name by an at (@) sign. Examples are

	desaster:/usr/sample/project/foo.c[1.0];
	desaster:/usr/sample/project/variant1/bar.c[var1-1.4];
	desaster:/usr/sample/project/xyyz.h@1.3;
	desaster:/usr/sample/project/bar.c@Fri Jun 18 13:40:58 MET DST 1993.

Network pathnames are mapped to canonical local pathnames before being processed. In this case, the given name to be bound will also be mapped to a canonical local pathname.

The technique using network pathnames is especially useful when storing the result of a successful version selection persistently. This makes the version selection easily reproducible from anywhere in the local area network. shape(1) uses this mechanism when generating its bound configuration threads.

PREDICATES

This is the complete list of valid predicate names and a synopsis of their arguments. The list is divided into several parts, each describing a certain class of predicates.

The first class consists of predicates working independently on each element of the current hit set. They impose certain requirements on the attributes of each version and eliminate those versions not fulfilling the requirements.

eq (attrName,attrValue)
The named attribute must exist in the version's attribute list and it must have exactly the given value. When the corresponding version attribute has multiple values, at least one of the values must match exactly.
hasattr (attrName)
The named attribute must exist in the version's attribute list. When applied to a standard attribute, hasattr requires a value to be associated with the standard attribute. In case of user defined attributes, the attribute value is not regarded.
ne (attrName,attrValue)
The named attribute, when existing in the version's attribute buffer, must not have the given attribute value. When the attribute does not exist, everything is fine. If the attribute has multiple values, it is required that none of the values matches the given attrValue.
{ge,gt,le,lt} (attrName,attrValue)
The named version attribute must have a value that is greater or equal / greater than / less or equal / less than the given attribute value. The named attribute must exist in the version's attribute buffer, otherwise the version is eliminated from the hit set. For attributes with multiple values, only one of the values must meet the required property.
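
The treatment of missing and multi-valued attributes differs between these predicates, which the following Python sketch tries to make explicit. It is a simplified model, not ShapeTools code: an attribute is represented as a list of string values, None stands for an attribute that does not exist, and the functions eq, ne and ge are hypothetical stand-ins for the predicates of the same name.

```python
from typing import List, Optional

Values = Optional[List[str]]   # None models an attribute that does not exist

def eq(values: Values, wanted: str) -> bool:
    """Attribute must exist and at least one value must match exactly."""
    return values is not None and wanted in values

def ne(values: Values, unwanted: str) -> bool:
    """A missing attribute is fine; otherwise no value may match."""
    return values is None or unwanted not in values

def ge(values: Values, bound: str) -> bool:
    """Attribute must exist and one value must be greater than or equal to bound."""
    return values is not None and any(v >= bound for v in values)

hosts = ["alpha", "gamma"]
print(eq(hosts, "gamma"), ne(hosts, "gamma"), ge(hosts, "beta"))
print(eq(None, "gamma"), ne(None, "gamma"), ge(None, "beta"))
```

The comparison in ge uses plain string order here; the ordering rules for standard attributes are described in the ATTRIBUTES section.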

The second class consists of predicates that do not operate on a single version but rather on the complete hit set. They express relations between different versions in the hit set and base their selection on comparison of different versions. Usually, they are used to get a unique version binding by ordering the hit set and selecting one of the extremes.

min (attrName)
Retain the version(s) with the lowest value for the named attribute in the hit set. String values are compared literally, others "naturally" (see the list of known attributes below for an explanation of that). Versions not carrying the named attribute or having no value associated with the attribute name are eliminated from the hit set.
max (attrName)
Retain the version(s) with the highest value for the named attribute in the hit set. String values are compared literally, others "naturally" (see the list of known attributes below for an explanation of that). Versions not carrying the named attribute or having no value associated with the attribute name are eliminated from the hit set.

The next two predicate groups have no direct influence on the hit set. They can invalidate the hit set and cause the rule evaluation to go on with the next attribute expression, but they never modify the hit set. These predicates are activated when the evaluation of the attribute expression reaches them, i.e. when the hit set is not empty.

msg (msgString)
Print the given message to standard output and retain the current hit set.
cut (msgString)
Force the current rule binding to fail and print the given message to standard output. Printing is omitted, when the message string is empty. Rule processing is stopped immediately and the returned hit set is empty.
confirm (msgString,expectedAnswer)
Ask the user for confirmation to go on with the evaluation of the current attribute expression. The given message string is printed to standard output with the expected answer appended in square brackets. After that, user input is read. When the user confirms the expected answer (empty input) or his/her input matches the expected answer, evaluation of the current attribute expression continues. Otherwise, the current hit set is invalidated and processing goes on with the next attribute expression.
bindrule (ruleName)
Abort evaluation of the current attribute expression and switch to another version bind rule. This predicate only makes sense as the last predicate in an attribute expression, as following predicates will never be evaluated. Evaluation of the target rule (ruleName) happens as if the rule had been invoked directly; the initial hit set is not influenced. When the target rule fails, the evaluation algorithm switches back to the source rule and goes on with the next attribute expression.

The last predicate group consists of external constraints. Their task is to influence the evaluation process by examining conditions outside the handled version history. Each of the following predicates has either a positive or a negative result. Positive results have no effect on the hit set and the evaluation process, while negative results invalidate the hit set and cause evaluation to go on with the next attribute expression.

exists (name[binding])
Version binding with the given name (usually another one than the current target name) and the given version binding must lead to at least one version. The result is not required to be unique.
existsnot (name[binding])
Version binding with the given name and rule must fail.
existsuniq (name[binding])
Version binding with the given name and rule must lead to a unique selection.
condexpr (program,expression)
An external program, named in the program argument, is activated to evaluate the given expression. The expression string is written to the standard input of the external program. A zero result code is considered to be positive, all others negative.
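
The contract of condexpr (expression on standard input, result code decides) can be mimicked with Python's subprocess module. This is a sketch under assumptions: the function condexpr below is a hypothetical model, the program is passed as an argument list rather than a command string, and the stand-in evaluator is a small Python one-liner invented for the example, not a program shipped with ShapeTools.

```python
import subprocess
import sys

def condexpr(program, expression):
    """Feed the expression to the program's stdin; exit status 0 means positive."""
    result = subprocess.run(program, input=expression, text=True,
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0

# Hypothetical external evaluator: exits with 0 if the expression is true.
EVALUATOR = [sys.executable, "-c",
             "import sys; sys.exit(0 if eval(sys.stdin.read()) else 1)"]

print(condexpr(EVALUATOR, "1 < 2"))
print(condexpr(EVALUATOR, "2 < 1"))
```

A negative result would invalidate the hit set and move evaluation on to the next attribute expression, as described above.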

OBSOLETE PREDICATE NAMES

There are a number of known predicate names from former versions of the bind rules machinery. They are internally mapped to the new predicate names. These names are obsolete and should not be used any longer.

	Obsolete name	mapped to
	-	cut
	attr	eq
	attrex	hasattr
	attrge	ge
	attrgt	gt
	attrle	le
	attrlt	lt
	attrmax	max
	attrmin	min
	attrnot	ne
	condex	exists
	condnot	existsnot
	conduniq	existsuniq

ATTRIBUTES

All predicates with effect on the contents of the hit set work on version attributes. These attributes are either standard attributes with a defined meaning or user defined attributes. The following is a list of attribute names recognized as standard attributes. All other names are considered to be user defined attributes.

alias
Version alias name (symbolic version identification name).
atime
The date of last access (read or write) to the version's contents.
author
The version author in the form username@domain.
cachekey
A unique key for cached versions built from the creation date, the id of the creating process and a serial number (e.g. 740148430.18469.6).
ctime
The date of the last status change. This date is updated, when an attribute is added or deleted, or an attribute value is changed.
generation
The generation number. The value for this attribute is plain numeric.
host
The name of the host from where the version was accessed. This attribute may have different values at one time, when the version is accessed from different hosts.
locker
The user who has set a lock on the concerned version. This attribute has an empty value, when no lock is active. The attribute value is given in the form username@domain.
ltime
The date of last lock change (set or give up update lock). This attribute has an empty value when there was never a lock set on the version.
mtime
The date of the last modification of the version's contents.
name
The name (without suffix) of the version. For example foo for foo.c.
owner
The version owner in the form username@domain.
revision
The revision number. As for generation, only numeric values are accepted.
size
The size of the version's contents in bytes.
status
The version status. This is one of busy, saved, proposed, published, accessed, or frozen.
stime
The save date. This attribute has an empty value for busy versions.
syspath
The absolute pathname through which the version was accessed. This attribute may have different values at one time, when the version is accessed by different pathnames (e.g. symbolic links).
type
The suffix part of the version's name. For example c for foo.c.
version
The version number in the form generation.revision. A special value is the string busy instead of a version number. As busy versions have no version number, this value is used for identifying the busy version of a history.

Some predicates (like ge or max) require an ordering relationship between attribute values. For user defined attributes, ordering is based on alphabetical (ASCII) comparison of the values. User defined attributes with multiple values are compared by their first values, if these are identical by their second values, and so on. A missing value is considered smaller than any other value. For example

	attr1 = anton		attr2 = berta
	        berta	is smaller than	        anton
	        karl

but

	attr1 = anton		attr2 = anton
	        berta	is bigger than	        berta
	        karl
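
The comparison of multi-valued user defined attributes behaves like lexicographic comparison of lists, which Python provides directly. The sketch below only illustrates the ordering stated above; the function compare_values is a hypothetical helper, and the values are assumed to be plain ASCII strings.

```python
def compare_values(left, right):
    """Return -1, 0 or 1: compare value lists element by element.

    Python compares lists lexicographically, and a list that is a proper
    prefix of another one is the smaller of the two, which matches the rule
    that a missing value is smaller than any other value.
    """
    return (left > right) - (left < right)

attr1 = ["anton", "berta", "karl"]

print(compare_values(attr1, ["berta", "anton"]))   # attr1 is smaller
print(compare_values(attr1, ["anton", "berta"]))   # attr1 is bigger
```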

For some of the standard attributes listed above, special rules are needed.

Version numbers (generation.revision)
are ordered by generation number first and revision number second (e.g. 1.2 is smaller than 2.1). Busy is smaller than any version number.
Alias Names
are ordered by the version numbers (see above) of the identified versions.
Cache keys
are ordered by simple text comparison. This has the effect that the youngest cache key is considered the biggest.
Version states
are ordered in the sequence as listed above. Busy is the lowest and frozen the highest state.
User attributes
The order of user attributes (such as author, locker, and owner) is based on alphabetical comparison of the string username@domain.
Time attributes
Time comparison assumes older dates to be smaller than newer ones.
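
The special ordering of version numbers and version states can be expressed as sort keys. This Python sketch is an illustration only; version_key and status_key are hypothetical names, and the sketch assumes that generation and revision are compared as integers, as suggested by their description as plain numeric values.

```python
STATUS_ORDER = ["busy", "saved", "proposed", "published", "accessed", "frozen"]

def version_key(version):
    """Sort key for generation.revision strings; busy sorts before all numbers."""
    if version == "busy":
        return (-1, -1)
    generation, revision = version.split(".")
    return (int(generation), int(revision))

def status_key(status):
    """Sort key for version states: busy is lowest, frozen is highest."""
    return STATUS_ORDER.index(status)

print(sorted(["2.1", "busy", "1.10", "1.2"], key=version_key))
print(max(["saved", "frozen", "busy"], key=status_key))
```

Under the integer assumption, revision 10 sorts after revision 2, so 1.10 is bigger than 1.2.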

EXPANSION

During evaluation of version bind rules, four different kinds of expansion are possible. These are parameter substitution, attribute expansion, external macro expansion and command substitution. Expansion happens, when a magic pattern is found in the rule text, starting with either a dollar sign ($) or, in case of command substitution, with a backward quote character (`).

Generally, expansion in version bind rules happens only within patterns and within predicate arguments. Bind rule syntax or predicate names cannot be introduced by substituted strings. Expansions outside patterns and predicate arguments are ignored and usually lead to an error message.

Parameter Substitution

A parameter substitution string is usually introduced by the pattern $_ followed by the parameter name (an exception is $+ as shown below). The parameter name is optionally delimited by a dollar sign. This is necessary, when there is no whitespace character following. The parameter name may be any of the names specified in the rule head or one of the following predefined names.

$_rule$
The current rule name.
$_target$ or $+
The current target file name to be bound.
$_parameter$
Any other parameter.

A parameter may have the same name as a citeable attribute (see below). In this case, the parameter citation hides the attribute citation. There is no way to cite the value of an attribute when there is an equally named rule parameter. The reserved names rule, target, and hits are not allowed as parameter names.
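
The following Python sketch models parameter substitution with a regular expression. It rests on assumptions: the function substitute and its arguments are hypothetical, a parameter name is taken to end at the next whitespace character or dollar sign, and citations that name no known parameter are left untouched (they may be attribute citations that are expanded later).

```python
import re

# "$+" or "$_name" with an optional closing "$".
_CITATION = re.compile(r"\$\+|\$_([^\s$]+)\$?")

def substitute(text, rule_name, target, params):
    """Replace $_name$, $_rule$, $_target$ and $+ in a rule text fragment."""
    values = dict(params, rule=rule_name, target=target)

    def replace(match):
        if match.group(0) == "$+":
            return target
        # Unknown names stay as they are.
        return values.get(match.group(1), match.group(0))

    return _CITATION.sub(replace, text)

params = {"release_name": "rel-1.0"}
print(substitute("eq (alias, $_release_name$)", "from_release", "foo.c", params))
print(substitute("msg (binding $+ with $_rule now)", "from_release", "foo.c", params))
```

The second call shows a citation without the closing dollar sign, which works because a whitespace character follows the name.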

Attribute Expansion

An attribute expansion string looks exactly like a parameter substitution string. It is introduced by the pattern $_ followed by the attribute name which is optionally delimited by a dollar sign, when a non-whitespace character follows immediately. Attribute names may be built of any printable characters except '#'. Besides, it makes no sense to cite attributes with an equal sign ('=') in the attribute name, as the Attributed Filesystem (AtFS) doesn't allow this.

The value by which the attribute expansion string will be replaced depends on the current state of processing. This may cause different values to be inserted for the same citation in different processing states. Attribute expansion happens as late as possible: it is done right before the evaluation of the concerned pattern or predicate. With one exception, $_hits$, attribute expansions will only be substituted if the current hit set cardinality is 1.

$_hits$ or $=
The number of versions satisfying the binding conditions expressed so far (the cardinality of the hit set). This value continuously changes during rule evaluation.
$_attribute$
The value of any attribute of a uniquely selected version.

Attribute citations may be overloaded by parameter citations (see above).

External Macro Expansion

External macros are evaluated by an external macro processor. If no such macro processor is available, external macros remain unchanged. They have the form:

$C
where C is any printable non-whitespace character except '+', '=', '_', ':', or '#'
$(macroName) or ${macroName}
Macro names may not contain '#' characters. Other limitations may be imposed by the external macro definition and processing facility.

Command Substitution

A command enclosed in back quotes occurring in a bind rule will be replaced by its output. No modifications are done to the command output, hence it may contain newline characters.

LEXICAL STRUCTURE

There are some characters with special meaning when occurring in version bind rules. These are the syntactical characters colon (:), comma (,), semicolon (;), period (.), and parentheses (( and )), the comment symbol (#), the dollar sign ($) and the back quote (`) introducing expansion strings, quotes (" and '), and the escape symbol (\).

Comments are handled somewhat rigorously. A comment symbol (#) occurring anywhere in the rule name or rule body has effect as long as it is not escaped by a backslash (\) character. Comments range from the comment symbol (inclusive) to the end of the line. Newline characters within comments may also be escaped by a backslash character, continuing the comment on the next line.
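
The comment rules can be illustrated with a small Python scanner. It is a sketch under assumptions, not the ShapeTools lexer: the function strip_comments is hypothetical, the newline that ends a comment is kept in the output, and the backslash that escapes a comment symbol is dropped.

```python
def strip_comments(text):
    """Remove comments from rule text, honouring backslash escapes."""
    out = []
    i = 0
    in_comment = False
    while i < len(text):
        ch = text[i]
        if in_comment:
            if ch == "\\" and text[i + 1:i + 2] == "\n":
                i += 2                # escaped newline: the comment continues
                continue
            if ch == "\n":
                in_comment = False    # an unescaped newline ends the comment
                out.append(ch)
            i += 1
        elif ch == "\\" and text[i + 1:i + 2] == "#":
            out.append("#")           # escaped comment symbol is taken literally
            i += 2
        elif ch == "#":
            in_comment = True         # comment symbol itself belongs to the comment
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

print(repr(strip_comments("max (version). # newest version\n")))
print(repr(strip_comments("eq (alias, rel\\#1);\n")))
```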

Nesting of parentheses and quotes is not supported.

The following is a list of lexical constraints for each part of a version bind rule.

Rule names and rule parameters
Rule names may consist of any printable non-whitespace characters except colon and parentheses. The leftmost colon or opening parenthesis delimits the rule name.

Rule parameter names follow the same lexical rule, but additionally must not contain comma characters, as the comma is the delimiter between parameters.
Patterns
In principle, name patterns may consist of any printable character. Comma and semicolon characters occurring in a name pattern must be escaped by a backslash character. A period occurring in a name pattern need not be escaped as long as it is not the last character (ignoring trailing whitespace) in the rule body. A period as last character is always considered to be the end-of-rule sign. Name patterns may contain macro or parameter citations and command substitutions.
Predicates
Each predicate name must be one of the reserved names listed previously in this manual page. Predicate arguments consist of any printable character including whitespace. Comma, parenthesis or quoting characters must be escaped. Any argument may be quoted by single or double quotes. A quoted argument may extend over several lines.

Predicate arguments may contain macro, attribute or parameter citations introduced by a dollar sign, or command substitutions enclosed in back quotes. When quoted in single quotes, dollar signs and back quotes occurring in a predicate argument lose their special meaning and no citations happen. Double quotes do not hide citations.

TIPS, TRICKS, AND TRAPS

Why doesn't the bind rule select version xyz although I think it should? An important aid in answering this question is the trace option provided by the vbind(1) command. It shows the evolution of the hit set during rule evaluation.

Typing errors in standard attribute names may lead to confusing situations. They cannot be recognized by the evaluation machinery, as any unknown attribute name is considered to be a user defined attribute.

A minus sign (-) as first character in an alternative is considered as part of the pattern and not as (old style) cut operator. Hence

-; (- as pattern)

and

,-; (default pattern followed by cut)

make a big difference. We recommend the use of cut() in any case. The short form (-) is supported only for compatibility with older versions.

GRAMMAR

Terminal symbols are enclosed in single quotes.

bind_rule ::= bind_rule_head ':' [ '-' ] bind_rule_body .

bind_rule_head ::= rule_name | rule_name '(' rule_arg_list ')' .

rule_arg_list ::= arg_name { ',' arg_name }* .

bind_rule_body ::= attr_expression { ';' attr_expression }* '.' .

attr_expression ::= name_pattern { ',' predicate }* | predicate { ',' predicate }* | .

name_pattern ::= { <any printable character or whitespace> }+ .

predicate ::= attr_value_predicate '(' attr_name ',' string ')'
	| attr_name_predicate '(' attr_name ')'
	| bind_rule_predicate '(' rule_name ')'
	| msg_predicate '(' string ')'
	| msg_answer_predicate '(' string ',' string ')'
	| cond_rule_predicate '(' string ',' bind_rule_head ')'
	| cond_expr_predicate '(' string ',' string ')'
	| cut_predicate .

attr_value_predicate ::= eq | ge | gt | le | lt | ne .

attr_name_predicate ::= hasattr | max | min .

bind_rule_predicate ::= bindrule .

msg_predicate ::= cut | msg .

msg_answer_predicate ::= confirm .

cond_rule_predicate ::= exists | existsnot | existsuniq .

cond_expr_predicate ::= condexpr .

cut_predicate ::= '-' .

attr_name ::= arg_name | author | atime | ctime | generation | locker | ltime | mtime | owner | revision | size | state | stime | version .

rule_name ::= { <any printable character except colon and parentheses> }+ .

arg_name ::= { <any printable character except comma, colon and parentheses> }+ .

string ::= { <any printable character or whitespace> } .

FILES

$SHAPETOOLS/BindRules

SEE ALSO

AUTHOR

Andreas.Lampen@cs.tu-berlin.de