man zoem (Commands) - macro processor for the Zoem macro/programming language.

NAME

zoem - macro processor for the Zoem macro/programming language.

SYNOPSIS

zoem [-i <file name>[.azm] (entry file name)] [-I <file name> (entry file name)] [-o <file name> (output file name)] [-d <device> (set device key)]

zoem

(enter interactive mode - happens when none of -i, -I, -o is given)

zoem -i <file name>[.azm] (entry file name) -I <file name> (entry file name) [-o <file name> (output file name)] [-d <device> (set device key)] [-x (enter interactive mode on error)] [-s <key>=<val> (set key to val)] [-e <any> (evaluate any, exit)] [-E <any> (evaluate any, proceed)] [-chunk <num> (process chunks of size num)] [--trace (trace mode, default)] [--trace-all-long (long trace mode)] [--trace-all-short (short trace mode)] [--trace-regex (trace regexes)] [--show-tracebits (show trace bits)] [-trace k (trace mode, explicit)] [--stats (show symbol table stats after run)] [--split (assume \writeto usage, set \__split__)] [--stress-write (make \write#3 recover)] [--unsafe (prompt for \system#3)] [--unsafe-silent (simply allow \system#3)] [-allow cmd1[:cmdx]+ (allowable commands)] [--system-honor (require \system#3 to succeed)] [-nuser k (user dict stack size)] [-ndollar k (dollar dict stack size)] [-nsegment k (maximum simple nesting depth)] [-nstack k (maximum eval nesting depth)] [-buser (initial user dict capacity)] [-bzoem (initial zoem dict capacity)] [-tl k (tab length)] [-l <str> (list items)] [-h (show options)] [--apropos (show options)]

DESCRIPTION

Zoem is a macro/programming language. It is fully described in the Zoem User Manual (zum.html), currently available in HTML only. This manual page documents the zoem processor, not the zoem language.

If the input file is specified using the -i option and is a regular file (i.e. not STDIN - which is specified by using a single hyphen), it must have the extension .azm. This extension can but need not be specified. The zoem key \__fnbase__ will be set to the file base name stripped of the .azm extension and any leading path components. If the input file is specified using the -I option, no extension is assumed, and \__fnbase__ is set to the file base name, period. The file base name is the file name with any leading path components stripped away.

If neither -i nor -o is specified, zoem enters interactive mode. Zoem should fully recover from any error it encounters in the input. If you find an exception to this rule, consider filing a bug report. In interactive mode, zoem starts interpreting once it encounters a line containing a single dot. Zoem's input behaviour can be modified by setting the key \__parmode__. See the section SESSION MACROS for the details. In interactive mode, zoem does not preprocess the interactive input, implying that it does not accept inline files and it does not recognize comments. Both types of sequence will generate syntax errors.

From within the entry file and included files it is possible to open and write to arbitrary files using the \write#3 primitive. Arbitrary files can be read in various modes using the \dofile#2 macro (providing four different modes with respect to file existence and output), \finsert#1, and \zinsert#1. Zoem will write the default output to a single file, the name of which is either specified by the -o option, or constructed as described below. Zoem can split the default output among multiple files. This is governed from within the input files by issuing \writeto#1 calls. Refer to the --split option and the Zoem User Manual.

If neither -i nor -o is given, zoem will enter interactive mode. In this mode, zoem by default interprets chunks of text that are ended by a single dot on a line of its own. This can be useful for testing or debugging. In interactive mode, zoem should recover from any failure it encounters. Interactive mode can also be accessed from within a file by issuing \zinsert{stdia}, and it can be triggered as the mode to enter should an error occur (by adding the -x option to the command line).
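The default interactive read behaviour described above can be illustrated with a small Python sketch. It is not zoem's implementation; the function name interactive_chunks is hypothetical, and the handling of text that follows the last dot line (here: it is never submitted) is an assumption.

```python
def interactive_chunks(lines):
    """Group input lines into chunks, each ended by a line holding a single dot.

    Assumption: text after the last dot line stays pending and is not returned.
    """
    chunk = []
    for line in lines:
        if line.rstrip("\n") == ".":
            # The dot line itself is the trigger and is not part of the chunk.
            yield "".join(chunk)
            chunk = []
        else:
            chunk.append(line)


if __name__ == "__main__":
    typed = ["first line\n", "second line\n", ".\n", "third line\n", ".\n"]
    for number, text in enumerate(interactive_chunks(typed), start=1):
        print(f"chunk {number}: {text!r}")
```

Each chunk is handed over as a whole, which matches the description that nothing is interpreted until the dot line is seen.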

If -o is given and -i is not, zoem reads input from STDIN.

If -i is given and -o is not, zoem will construct an output file name as follows. If the -d option was used with argument <dev>, zoem will write to the file which results from expanding \__fnbase__.<dev>. Otherwise, zoem writes to (the expansion of) \__fnbase__.ozm.
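The naming rules for \__fnbase__ and the default output file can be sketched in Python as follows. This is only an illustration of the rules stated in this section; the function names fnbase and default_output_name are hypothetical, and POSIX-style path separators are assumed.

```python
import posixpath


def fnbase(entry, azm_entry=True):
    """Base name of the entry file.

    azm_entry=True models -i (the .azm suffix is stripped if present);
    azm_entry=False models -I (only leading path components are stripped).
    """
    base = posixpath.basename(entry)
    if azm_entry and base.endswith(".azm"):
        base = base[: -len(".azm")]
    return base


def default_output_name(entry, device=None):
    """Output name used when -i is given and -o is not."""
    extension = device if device else "ozm"
    return f"{fnbase(entry)}.{extension}"


if __name__ == "__main__":
    print(default_output_name("doc/manual.azm", device="html"))
    print(default_output_name("manual"))
    print(fnbase("doc/notes.txt", azm_entry=False))
```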

For -i and -o, the argument - is interpreted as respectively stdin and stdout.

OPTIONS

-i <file name>[.azm] (entry file name)

Specify the entry file name. The file must have the .azm extension, but it need not be specified.

-I <file name> (entry file name)

Specify the entry file name, without restrictions on the file name.

-o <file name> (output file name)

Specify the output file name.

-d <device> (set device key)

Set the key \__device__ to <device>.

-x (enter interactive mode on error)

The afterlife option. If zoem encounters an error during regular processing, it will emit error messages as usual, and then enter interactive mode. This allows you e.g. to inspect the values of keys used or defined within the problematic area.

-s <key>=<val> (set key to val)

Set the key \key to val. Any type of key can be set, including keys taking arguments and keys surrounded in quotes. Beware of the shell's quote and backslash interpolation.

-e <any> (evaluate any, exit)

This causes zoem to evaluate <any>, write any result text to stdout, and exit.

-E <any> (evaluate any, proceed)

This causes zoem to evaluate <any>, write any result text to stdout, and proceed e.g. with the entry file or an interactive session.

-chunk <num> (process chunks of size num)

Zoem reads its input in chunks. It fully processes a chunk before moving on with the next one. The -chunk <num> option defines the (minimum) size of the chunks. The size or count of the chunks does not at all affect zoem's output.

Zoem will read files in their entirety before further processing if -chunk 0 is specified.

Zoem does not chunk input files arbitrarily. It will append to a chunk until it is in the outermost scope (not contained within any block) and the chunk will end with a line that was fully read.

Consequently, if e.g. a file contains a block (delimited by balanced curlies) spanning the entire file then zoem is forced to read it in its entirety.
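The chunking rule can be pictured with the following Python sketch. It is a simplified model, not zoem's reader: the names brace_delta and split_chunks are hypothetical, the chunk size is assumed to be counted in characters, and scope tracking is reduced to counting curlies while skipping any character that follows a backslash.

```python
def brace_delta(line):
    """Net change in curly nesting depth for one line.

    Simplification: a backslash hides the next character from the count.
    """
    delta, i = 0, 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "{":
            delta += 1
        elif ch == "}":
            delta -= 1
        i += 1
    return delta


def split_chunks(lines, min_size):
    """Split lines into chunks of at least min_size characters.

    A chunk is only closed at the end of a line and at nesting depth zero.
    min_size == 0 models '-chunk 0': the whole input is one chunk.
    """
    if min_size == 0:
        return ["".join(lines)] if lines else []
    chunks, current, size, depth = [], [], 0, 0
    for line in lines:
        current.append(line)
        size += len(line)
        depth += brace_delta(line)
        if size >= min_size and depth == 0:
            chunks.append("".join(current))
            current, size = [], 0
    if current:
        chunks.append("".join(current))
    return chunks


if __name__ == "__main__":
    text = ["a\n", "\\f{\n", "x\n", "}\n", "b\n"]
    for chunk in split_chunks(text, 1):
        print(repr(chunk))
```

In the sample input the block opened on the second line keeps the chunk open until the closing curly, so the three middle lines end up in a single chunk regardless of the requested size.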

--trace (trace mode, default)

Trace in default mode.

--trace-all-long (long trace mode)

Turns on most trace options in long mode. Trace options xxx that are not set have their own --trace-xxx entry (see below).

--trace-all-short (short trace mode)

Turns on most trace options in short mode. Trace options xxx that are not set have their own --trace-xxx entry (see below).

--trace-keys (trace keys)

Trace keys.

--trace-regex (trace regexes)

Trace regexes (i.e. the \inspect#4 primitive).

-trace k (trace mode, explicit)

Set trace options by adding their representing bits.

--stress-write (make \write#3 recover)

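This makes \write#3 recover from errors. It is a special purpose option used for creating zoem stress test suites, such as stress.azm in the zoem distribution /examples subdirectory.

--unsafe (prompt for \system#3)
--unsafe-silent (simply allow \system#3)
-allow cmd1[:cmdx]+ (allowable commands)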

With --unsafe system calls are allowed but the user is prompted for each invocation. The command and its arguments (if any) are shown, but the STDIN information (if any) is withheld. With --unsafe-silent system calls are allowed and the user is never prompted.

Use -allow str or --allow=str to specify a list of allowable commands, as a string in which the commands are separated by colons.

--system-honor (require \system#3 to succeed)

With this option any \system#3 failure (for whatever reason, including safe behaviour) is regarded as a zoem failure. By default, failing system calls are ignored under either safe mode, unsafe mode (--unsafe), or silent unsafe mode (--unsafe-silent).

--split (assume \writeto usage, set \__split__)

This assumes zoem input that allows output to multiple files (e.g. chapters). It sets the default output stream to stdout (anticipating custom output redirection with \writeto#1) and sets the session macro \__split__ to 1.

--stats (show symbol table stats after run)

Show symbol table characteristics. Symbol tables are maintained as hashes.

-tl k (tab length)

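Set the tab length. HTML output can be indented according to nesting structure, using tabs which are expanded to simple spaces. By default, the tab length is zero, meaning that no indent is shown. The maximum value the tab length can be set to is four.

-nuser k (user dict stack size)
-ndollar k (dollar dict stack size)
-nsegment k (maximum simple nesting depth)
-nstack k (maximum eval nesting depth)
-buser (initial user dict capacity)
-bzoem (initial zoem dict capacity)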

Probably needed only if you have some obscure and extreme use for zoem. The segment limit applies to simple nesting of macros. The stack limit applies to nesting of macros that evaluate an argument before use. Each such evaluation creates a new stack. The user limit applies to \push{user}, the dollar limit applies to \push{dollar}. The user dict capacity pertains to the initial number of buckets allocated for user and dollar dictionaries, and the zoem dict capacity pertains to the dictionary containing the zoem primitives.

-l <str> (list items)

List items identified by <str>. It can be any of all, filter, legend, macro, session, trace, or zoem. Multiple identifiers can be joined in a string, e.g. -l legendzoem prints a legend followed by a listing of zoem primitives.

-h (show options)

Show short synopsis of options.

--apropos (show options)

Show one-line synopsis of all options.

SESSION MACROS

\__parmode__

This macro affects zoem's read behaviour in interactive mode. It can be set on the command line using the -s option. Bits that can be set:

1    chomp newlines (remove the newline character)
2    skip empty newlines
4    read paragraphs (an empty line triggers input read)
8    newlines can be escaped using a backslash
16   read large paragraphs (a single dot on a line
     triggers input read)
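Since the value is a sum of bits, a combined setting can be decoded as in the Python sketch below. The flag names are hypothetical labels chosen for this illustration; only the numeric values come from the list above.

```python
from enum import IntFlag


class ParMode(IntFlag):
    """Bit values of \\__parmode__; the member names are illustrative only."""
    CHOMP = 1
    SKIP_EMPTY = 2
    PARAGRAPHS = 4
    ESCAPE_NEWLINES = 8
    LARGE_PARAGRAPHS = 16


def describe(value):
    """Return the names of all bits that are set in value."""
    return [flag.name for flag in ParMode if value & flag]


if __name__ == "__main__":
    # 5 = 1 + 4: chomp newlines and read paragraphs.
    print(describe(5))
```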

\__device__

The current output device, set by the command line option -d. The man and faq packages support html and roff as its values.

\__fnbase__

The base name of the input file name. Leading path components are stripped away. If the -i option is used the input file is required to have the .azm suffix. In that case the suffix is also stripped to obtain the base name.



The leading component of the input file name, possibly empty.



The name of the entry file.



The file currently being processed.



The line number in the file currently being processed.



The name of the default output file.



Set to one of ok, towel (that one is a bit lame), or error by either the interpreter, an occurrence of \catch#2, or \try#1.



Set by \try#1 to the possibly truncated result of processing its argument.

\__searchpath__

A vararg containing a list of paths to search when a file is to be included/imported/read/loaded. When you start zoem, this key should contain the location of the man.zmm and faq.zmm package files. It is advisable not to overwrite this key but to append to it instead.



Expands to a left curly. It is hard to find a need for this - the zoem stress suite uses it to generate a particular syntax error at a deeper interpretation level.



Expands to a right curly.

THE INSPECT SUBLANGUAGE

The \inspect#4 primitive takes four arguments. The languages accepted by the first two arguments are described below. The third argument is a replacement string or a replacement macro accepting back-references (supplied as an anonymous macro). The fourth argument is the data to be processed.

arg 1

Is a vararg. Currently it accepts a single key mods for which the value should be a comma-separated list over the words posix, icase, dotall, iter-lines, iter-args, match-once, discard-nmp, discard-nil-out, discard-miss, count-matches. Alternatively, repeated use of mods is allowed.

arg 2

Is a regular expression. Tilde patterns are expanded according to all of the ZOEM, UNIX, and REGEX schemes. Refer to TILDE EXPANSION for these.

The third argument is a constant string or an anonymous key; the fourth argument is data.

THE FORMAT SUBLANGUAGE

\format#2 has two arguments, respectively fmt and vararg.

The format string fmt may contain normal characters that will be output, and meta sequences. A meta sequence is started with the percent character %, and ended with the first dot in the same scope as the percent sign. For each arg in vararg, the next applicable meta sequence is sought in fmt. A meta sequence consisting of two consecutive percent signs (%%) will be skipped and result in the output of a single percent sign.

Otherwise, the meta sequence may consist of a number of subsequences that may occur in any order. These are



i.e. a tilde followed by two blocks. The content of the first block is used as the padding string. By default spaces are used for padding. Padding is applied if the width of the field exceeds the width of arg plus the width of the optional delimiter(s) encoded in the second block. The content of the second block is inserted between the padding and arg. If centered alignment is used, this will be done on both sides. By default no such delimiter is used.



These respectively denote left, centered, and right alignment.



A positive integer number denoting the desired width of the field on which arg is to be printed.



i.e. an asterisk sign followed by two blocks. This is for when you want to use a virtual length. The second block should contain a string. The length of this string will be used to do the computations, but the actual string that is inserted into the formatted result is taken from vararg. This allows you for example to do alignment in <pre> formatted blocks in html, while keeping the possibility to insert elements that do not take up any width (e.g. links). The first block should contain the name of a macro, which is used to compute the length of the string in the second block. A likely candidate is \length#1, which should be specified simply as length.



i.e. an at sign followed by two blocks. This is used to specify alignment. The content of the first block is used as the string on which to align. The content of the second block is used as the width on which the first part of arg (up to and including the alignment substring) will be right-aligned. If the asterisk sequence is used, the length macro specified therein will be used to compute the width of the string specified in the first block.
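The scanning rule stated at the start of this section (a meta sequence starts at a percent sign and ends at the first dot in the same scope, with %% yielding a literal percent sign) can be sketched in Python as follows. This is only a model of that rule: the function name find_meta_sequences is hypothetical, scope is approximated by curly nesting depth, and the subsequences themselves are not interpreted.

```python
def find_meta_sequences(fmt):
    """Split a format string into ('lit', text) and ('meta', text) parts."""
    parts, literal = [], []
    i, n = 0, len(fmt)
    while i < n:
        ch = fmt[i]
        if ch != "%":
            literal.append(ch)
            i += 1
            continue
        if i + 1 < n and fmt[i + 1] == "%":
            literal.append("%")  # %% stands for a single percent sign
            i += 2
            continue
        # Look for the first dot that is not inside a block.
        depth, j = 0, i + 1
        while j < n:
            if fmt[j] == "{":
                depth += 1
            elif fmt[j] == "}":
                depth -= 1
            elif fmt[j] == "." and depth == 0:
                break
            j += 1
        if j >= n:
            raise ValueError("meta sequence is not closed by a dot")
        if literal:
            parts.append(("lit", "".join(literal)))
            literal = []
        parts.append(("meta", fmt[i + 1:j]))
        i = j + 1
    if literal:
        parts.append(("lit", "".join(literal)))
    return parts


if __name__ == "__main__":
    # The dot inside the first block is in a deeper scope and does not end the sequence.
    print(find_meta_sequences("[%~{.}{}8.] 100%%"))
```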

TILDE EXPANSION

Some primitives interface with UNIX libraries that require backslash escape sequences to encode certain tokens or characters. The backslash is special in zoem too and without further measures it can become very cumbersome to encode the correct escape sequences as it is not always clear which tokens should be escaped or unprotected at what point. It is especially difficult to handle the zoem characters with special meaning, {, } and \.

The two primitives under consideration are \inspect#4 and \tr#2. Both treat the tilde as an additional escape character for certain arguments (as documented in the user manual). These arguments are subjected to tilde expansion, where the tilde and the character it precedes are translated to a new character or character sequence. There are three different sets of tilde escapes: ZOEM, UNIX, and REGEX escapes. \tr#2 only accepts UNIX escapes, \inspect#4 accepts all. Tilde expansion is always the last processing step before strings are passed on to external libraries.

The ZOEM scheme contains some convenience escapes, such as ~E to encode a double backslash.

ZOEM tilde expansion

 meta sequence   replacement
.-----------------------------.
|     ~~       |      ~       |
|     ~E       |      \\      |
|     ~e       |      \       |
|     ~I       |      \{      |
|     ~J       |      \}      |
|     ~x       |      \x      |
|     ~i       |      {       |
|     ~j       |      }       |
`-----------------------------'

The zoem tr specification language accepts \x** as hexadecimal notation, e.g. \x0a denotes a newline in the ASCII character set.

UNIX tilde expansion

 meta sequence   replacement
.-----------------------------.
|     ~a       |      \a      |
|     ~b       |      \b      |
|     ~f       |      \f      |
|     ~n       |      \n      |
|     ~r       |      \r      |
|     ~t       |      \t      |
|     ~v       |      \v      |
|     ~0       |      \0      |
|     ~1       |      \1      |
|     ~2       |      \2      |
|     ~3       |      \3      |
`-----------------------------'

REGEX tilde expansion

 meta sequence   replacement
.-----------------------------.
|     ~^       |      \^      |
|     ~.       |      \.      |
|     ~[       |      \[      |
|     ~$       |      \$      |
|     ~(       |      \(      |
|     ~)       |      \)      |
|     ~|       |      \|      |
|     ~*       |      \*      |
|     ~+       |      \+      |
|     ~?       |      \?      |
`-----------------------------'
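The three tables can be read as simple lookup maps. The Python sketch below applies them to a string; it is an illustration rather than zoem's code. The function name expand_tildes is hypothetical, and leaving an unrecognized tilde sequence unchanged is an assumption, since the behaviour for that case is not described here.

```python
ZOEM = {"~": "~", "E": "\\\\", "e": "\\", "I": "\\{", "J": "\\}",
        "x": "\\x", "i": "{", "j": "}"}
UNIX = {c: "\\" + c for c in "abfnrtv0123"}
REGEX = {c: "\\" + c for c in "^.[$()|*+?"}
SCHEMES = {"ZOEM": ZOEM, "UNIX": UNIX, "REGEX": REGEX}


def expand_tildes(text, schemes=("ZOEM", "UNIX", "REGEX")):
    """Replace each known '~c' pair by its table entry.

    Assumption: pairs not covered by the selected schemes are copied as-is.
    """
    table = {}
    for name in schemes:
        table.update(SCHEMES[name])
    out, i = [], 0
    while i < len(text):
        if text[i] == "~" and i + 1 < len(text) and text[i + 1] in table:
            out.append(table[text[i + 1]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # All schemes, as accepted by \inspect#4.
    print(expand_tildes("~^a~.b~$"))
    # UNIX scheme only, as accepted by \tr#2.
    print(expand_tildes("~n~t", schemes=("UNIX",)))
```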

ENVIRONMENT

The environment variable ZOEMSEARCHPATH may contain a colon and/or whitespace separated list of paths. It will be used when searching for files included via one of the dofile aliases \input, \import, \read, and \load. Note that the zoem macro \__searchpath__ contains the location where the zoem macro files were copied at the time of installation of zoem.
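Splitting a value that is separated by colons and/or whitespace can be sketched in Python as shown below. The helper names are hypothetical, and how zoem combines the result with \__searchpath__ is not modelled.

```python
import os
import re


def search_paths(value):
    """Split a colon and/or whitespace separated list of paths."""
    return [path for path in re.split(r"[:\s]+", value) if path]


def from_environment(environ=os.environ):
    """Read ZOEMSEARCHPATH from the given mapping (empty list if unset)."""
    return search_paths(environ.get("ZOEMSEARCHPATH", ""))


if __name__ == "__main__":
    print(search_paths("/usr/share/zoem: /home/me/zoem  ./macros"))
```

One consequence of this separator rule is that a path containing a space or a colon cannot be listed in the variable.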

DIAGNOSTICS

On error, zoem prints a file name and a line number to which it was able to trace the error. This will usually be correct, but the error may have occurred in a macro deeply nested in a macro found in the line reported by zoem. If in despair, use one of the tracing modes; --trace-keys is one of the first to come to mind. Another possibility is to supply the -x option.

BUGS

No known bugs. \inspect#4 has not received thorough stress-testing, and the more esoteric parts of its interface will probably change.

SEE ALSO

Portable Unix Documentation provides two mini-languages for authoring in the unix environment. These languages, pud-man and pud-faq, are both written in zoem.

AUTHOR

Stijn van Dongen.