NAME
cmix - The C-Mix/II program specializer
SYNOPSIS
cmix [option...] [filename...]
DESCRIPTION
C-Mix/II is a program specializer for the C programming language. cmix reads one or more C source files and some specializer directives, and processes them to produce the C source for a generating extension program. When compiled and run, the generating extension produces specialized versions of the original C source files.
cmix also produces an annotated program listing which is the contents of the original source files decorated with binding-time information. cmixshow(1) can be used to display the annotated program listing in various ways.
For more background information, refer to the full User Manual.
OPTIONS
specifies where to look for #included files. Equivalent to the -I option usually found in C compilers.

-D: defines the given symbol as a preprocessor macro when reading in the C sources. Equivalent to the define: specializer directive and to the -D option usually found in C compilers.

-e: gives an arbitrary specializer directive. Note that the shell will probably need the directive text to be quoted except in very simple cases.

-o: controls the naming of the C-Mix/II output files, which will be called basename-gen.c and basename.ann. Equivalent to the outputbase: specializer directive.

suppresses routine messages about reading and writing files.

is verbose about the progress of analysis phases.

-s: outputs the .ann file in a format that resembles the data format used internally by C-Mix/II. The format is sufficiently close to C to be readable, but you may have trouble recognizing the structure of your program, or indeed finding your program amidst the definitions read from header files.

same as -s, but turns off some shortcuts normally taken to keep the output readable. This makes the output quite unreadable indeed: what normally is represented as t2p->t1.x now gets rendered as *&(&(*&t2p)->t1)->x. This is primarily useful when debugging C-Mix/II.

makes a generating extension that tries very hard to share code internally in each residual function. This slows down the specialization process very noticeably, usually with no great benefit.

instead of parsing the C source and making a generating extension, just preprocesses it (with the same include file search path and predefined symbols as C-Mix/II would normally use) and writes the preprocessed source to the standard output. This is sometimes useful for debugging if one suspects that the program text is changed unexpectedly in the preprocessing step.

displays an option summary.

controls debugging output, which you normally don't want to see.
OTHER ARGUMENTS
Any command line argument to C-Mix/II that is not an option is treated as a file name. If the argument ends with .c, it is assumed to be a C source file containing code to be specialized. If you want a file whose name does not have a .c suffix to be parsed as C code, you need to use an explicit source: directive, for example one given with the -e option.

If a command line argument is not an option and does not look like a C source file name, it is interpreted as the name of a script file. The script file consists of specializer directives separated by semicolons or blank lines. Since most of C-Mix/II's actions can be controlled by specializer directives, once you have set up a script file, a useful C-Mix/II command line can be as short as the command name followed by the name of the script file.
OUTPUT FILES
C-Mix/II names its output files by appending -gen.c and .ann to a basename obtained in the following way, in order of precedence:

  1. An -o option, if present, always takes precedence.
  2. If no -o option is found, C-Mix/II looks for an outputbase: specializer directive.
  3. Otherwise, the name of the first script file named on the command line is used. By convention, script files have names ending in .cmx; this suffix is stripped from the file name before it is used to derive output file names.
  4. If there is no script file, the name of the first C source file named on the command line is used.
  5. If everything else fails, the output files will be named cmix-gen.c and cmix.ann.
SPECIALIZER DIRECTIVES
Specializer directives provide overall control over what C-Mix/II does. They define which C files are read, which function is to be specialized with which parameter binding times, etc. User annotations of the subject program are also given as specializer directives.

There are several ways to supply C-Mix/II with specializer directives:

  - By -e switches on the command line.
  - Implicitly on the command line: a command line argument that looks like a filename and ends in .c is converted to a source: directive for that file.
  - In a separate script file. By convention, script files have a .cmx suffix and the same name as the generating extension that should be produced. Any command line argument to C-Mix/II that does not have any other meaning is interpreted as a script file name. The contents of the file are parsed as specializer directives, separated by semicolons or blank lines.
  - As "pragma"s in the C sources read by C-Mix/II. If a line in the C input begins with the C-Mix/II #pragma prefix, the rest of that line is parsed as specializer directive(s). The source: and define: directives cannot be given as pragmas. It varies from system to system whether preprocessor macros are expanded on #pragma lines. The differences do not come from C-Mix/II but from the external preprocessor that was found when C-Mix/II was installed.

The only directives that need to be present are a source: directive and a goal: or generator: directive.

goal:
You use this directive to specify which form of residual functions you want the generating extension to output. fun is the function in the subject program that you want to create specialized versions of. In addition to its name, you also need to specify the binding time of its parameters. For each parameter you give one of these bt specifications:

  ?   meaning that the parameter is residual. The residual function will have one parameter for each ? in the directive; their order will be the same as in the subject program.
  $n  meaning that the parameter's value will be given at spectime as command line argument n to the generating extension.

When you use the goal: directive to specify the goal function, the generating extension produced by C-Mix/II will be (the C source for) a stand-alone program. The generating extension reads the "$n" parameters from its command line. If some number is not used by any "$n" specifier in the parameter specification (e.g., when specializing foo(?,$2)), the command line parameters with "unused" numbers are simply ignored. The generating extension then generates a specialized function definition and writes it to the standard output stream. The specialized function will have the same name as the original one, but it will have fewer parameters: the spectime ones are simply left out of the parameter list.

The name-spec, if present, selects the name for the specialized function. The most general form is akin to a printf(3) parameter list; such a name-spec can, for example, use the second and third arguments to the generating extension to create a name for the specialized function. The name-spec can also simply be a quoted string or an identifier.

generator:
This is the goal: directive's older brother, used when you need more sophisticated control of the generating extension's operation. When you use the generator: directive, the generating extension does not include a main() function; you must supply a main program yourself. Your main program can obtain the spectime data in any way it wants to. When it is ready, it initializes the specialization module by calling cmixGenInit and then calls the generator entry point gfun with the spectime arguments. Finally, it calls cmixGenExit with an argument of type FILE*, which outputs the generated function definitions to the specified file.
Before the call to cmixGenExit you can set the global int variable cmixRestruct (which is exported from libcmix.a) to 0. This turns off a restructuring phase that tries to express the residual program with structured control-flow constructions instead of the simple gotos that C-Mix/II works with internally.

The parameters to the generator entry point naturally correspond to "$n" specifications in the directive, just as command line arguments do when you use the goal: directive. But the restrictions on their possible types are much lessened; your main program can pass the generating extension structs, pointers, and other types that do not have a straightforward ASCII syntax.

You can call the generator entry point multiple times between the calls to the cmixGenInit and cmixGenExit functions. That way you can, in a single run, generate multiple specialized versions of the same function, with the potential of sharing residual versions of helper functions. You can even generate specialized versions of several different functions in a single run: just pick different entry point names and supply more than one generator: directive to C-Mix/II.

source:
This directive names the C source files that contain the subject program. More than one file name can be given, possibly in different directives. All of the specified files are treated as a single program to be specialized, but C's scope and linkage rules are observed (i.e., an identifier declared static in one file will be invisible in every other file). It is assumed that variables and functions defined with external linkage are shared between the files but not used outside the set of files that C-Mix/II sees.

define:
Defines the given symbol(s) in each C source file read. In addition to defining the symbols specified in specializer directives, C-Mix/II passes any -Dwhatever options from its command line on to the C preprocessor. C-Mix/II always defines the symbol __CMIX. You can use #ifdef __CMIX etc.
to make the C source that C-Mix/II sees differ from the source that your C compiler sees if you compile your program without specializing it. C-Mix/II also always defines the symbol __STDC__, which indicates that it tries to be a standards-conformant C implementation.

outputbase:
This directive can be used to specify where C-Mix/II puts its output files. See the -o command line option.

This is a magic spell (must be typed exactly as given here) that changes the way C-Mix/II writes values of type unsigned char as constants in the residual program when they result from spectime computations. Normally C-Mix/II assumes that when the programmer specifies unsigned char he really means "very short int"; thus such values are written to the residual program as integral constants. Sometimes, however, people use unsigned char for storing characters because they want to be 8-bit clean and the plain char type in their compiler is signed. With that kind of program, this directive can make the residual programs more readable.

This magic spell causes the generating extension to use more significant digits when writing intermediate results of type long double (or an abstract numeric or floating-point type) into the text of the residual program. Normally such values get truncated to double because the printf routine used by C-Mix/II cannot handle long double correctly on some systems.
USER ANNOTATIONS
User annotations are specializer directives that affect the binding times of the values and actions in the program. There is one set of user annotations for variables and another set of annotations for calls to external functions. An external function can itself be annotated, which applies to all calls of it that have no explicit annotation.
User annotations for variables
spectime
Use this for variables that you want to exist only at spectime. This doesn't actually change the binding-time analyses, but if C-Mix/II finds any reason to want the variable to be residual, it will not produce a generator but instead give an error message and terminate. This annotation cannot be used for variables that have only extern declarations in the C-Mix/II input files.

dangerous spectime
Prevents an external variable with no definition from being residualized. This can sometimes help make the specialization process use fewer resources, but it also has a real potential for making the residual program incorrect, because C-Mix/II cannot track changes to the value of the variable without having control over its definition.

residual
Use this to tell C-Mix/II to assign binding times so that the variable becomes completely residual. This normally forces other variables and values to be residual as well; that happens automatically and you need not provide user annotations for those.

visible spectime
This has the same effects as spectime and additionally specifies that the variable should have external linkage in the generating extension. That is, the variable will be "visible" to other program modules.

visible residual
This has the same effects as residual and additionally specifies that the variable should have external linkage in the residual program.

The two visible ... annotations can only be applied to variables that have external linkage and a definition (i.e., a file-level declaration without extern, but possibly with an initializer) in the original program. These are the only variables where visibility is a question at all. Objects with external linkage but without a definition are always "visible"; if you do not annotate them with dangerous spectime, C-Mix/II will default to residual. All other variables without user annotations will have a binding time selected by C-Mix/II and will be internal to the generated source files.

You apply a user annotation to a variable by writing a specializer directive for it.
For global variables, whether their linkage is internal or external, simply name them in the specializer directive. Local variables (including static ones) and function parameters use a two-part syntax reminiscent of C++.
User annotations about side effects from external functions
The external function does not commit any side effects at all. Its return value does not depend on anything but the argument values (and, if one of the arguments is a pointer, the data that pointer points to, and so on recursively).

The external function may commit some side effects, but they only depend on the argument values (and the objects pointed to by the argument values, and so on). In particular, the only objects that may have their values changed by the call are those pointed to (perhaps indirectly) by the arguments; the function does not have any internal "state".

rostate
The external function may still not change any objects not reachable from the arguments, but the values that are written into them and returned from the function may depend on some kind of external state.

rwstate
The function can be expected to do anything a C function may legally do, including changing the values of any objects that it knows the address of, and including changing an external state which affects later calls to rostate or rwstate functions. This is assumed as the default by C-Mix/II when no explicit annotation is given. C-Mix/II only allows rwstate calls to happen at spectime if it can prove that they will be performed in precisely the same sequence as if the source program was run unspecialized. If this is impossible, an error message will result.
User annotations about when external functions are called
spectime
This annotation specifies that all calls of the function should happen at spectime. If C-Mix/II cannot make sure that all parameters in a call are known at spectime, it aborts with an error message which (hopefully) helps you understand why. This annotation is intended to be placed on, e.g., functions that read in parts of the spectime input that are not given as arguments to the goal function.

residual
This annotation specifies that the function may only be called in the residual program.

anytime
This annotation is the default and specifies that the function may be called either at spectime or at residual time, depending on how soon all of the arguments are available. However, if the function is also annotated rwstate or rostate, it will only be called at spectime if that is explicitly requested. (That is, for such functions anytime is really equivalent to residual.)

You annotate a call to an external function by writing the annotation name as a parameter to the predefined __CMIX macro before the function name. You can give a state annotation, a calling-time annotation, or both. A default setting for all calls to a given external function can be specified by writing a specializer directive that names the annotation and the function.
DIRECTIVES FOR ABSTRACT HEADERS
These specializer directives are used to implement abstract headers; see the full User Manual for a more elaborate discussion of their use.

header:
This directive causes a #include line to be emitted at the beginning of the generating extension source and the residual program source.

This directive prevents C-Mix/II from emitting declarations for external variables or functions named with the given identifiers when it emits the generating extension and the residual program source. The rest of the code is still generated as though a declaration was present, so you'll have to make sure by some other means that you won't run into trouble with undeclared identifiers. Note that function names do not take a () here as they do in user annotation directives.

taboo:
This directive instructs C-Mix/II to avoid the given identifiers when it chooses names for local functions and variables in the generating extension and residual program. The names of external functions and variables and of visible variables are not affected by the taboo: directive. They come out exactly as they appear in C-Mix/II's input, and if that creates problems, you get to sort out some interesting error messages from the C compiler. The same holds for generator: entry points.

Whenever identifier appears in the C source it is treated as a literal constant just like, e.g., "42", and copied verbatim to the generating extension or the residual program (the binding-time analysis decides which). It is assumed that a header will be included (with the header: directive) that makes the presence of the identifier there meaningful. identifier must be declared somewhere in the program as an extern and const variable; this declaration supplies the type of the constant expression and must be in scope wherever the identifier is used. The declaration is automatically suppressed in the output, and the identifier is implicitly registered as a taboo.
ABSTRACT TYPES
C-Mix/II has a special syntax for defining abstract types. A declaration in this syntax can, for instance, declare baz as an abstract type with the b, a, and r properties. The type so defined is treated much like a predefined type such as signed char. Syntactically the abstract type is a "typedef-name". Zero or more of the following property characters can be specified between the parentheses:

a
The type is "arithmetic": the type checker will allow values of this type as operands to arithmetic operators such as + and /, and conversions to and from other arithmetic types.

i
The type is "integral": the type checker allows the same operations as for arithmetic types, in addition to integer-only operations such as % and <<.

u
The type is "unsigned". This is only meaningful if the i property is given. If lifting of the type is required, the value to be lifted will be converted to an unsigned long and printed to the residual program as such.

The type is "signed". This counterpart to u is currently the default for "integral" abstract types with no signedness attribute. When a value of the type is lifted, it gets printed as a long.

The type is a "pointer type": the type checker will allow conversions to and from other pointer types.

Unrecognized properties such as b and r are silently ignored.
AUTHOR
C-Mix/II was developed at the University of Copenhagen.
The development team can be reached at
cmix@diku.dk.
BUGS
Probably numerous. Please send bug reports, as detailed as you can make them, to cmix-bugreport@diku.dk.
FILES
Used by C-Mix/II to preprocess C input files.

libcmix.a
The generating extension must be linked with this library.

Interface definition for libcmix.a; included by the generating extension.

Abstract interfaces for standard headers such as <stdio.h>.

(The file names may be different if your system has C-Mix/II installed for multiple architectures.)