Ukrainian translation by All Science Magazine
View this page in Polish courtesy of Science Translations.
View the Slovenian translation created by Gasper Halipovich.
View the Swedish translation created by Mary Stefanova.
There are two important kind of meta-programming scenarios: program generators, and program analyses. Each of these scenarios has a number of distinguishing characteristics
A program analysis (a meta-program) observes the structure and environment of an object-program and computes some value as a result. Results can be data- or control-flow graphs, or even another object-program with properties based on the properties of the source object-program. Examples of these kind of meta-systems are: program transformers, optimizers, and partial evaluation systems.
With the string encoding, we represent the code fragment f (x,y)
simply as "f(x,y)"
. While constructing and combining fragments represented by strings can be done concisely, deconstructing them is quite verbose. More seriously, there is no automatically verifiable guarantee that programs thusly constructed are syntactically correct. For example, "f (,y)"
can have the static type string, but this clearly does not imply that this string represents a syntactically correct program.
With the datatype encoding, we can address the syntactic correctness
problem. A datatype encoding is essentially the same as what is called abstract syntax or parse trees. The encoding of the fragment
"f(x,y)"
in an SML datatype might be:
Apply(Variable "f",Tuple [Variable "x" ,Variable "y"])
using a datatype declared as follows:
datatype exp = Variable of string
| Apply of (exp * exp)
| Tuple of exp list
Using a datatype encoding has an immediate benefit: correct
typing for the meta-program ensures correct syntax for all
object-programs. Because SML supports pattern matching over
datatypes, deconstructing programs becomes easier than with the string
representation. However, constructing programs is now more verbose
because we must use the cumbersome constructors like Variable
,
Apply
, and Tuple
.
The Quasi-quote representation is an attempt to over come this limitation. Here the actual representation of object-code is hidden from the user by the means of a quotation mechanism. Object code is constructed by placing "quotation" annotations around normal code fragments.
In the remainder of this document we will use the quasi-quotation notation of the MetaML meta-programming system. In MetaML quasi-quotation is just one of several staging annotations. In MetaML the staging annotations are Brackets
< >
, Escape ~
, lift
, and run
. An expression < e >
builds the code representation of e
; ~ e
splices the code obtained by evaluating e
into the body of a surrounding Bracketed expression; and run e
evaluates e
to obtain a piece of code, and then evaluates this piece of code. It is important to note that ~ e
is only legal within lexically enclosing Brackets.
As with the datatype representation of programs, using staging annotations guarantees the the syntactic correctness of object programs using the the type correctness of the meta-programs, but maintains the ease of construction of object-programs. With a staged type system (see below) the type correctness of meta-programs can also guarantee the type correctness (as well as the syntactic correctness) of object programs.
Off-line partial evaluation is an automatic staging meta-system. Consider
the simple partial evaluation function PE
with type
< s -> d -> a > -> (s -> <d -> a>)
PE takes the representation of a program with one static parameter and one dynamic parameter and returns an answer. It analyses this program and automatically produces a annotated program which when given the static parameter as input produces a representation of the function from dynamic parameter to answer. For example:
PE <fn s => fn d => d + (s + 3)>
evaluates to
fn s => <fn d => d + ~(lift (s+3))>
Automatic meta-programming systems save the user the bother of placing the annotations, but the user of an automatic system loses control over the structure of the output.
It has been argued that manually staged programs are hard to write, and are much larger than their un-staged counterparts that could be input into a PE system, thus saving the user a lot of work. With the advent of modern meta-programming systems with quasi-quote staging annotations it remains to be seen if this argument still holds. Our experience has been that manually annotated programs are (within a few percent) the same size as their unstaged counterparts.
Both kinds of systems are useful for representing programs for automated program analysis and manipulation. But there are important advantages to homogeneous systems. Only homogeneous systems can be multi-level, where an object-program can itself be a meta-program that manipulates second-level object-programs. Only in a homogeneous meta-system can a single type system type both the meta-language and the object-language. Only homogeneous meta-systems can support reflection (where the meta-language is rich enough that every primitive on object-programs can be expressed as a meta-program). Only homogeneous meta-systems can support a "run" or "eval" operation in a uniform way. This is what makes run-time code generation possible. Homogeneous systems also have the important pedagogical and usability property that the user need only learn a single language.
A staged type system states precisely the staging constraints of the system, as well as the typing requirements of each stage. One advantage of a staged type system is that meta-programs that manipulate ill-typed object-programs are themselves ill-typed. This provides great feedback to the writers of meta-programs.
Staged type systems are particularly appropriate for homogeneous
systems, because only a single type system is necessary. Such systems
capture the static versus dynamic parameter distinction of partial evaluation. Thus a program f with type:
f :: a -> <d> -> < b -> <c> >
tells us quite a bit about the program.
It says that f is a function with two parameters. the first is a static (known now) parameter of type "a", and the second is dynamic (not known until the second stage) parameter of type "d". This program produces an object program, which is also a (second-level) meta-program, with a single (second-stage) static parameter of type "b" which produces a (third-stage) object program of type "c".
Typed heterogeneous meta-systems are also important. Here the type of the meta-program somehow incorporates the types of the object language. This allows meta-programs to encode only type-consistent (at the object-level) transformations of object code, or type consistent translations from one object-language to another. In this case there are three type systems which must work harmoniously (the meta-level types system, and the two object-level type systems).