Declarations and Expressions

An ML program is essentially a collection of declarations and expressions. (We can think of the "main" program as some particular declaration or function of interest that makes use of the other declarations.)

Declarations bind identifiers to values or to functions.

Both values and function bodies are specified as expressions.
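
A minimal sketch of both kinds of declaration. The names width, height and area are hypothetical, chosen only for illustration:

```sml
(* Bind identifiers to values; each right-hand side is an expression. *)
val width = 3
val height = 4

(* Bind an identifier to a function; the body is also an expression. *)
fun area (w, h) = w * h

(* Use the earlier declarations. *)
val a = area (width, height)
```

Here a is bound to the value 12.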

Every expression and identifier has a type. We can (but almost never need to) associate types explicitly with expressions or identifiers using the notation :type.
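
As a sketch, explicit types can be attached in several positions. All names here are hypothetical. The last declaration is one of the rare cases where an annotation matters: without it, the overloaded * would default to int.

```sml
val n : int = 3                      (* type on a declared identifier *)
val m = (n + 1) : int                (* type on an expression *)
fun twice (x : int) : int = 2 * x    (* types on a parameter and on the result *)

(* Without the annotation, x * x would be typed at int. *)
fun square (x : real) = x * x
```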

The main kinds of expressions are as follows:

Constants

Identifiers ("variables")

Constructor applications

Function and operator applications

Let-bindings

Conditionals and case expressions

Anonymous functions
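
One illustrative instance of each kind of expression listed above might look as follows. Every name and value is an assumption made for the sake of the example:

```sml
val answer  = 42                                   (* constant *)
val label   = "forty-two"                          (* constant *)
val same    = answer                               (* identifier *)
val pair    = (answer, label)                      (* constructor application: a pair *)
val items   = 1 :: 2 :: nil                        (* constructor application: a list *)
val text    = Int.toString (answer + 1)            (* function and operator application *)
val total   = let val d = answer * 2 in d + 1 end  (* let-binding *)
val sign    = if answer > 0 then "pos" else "non-pos"   (* conditional *)
val first   = case items of x :: _ => x | nil => 0      (* case expression *)
val bumped  = (fn x => x + 1) answer               (* anonymous function, applied *)
```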

Evaluation

ML programs execute by evaluating expressions into values.

Since ML is an "eager" language, an expression is evaluated whenever:

it is bound to an identifier in a declaration; or

it is specified as an argument to a function or operator application; or

it is specified as an argument to a data constructor; or

it is subjected to a conditional or case operation; or

it is being returned as the value of a function.

Where there's a choice (e.g., in evaluating the arguments to a pair expression), SML always evaluates left to right.

The only places an expression isn't evaluated "immediately" are when:

it appears as the body of a function (it is evaluated only when the function is applied); or

it appears as one arm of a conditional or case expression (it is evaluated only if that arm is selected).
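
These rules can be observed with a small experiment. The sketch below assumes a hypothetical helper note that records a label in a mutable list (a ref cell, a feature not otherwise covered here) and then returns its second argument unchanged:

```sml
(* Hypothetical tracing helper: remember the label, return the value. *)
val trace : string list ref = ref []
fun note (label, v) = (trace := label :: !trace; v)

(* Pair components are evaluated left to right. *)
val pair = (note ("first", 1), note ("second", 2))

(* A function body is not evaluated when the function is declared... *)
fun later () = note ("body", 0)
val afterDef = rev (!trace)

(* ...and only the selected arm of a conditional is evaluated. *)
val chosen = if true then note ("then", 1) else note ("else", 2)

(* Applying the function finally evaluates its body. *)
val _ = later ()
val final = rev (!trace)
```

After these declarations, afterDef is ["first", "second"] and final is ["first", "second", "then", "body"]; the label "else" is never recorded.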

What does it mean to evaluate an expression?

Let-bindings

Declarations can appear:

at top-level, where they become globally visible in all subsequent top-level entries; or

in let expressions.

The purpose of let is to restrict the scope of a declaration to a limited part of the program.

Let expressions are often used in functions, just like "local variables" in other languages, but a let expression can appear anywhere an expression can.

A let expression is always evaluated by evaluating the RHS of the declaration to a value, binding that value to the declared identifier and then evaluating the body of the expression.
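
A sketch of both uses, with hypothetical names: a let acting as a local variable inside a function, and a let nested inside a larger arithmetic expression.

```sml
(* let used like a "local variable" in a function body *)
fun sumOfSquares (a, b) =
  let
    val a2 = a * a
  in
    a2 + b * b
  end

(* let used in the middle of another expression *)
val n = 1 + (let val x = 10 in x * x end)
```

The identifier a2 is visible only between in and end, and x only inside the parenthesized let.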

Declaration Lists

A sequence of declarations following the let keyword is evaluated in order, just as if it were a nested sequence of lets.
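
As an illustrative sketch (the identifiers are assumptions), the following two expressions behave identically; in both, the declaration of y can refer to x:

```sml
(* a declaration list... *)
val r1 =
  let
    val x = 2
    val y = x + 1
  in
    x * y
  end

(* ...is just like nested lets *)
val r2 =
  let val x = 2 in
    let val y = x + 1 in
      x * y
    end
  end
```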

When we begin to consider recursion, things will get a little more complicated.

Scope in the Interactive System

Declarations entered at the read-eval-print loop are semantically like nested let declarations, where the scope of each declaration extends indefinitely (until the end of the interactive session).
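
A sketch of the correspondence, using hypothetical declarations. The first three lines stand for separate entries typed at the prompt; the two let expressions compute the same result, first in nested form and then more compactly with a declaration list:

```sml
(* three successive top-level entries *)
val x = 1;
val y = x + 1;
val total = x + y;

(* the same computation as nested lets *)
val nested = let val x = 1 in let val y = x + 1 in x + y end end;

(* or, more compactly, with a declaration list *)
val compact = let val x = 1 val y = x + 1 in x + y end;
```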

Cases and Conditionals

A case expression allows a value of a constructed type (pairs, lists, etc.) to be analyzed into its components.

Each rule of a case is specified by giving a pattern and an expression to be evaluated if that pattern matches the data value being "cased over". The pattern specifies:

Which data constructor was used to construct the value

Identifier names to be bound to the subcomponents of the value

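
A minimal sketch of a two-rule case expression over a list. The function name describe and the right-hand sides are assumptions; only the shapes of the two patterns are taken from the discussion in this section:

```sml
(* Head of the list plus the number of remaining elements; 0 if empty. *)
fun describe l =
  case l of
      x :: y => x + length y
    | nil    => 0
```

For instance, describe [10, 20, 30] yields 12, and describe [] yields 0.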

The first rule matches list values constructed with :: (i.e., non-empty ones), and binds the head element of the list to x and the remainder of the list to y. These variables can then be used on the right-hand side of the matching rule (only).

The second rule matches list values constructed with nil (i.e., empty ones); since such values have no sub-components, there are no variables in the pattern.

We'll see much more sophisticated patterns later.

Conditionals; Derived Forms

Conditional expressions analyze a boolean-valued expression and, depending on the outcome, evaluate one of two sub-expressions.

Note that both the then and the else expression must always be specified, so that the if expression as a whole always has a value.

In fact, the if expression is really just a (syntactic) shorthand for a case expression over the boolean type.
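
The correspondence can be sketched with a hypothetical absolute-value function, written once with if and once with the case expression it abbreviates:

```sml
(* written with the conditional *)
fun absIf n = if n < 0 then ~n else n

(* the same function with the conditional expanded into a case *)
fun absCase n =
  case n < 0 of
      true  => ~n
    | false => n
```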

This is an example of a derived form: a piece of source-language syntax that is defined (in the language reference manual) by macro-expansion into core language syntax.

You shouldn't need to worry about whether some syntax is core or derived. Unfortunately, you sometimes do, because compiler error messages are reported in terms of the core syntax translation.

Conjunction and Disjunction

The boolean operators andalso (not and!) and orelse are also derived forms for case expressions.

Defining andalso by expansion into a case expression makes it clear that it is a "short-circuiting" operator: its second operand is evaluated only if its first is known to be true. Similarly, the second operand of orelse is evaluated only if its first is known to be false.
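
A sketch of the expansions, using hypothetical functions whose second operand would raise Div if it were evaluated with b equal to 0:

```sml
(* andalso and its expansion *)
fun bigQuotient (a, b) = b <> 0 andalso a div b > 1
fun bigQuotientCase (a, b) =
  case b <> 0 of
      true  => a div b > 1
    | false => false

(* orelse and its expansion *)
fun zeroOrBig (a, b) = b = 0 orelse a div b > 1
fun zeroOrBigCase (a, b) =
  case b = 0 of
      true  => true
    | false => a div b > 1
```

Because of short-circuiting, bigQuotient (10, 0) is simply false and zeroOrBig (10, 0) is simply true; the division is never attempted.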

We'll shortly see that most operators in ML are just like functions. Why can't andalso and orelse be?

Function Application and Scope

As in other languages, function applications are evaluated by:

evaluating the actual argument;

binding the resulting value to the formal parameter of the function;

evaluating the body of the function with those bindings in effect; and

returning the value of that body expression as the function result.

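
The steps above can be traced on a small hypothetical example:

```sml
fun square x = x * x

(* 1. the actual argument 2 + 3 is evaluated to 5;
   2. 5 is bound to the formal parameter x;
   3. the body x * x is evaluated with that binding in effect;
   4. its value, 25, is returned as the result. *)
val r = square (2 + 3)
```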

What if a function body mentions a non-local identifier (a free variable)? Such an identifier must be in scope, with a value, at the point where the function is defined.
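
A minimal sketch, with hypothetical names: base is free in the body of addBase, so it must already be declared when addBase is defined.

```sml
val base = 10

(* base is a free variable of the body; x is the formal parameter *)
fun addBase x = x + base

val r = addBase 5
```

Here r is 15. Had the declaration of base been omitted, the compiler would reject addBase as referring to an unbound identifier.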

A Crucial Fact

Only the values in scope at the point of the function definition matter. Subsequent redefinitions of the identifier are irrelevant!
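
A sketch of the effect, using hypothetical names. The second declaration of limit creates a new binding; it does not change the one that withinLimit already refers to:

```sml
val limit = 10
fun withinLimit x = x <= limit

(* a later redeclaration shadows limit for subsequent code only *)
val limit = 100

val old = withinLimit 50    (* uses the limit in scope at the definition *)
val new = 50 <= limit       (* uses the most recent limit *)
```

Here old is false, while new is true.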

Operators

The standard arithmetic, string, and logical operators are just "built-in" functions; they generally obey all the same rules as user-defined functions.

Most of the standard binary operators are infix, so that you write them between their operands, rather than in the usual prefix style for function application.

Actually, any function can be made infix by an appropriate declaration, and its associativity and precedence can also be declared. E.g., the "built-in" integer operators behave as if declared by infix 7 * div mod and infix 6 + - (left-associative; a higher number binds more tightly).

Once a symbol is declared infix, it can still be used in a prefix fashion by adding the keyword op.
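
A sketch of infix use, prefix use via op, and a user-declared infix function. The operator name plus is hypothetical:

```sml
val a = 3 + 4            (* usual infix use *)
val b = op+ (3, 4)       (* the same operator in prefix style, via op *)

(* A hypothetical user-defined operator: left-associative, precedence 6. *)
infix 6 plus
fun x plus y = x + y

(* Since the built-in multiplication has precedence 7, it binds more
   tightly, so this is read as 1 plus (2 times 3). *)
val c = 1 plus 2 * 3

val d = op plus (1, 2)   (* prefix use of the user-defined operator *)
```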

Anonymous functions

fun declarations bind functions to names. ML also has function expressions which allow you to define anonymous functions.

For example, fn x => x + 1 is the (anonymous) function that adds 1 to its argument.

We can use a function expression wherever a function name would make sense. Naturally, function expressions can also be bound to names.
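
Some illustrative uses of the same function expression. The surrounding identifiers are assumptions, and map is the standard list function:

```sml
(* passed as an argument where a function name could appear *)
val bumped = map (fn x => x + 1) [1, 2, 3]

(* applied directly to an argument *)
val four = (fn x => x + 1) 3

(* bound to a name, then used like any other function *)
val inc = fn x => x + 1
val five = inc 4
```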

In fact, the usual function declaration syntax (using fun) is just a derived form for binding a name to a function expression (using fn), as above.

Note: fn is not the same as fun!
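
A sketch of the two equivalent styles, with hypothetical names. (Strictly, the language definition expands fun into a val rec binding; the difference matters only for recursive functions.)

```sml
(* declaration with fun *)
fun incA x = x + 1

(* the same function as a val binding of a function expression *)
val incB = fn x => x + 1
```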

We'll see good uses for anonymous functions later.

Meanwhile, what should it mean to evaluate a function expression?

Clearly, the body of the function should not be evaluated (this happens only later, when the function is applied).

One thing that does happen (in the underlying implementation) is that the values of any free variables in the body are recorded for use when the function is later applied. You don't need to worry about this explicitly, but it can be handy to remember when you're trying to figure out how free variables work!
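
The recording of free variables can be observed in a small sketch. The function makeAdder is hypothetical; it returns a function expression in which n is free:

```sml
(* Evaluating the fn expression records the current value of n. *)
fun makeAdder n = fn x => x + n

val addTen = makeAdder 10

(* An unrelated later binding of n has no effect on addTen. *)
val n = 1000

val r = addTen 5
```

Here r is 15: addTen still uses the value 10 that was recorded when the function expression was evaluated.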

Andrew P. Tolmach
Thu Apr 10 18:49:31 PDT 1997