[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

9. Special Forms

As discussed in Equations and Expression Evaluation, the Q interpreter evaluates expressions in applicative, i.e., leftmost-innermost, order. This means that the arguments of a function are usually evaluated before the function is applied to them, which is also known as call by value, strict or eager evaluation. Occasionally, it is useful or even essential to defer the evaluation of arguments until they are actually required in the course of a computation. For this purpose, the Q language lets you introduce so-called special forms which receive their arguments unevaluated (i.e., using call by name or non-strict evaluation). This chapter discusses how these constructs are defined and used.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

9.1 Basic Concepts

Consider the following definition from the standard library which implements a simple kind of conditional expression (see also Conditionals and Comprehensions):

 
ifelse P X Y            = X if P;
                        = Y otherwise;

The ifelse function takes as its first argument a truth value which is used to determine whether the second or third argument is to be returned as the value of the conditional expression. Although the above definition is perfectly correct, using applicative order evaluation with this definition is clearly inappropriate since all arguments must already have been evaluated before the ifelse function gets applied to them. Since either the second or third argument is simply thrown away, the effort involved in the evaluation of this argument is wasted. As an extreme case, e.g., Y might not have a terminating evaluation at all, in which case the evaluation of ifelse P X Y would not terminate either even though Y is actually not required if P happens to evaluate to true.

Instead, we would like to have ifelse evaluate its arguments only as they are required. In the Q language, this can be done by simply declaring ifelse as a special form. All we have to do is to precede the above equations with the following declaration:

 
public special ifelse P X Y;

The syntax of such declarations has already been introduced in Declarations. The special keyword must be followed by a function identifier and a sequence of variable symbols (the variable symbols only serve to specify the number of arguments, and are otherwise treated as comments). If the argument list is omitted, the function symbol is actually treated as an ordinary (non-special) symbol. Otherwise the given arguments are declared as special (a.k.a. call by name) arguments, i.e., arguments which are treated as literals and hence are not evaluated when the given function is applied to them. Special arguments become so-called deferred values (also known as thunks or suspensions in the literature) which will be left unevaluated as long as they are "protected" by a special form. But as soon as the value of a thunk is referred to in a "non-special" context, the deferred value will automatically be evaluated by the interpreter.

Thus the above declaration of the ifelse function tells the Q interpreter that this function expects three special arguments P, X and Y which should be left unevaluated until their values are actually required in the qualifier or the right-hand side of an equation in ifelse's definition. Consequently, when the interpreter comes to consider the first rule,

 
ifelse P X Y            = X if P;

it will first evaluate P to obtain the value of the condition part of this rule. If P evaluates to true, X will be evaluated and returned as the value of the left-hand side expression. Otherwise (if P evaluates to false), X will be left unevaluated, and the interpreter considers the next rule,

 
ifelse P X Y            = Y otherwise;

which causes it to reduce ifelse P X Y to the value of Y.

(Note that in cases like the above example, where a special argument forms the right-hand side of a rule, the interpreter performs tail call elimination as usual (cf. Tail Recursion). That is, it will actually evaluate the special argument after performing a tail reduction of the rule. This is important for simple recursive functions involving special forms like ifelse which can be executed in constant stack space. This also applies, in particular, to the built-in special operators (and then) and (or else), see Built-In Special Forms.)

It is important to note here that special forms are indeed a runtime feature in the Q language. This means that special forms are not only recognized at compile time, but also when they are passed as arguments or returned as function results in the course of a computation (which usually cannot be predicted at compile time). For instance, if we define foo as follows:

 
special bar X;
foo                     = bar;

then foo (1+1) evaluates to bar (1+1) and not to bar 2, although foo itself is not declared to be a special form. This also works if you pass bar as a functional parameter:

 
special bar X;
foo F X                 = F (X+1);

Given the above definition, foo bar 1 will evaluate to bar (1+1).

On the other hand, you must take care that arguments to special functions which are passed on by other functions are not evaluated too early. For instance, if the apply function is defined as follows:

 
apply F X               = F X;

and is then invoked as apply bar (1+1) then (1+1) will be evaluated even though bar is a special form. The reason is that the argument (1+1) is evaluated before apply is applied to it, since we have not declared apply as a special form as well. As a general rule, you should make any function a special form that passes its arguments to another special form, unless you explicitly wish to evaluate these arguments.

Up to now, we have only considered "pure" special forms, which receive all their arguments unevaluated. Many functions, however, require a mixture of special and non-special arguments. The Q language provides two different methods for dealing with such situations. First, you can force the evaluation of a special argument at runtime using the `~' (tilde or "force") operator:

 
special foo P X;
foo P X                 = ifelse ~P (bar X) X;

Here, the condition P is evaluated before it is passed on to the ifelse special form (which is as defined above). This method is useful if we only occasionally have to evaluate a parameter of a special form. (Note that you can also force expressions outside a special form; in this case `~' acts as the identity operation, i.e., the argument expression (which is already in normal form) is simply returned as is.)

If we always want to evaluate a given parameter of a special function, we can also declare the corresponding argument as non-special which effectively turns the argument into an ordinary call by value parameter. This is done by preceding the argument with the `~' symbol in the declaration of the function symbol. For instance, in the above definition of the ifelse function the first argument will be evaluated anyway. Hence we may as well make it a non-special argument. The corresponding declaration reads as follows:

 
public special ifelse ~P X Y;

With this declaration (which is precisely how ifelse is actually declared in the standard library), the ifelse function always receives its first argument evaluated, while the other two arguments are special. This works exactly like an explicit application of the `~' operator, but releases you from the burden of having to force evaluation of the argument in every application of the ifelse function. It is also slightly more efficient since the argument does not have to be constructed as a literal expression, but can be evaluated right away before the ifelse function is applied to it.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

9.2 Special Constructors and Streams

Special forms prevent their arguments from being evaluated until they are used in a context which is not "protected" by another special form. This also works if the special form happens to be a constructor. Special constructors thus allow you to define algebraic data structures which act as containers for deferred expressions. By these means Q enables you to create "lazy" data structures which work pretty much like their counterparts in non-strict functional languages like Haskell. A prime example for this are streams, the lazy list data structure, which was briefly introduced in Lists, Streams and Tuples. As of Q 7.1, this data type is now predefined as follows:

 
public type Stream = special const nil_stream, cons_stream X Xs;

The constant symbol nil_stream denotes an empty stream, while the special constructor symbol cons_stream is used to represent a nonempty stream with head X and tail Xs. (Please note that, as of Q 7.1, these symbols are now builtins and thus cannot be imported from the stream.q module any more.) As of Q 7.1, the language also provides some syntactic sugar for these constructs to make them look more list-like. In particular, {} is the same as nil_stream and {X|Xs} the same as cons_stream X Xs. More generally, the notation {X1,X2,…,Xn|Y} can be used as a shorthand for cons_stream X1 (cons_stream X2 … (cons_stream Xn Y)…). Likewise, {X1,X2,…,Xn} is the same as cons_stream X1 (cons_stream X2 … (cons_stream Xn nil_stream)…).

Just like lists, a stream consists of a head element, X, and a stream of remaining elements, the tail Xs. The standard library module stream.q extends most list operations to streams, see Streams. For instance, the list operations hd and tl are easily translated to streams as follows:

 
hd {X|_}                = X;
tl {_|Xs}               = Xs;

Lisp programmers should note that this implementation does not require any explicit "delay" and "force" operations; all necessary evaluations are performed implicitly by the Q interpreter. As a simple example, consider the following definition for the stream of all integers starting from N:

 
ints N                  = {N|ints (N+1)};

Obviously, this definition would not work with eager evaluation, because the recursive invocation of ints would send the interpreter into an infinite recursion. However, since streams are implemented as a special form, the evaluation of the nested ints term is deferred, and the tails of the stream are only produced as we access them, using "call by need" evaluation:

 
==> ints 1
{1|ints (1+1)}

==> tl _
{2|ints (2+1)}

==> tl _
{3|ints (3+1)}

Now this is all good and fine, but it also raises a question, namely how are we going to write definitions which match against subterms in a deferred value if the needed parts have not been evaluated yet? For instance, suppose that we want to extract the next three elements from the above stream. We would actually like to write something like the following:

 
==> _
{3|ints (3+1)}

==> def {X,Y,Z|_} = _

==> (X,Y,Z)
(3,4,5)

In fact this works as expected, even though {3|ints (3+1)} does not literally match the pattern {X,Y,Z|_}, because the Q interpreter takes into account special data constructors during pattern matching and will automagically evaluate deferred subterms as needed.

This also works with non-special arguments of a function in equations. E.g., consider the following definition of a deinterleave operation which takes adjacent stream elements and turns them into pairs:

 
deinterleave {}         = {};
deinterleave {X,Y|Xs}   = {(X,Y)|deinterleave Xs};

Note again that the pattern {X,Y|Xs} on the left-hand side of the second equation does not match a stream value like ints 1 ⇒ {1|ints (1+1)} literally, but it does match after the embedded deferred subterm ints (1+1) has been reduced to {2|ints (2+1)}. The interpreter performs the recursive evaluation automatically, and hence we have:

 
==> deinterleave (ints 1)
{(1,2)|deinterleave (ints (2+1))}

==> tl _
{(3,4)|deinterleave (ints (4+1))}

This "call by need" evaluation of argument values which is performed automatically during pattern matching, is also known as call by pattern. In the Q language, call by pattern evaluations are always done in variable definitions, and also when matching against the left-hand sides of equations, but only for non-special toplevel arguments.

In contrast, special toplevel arguments in an equation are always passed unevaluated, "as is"; that's their purpose after all. Thus, e.g., if foo is a special form then a pattern like foo {X,Y,Z|_} will only match if the provided stream argument literally has three elements in front; no implicit evaluation of stream tails to obtain the required number of stream elements will be performed in this case.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

9.3 Memoization and Lazy Evaluation

Another issue arising when dealing with special forms is the problem of repeated evaluations of the same deferred expression. Note that normally a special argument is evaluated each time its value is required on the right-hand side of a definition. Consider, for instance:

 
special foo X;
foo X                   = bar X X;

Assuming that bar is non-special, X actually gets evaluated twice, once for each occurrence on the right-hand side of the equation. This may be undesirable for reasons of efficiency. One method to cope with such a situation is to introduce a local variable binding, e.g.:

 
special foo X;
foo X                   = bar Y Y where Y = X;

However, in some cases this may be inconvenient or downright impossible. In particular, the deferred value might actually be stored inside a special data element such as a stream (cf. Special Constructors and Streams). For instance, consider the case of a stream map foo {1..} where foo is a function on integers. Then, whenever we access an element of this stream, the corresponding expression foo I will have to be recomputed, which might be costly, depending on the function foo at hand.

To avoid such inefficiencies, the Q language provides the `&' (ampersand or "memoize") operator. When applied to a special argument (such as a stream member) it causes the value to be memoized. This means that the value will only be computed the first time it is needed. Each subsequent access will then simply return the already computed value.

In our example we can memoize both the heads and the tails of the stream, by applying `&' recursively to all heads and tails. The standard library contains the function lazy which does just this. All we have to do is to apply this function to the existing stream to have all elements memoized:

 
==> var foo = \X.writes $ "foo: "++str X++"\n" || X

==> def S = lazy (map foo {1..})

==> list (take 2 S)
foo: 1
foo: 2
[1,2]

==> list (take 4 S)
foo: 3
foo: 4
[1,2,3,4]

Note that when computing the first four stream members with list (take 4 S), only elements number 3 and 4 were computed, since the first two elements had already been computed with list (take 2 S) before.

Functional programming theorists also denote the above style of computation, which combines call by name with the memoization of already computed thunks, as call by need or simply as lazy evaluation.

Lazy evaluation paves the way for some important and powerful programming techniques. In particular, streams are a useful device to implement different kinds of backtracking and dynamic programming algorithms in a uniform setting. For instance, here is a recursive definition of the stream of all Fibonacci numbers:

 
fibs = {0,1|zipwith (+) fibs (tl fibs)};

Unfortunately, this implementation is rather slow because of the multiple recursive invocations of fibs. But this can be fixed by memoizing the stream with lazy and assigning it to a variable:

 
var fibs = lazy {0,1|zipwith (+) fibs (tl fibs)};

By these means we ensure that the stream is built in-place, by "chasing its own tail". This works because Q's global variable environment uses dynamic binding, and the (unevaluated) instance of fibs on the right-hand side of the definition will not be evaluated before the variable has already been defined. Thus the stream effectively includes circular references to itself.

The reason that special arguments are not memoized automatically at all times is that some arguments may involve functions with side-effects, in which case memoization is often undesirable. For instance, we can compute a list of five random values quite conveniently using a list comprehension as follows:

 
==> [random : I in [1..5]]
[1809930567,1267130897,1713466763,574472268,2355190805]

This works because the random subterm is actually a special parameter to the standard library function listof implementing list comprehensions (see Conditionals and Comprehensions), so it will be recomputed as each list element is produced. However, if we memoize the random subterm then the function will only be evaluated once to give the value for all list members:

 
==> [&random : I in [1..5]]
[3194909515,3194909515,3194909515,3194909515,3194909515]

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

9.4 Built-In Special Forms

Besides streams, the Q language has a number of other built-in operations which are actually implemented as special forms. This is necessary, in particular, in the case of the syntactic equality operator `==' (see Non-Linear Equations), as well as the logical connectives and then and or else which are evaluated in "short-circuit mode", using the following built-in equations (see Logical and Bit Operators):

 
true and then X         = X;
false and then X        = false;

false or else X         = X;
true or else X          = true;

The first argument of these operations is non-special, but the second one is a special argument which is only evaluated if it is needed. For instance, if the first argument of and then is false, false is returned - there is no need to take a look at the second argument. The or else operation works analogously. These rules allow the Q interpreter to perform the following reductions immediately, without ever having to evaluate the second argument X:

 
false and then X        ⇒ false
true or else X          ⇒ true

On the other hand, if the first argument of and then is true (or the first argument of or else is false), then the value of the second argument is returned:

 
true and then X         ⇒ X
false or else X         ⇒ X

In this case the interpreter also performs tail call optimization, i.e., the reduction to X is actually carried out before X is evaluated. This implies that tail recursions involving and then and or else are executed in constant stack space, as one might reasonably expect. (The same applies to user-defined special forms like ifelse, see Basic Concepts, which are defined in a similar fashion.) For instance, consider the following example from the standard library which defines a function all that checks whether all members of a list satisfy a given predicate P:

 
all P []                = true;
all P [X|Xs]            = P X and then all P Xs;

Note that if P X in the second rule evaluates to true, then the right-hand side immediately reduces to the tail-recursive call all P Xs, and hence an application of all P to a list of arbitrary size only needs constant stack space.

Other important built-in special forms are the lambda function, which is described in Lambda Abstractions, and the view construction function view, see Views.

Last but not least, there are three other builtins in the Q language which may also act as special forms depending on the arguments with which they are invoked. The function composition operator `.' automagically adapts to special forms in its left and/or right operand. More precisely, if the first argument of a function F is special, then the second operand of a composition section of the form (F.) = (.) F is special as well; in this case F.G will always leave G unevaluated. Furthermore, the argument of a composition F.G is special if F or G has a special first argument. This makes the `.' operator work as expected in situations where the composed functions are special forms. The infix application operator `$' (cf. Application and Sequence Operators) and the built-in flip function (which flips the arguments of a binary function and is used to implement operator sections with a missing left operand, see Miscellaneous Functions) adjust to their first (function) argument in a similar fashion.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

9.5 The Quote Operator

Another built-in special form is the predefined quote operator `'', which acts as a constructor that protects its single argument expression from being evaluated. This construct is useful if you wish to treat certain expressions (which may or may not be in normal form) as literal objects rather than having them evaluated. For instance:

 
==> '(1+1)
'(1+1)

Lisp programmers should note that the quote operator is in fact a constructor, and not an operation which returns its single argument unevaluated. This is essential since in the Q language an expression only remains unevaluated as long as it is protected by a special form - an expression which occurs outside the context of a special form is always in normal form.

Another fact that deserves mentioning is that `'' is just an "ordinary" special form, and does not involve any special "magic" on the side of the interpreter. In particular, `'' does not inhibit variable replacements in rules like the following:

 
special foo X;
foo X                   = '(X+1);

Given the above definition, foo Y ⇒ '(Y+1), i.e., the X variable on the right-hand side is not treated as a literal. However, as the very same example shows, you can employ `'' to quote a free variable, and thereby defer its evaluation:

 
==> def Y = 99; foo Y
'(Y+1)

==> foo ~Y
'(99+1)

The force operator `~' works in quoted expressions as usual:

 
==> '(1+~(2+3))
'(1+5)

Moreover, there is another, similar operation, the ``' (backquote or "splice") operator, which forces evaluation of its argument like the `~' operator, but also "unquotes" the result. For instance, using the same foo function as above, we have:

 
==> '(`(foo Y)/2)
'((Y+1)/2)

Like the force operator, the splice operator also works outside a special form, in which case the unquoted expression is evaluated as usual. Moreover, if the evaluated argument is an unquoted expression, it is returned as is; in this case, ``' does exactly the same as the `~' operator.

The splice operator is the fundamental operation when constructing quoted expressions from smaller quoted pieces, by "splicing" the pieces into a "template" expression. Lisp programmers will be familiar with this technique. However, there are some notable differences between Q's `~'/``' and Lisp's `,'/`@,' constructs:

Also note that, in contrast to the quote operator, the force and splice operations do need special support from the interpreter. They are the only operations (besides the memoization operator, which is treated in a similar fashion) which are evaluated while a special form is being processed.

When used in concert, the quotation operators `'', ``' and `~' become powerful tools for the symbolic manipulation of literal expressions. They allow you to define functions which analyze and transform an expression before the modified expression is finally evaluated. As a simple (and somewhat contrived) example, let us write a function which takes an arbitrary quoted expression and replaces each instance of a foo X subterm by a corresponding bar X expression. (More meaningful examples can be found in the standard library; in particular, have a look at the cond.q script to see how the listof function is implemented.)

 
foo X                   = X-1;
bar X                   = X+1;

special foobar X;
foobar '(foo X)         = '(bar `(foobar 'X));
foobar '(X Y)           = '(`(foobar 'X) `(foobar 'Y));
foobar '(X|Y)           = '(`(foobar 'X)|`(foobar 'Y));
foobar '[X|Y]           = '[`(foobar 'X)|`(foobar 'Y)];
foobar X                = X otherwise;

To see foobar in action on a lambda abstraction, try the following:

 
==> var f = '(\X Y.X+foo (Y+foo (X*Y))), g = foobar ~f

==> f; g
'(\X . \Y . X+foo (Y+foo (X*Y)))
'(\X . \Y . X+bar (Y+bar (X*Y)))

==> `f 12 14; `g 12 14
192
196

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

9.6 A Bit of Reflection

Let us briefly comment on Q's special forms versus "real" lazy evaluation in functional languages like Haskell. It should be clear by now that Q allows you to keep a special argument from being evaluated as long as you like, maybe "to eternity". That is, the interpreter actually computes so-called "weak" normal forms in which special arguments are left unevaluated, even if these arguments are reducible. This is similar to the way deferred evaluation is handled in traditional functional languages featuring a basic eager evaluation strategy. In contrast, a pure lazy functional language interpreter will eventually reduce each expression to a "true" normal form to which no definition is applicable.

Both approaches have their pros and cons. The nice thing about normal order evaluation (with automatic memoization) in lazy functional languages is that it "just works" and is guaranteed to be optimal (in the sense that the program never computes a value that is not needed to produce the program's result). However, one downside of this method is that it becomes hard to deal with operations involving side-effects. Moreover, normal order evaluation is in fact only truly lazy for the lambda and combinatorial calculi used by these systems, which are very special kinds of rewriting system. In general term rewriting, however, there is no single "optimal" evaluation strategy; in fact, the problem to devise a terminating, let alone an optimal reduction strategy for a given rewriting system, even for a single input expression, amounts to solving the infamous halting problem which is undecidable.

Although the Q interpreter uses applicative order evaluation by default, special forms effectively give you full control over the evaluation strategy. As we have seen, this makes it possible to implement lazy data structures which behave pretty much like their counterparts in pure lazy languages. But special forms actually go way beyond this, as they also provide a device to program "meta" (or "macro"-like) functions which operate on the descriptions of other functions (or any other kind of literal expressions). These facilities in turn make it possible to write programs performing "reflective" computations in which a program inspects some part of itself (which might be given, say, as a lambda abstraction), and modifies this part before actually executing it.

It goes without saying that this is an essential requisite in advanced applications like artificial intelligence. While Q is not a fully reflective language (the equational rules making up a Q program are no first-class citizens of the language, neither are the modules a program consists of), having symbolic expressions as first-class citizens of the language provides enough reflectiveness so that you can do most stuff that you'd normally use Lisp for. Pure lazy languages like Haskell fall short in this respect, because they cannot deal with function expressions on a symbolic level without introducing an extra level of interpretation.


[ << ] [ >> ]           [Top] [Contents] [Index] [ ? ]

This document was generated by Albert Gräf on February, 23 2008 using texi2html 1.76.