10. Built-In Functions

Besides the built-in operators already discussed in Expressions, the Q language also provides a collection of predefined functions which cover a variety of elementary operations. The built-in functions can be divided into eight major groups, arithmetic and numeric functions, string, list and tuple functions, conversion functions, I/O functions, lambda abstractions, exception handling functions, and miscellaneous functions. We will describe each of these in turn.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.1 Arithmetic Functions

The Q language provides the following functions for simple arithmetic on enumeration types (cf. Types):

enum Xs Y, enum_from Xs: enumerate a range of enumeration type values
succ X: successor function
pred X: predecessor function

As already discussed in Enumeration Types, the enum and enum_from functions construct a list from a range of enumeration type values, and the succ and pred functions produce the successor and predecessor of a given value in an enumeration type. Note that while the built-in definitions of these operations only apply to "real" enumeration types (including the built-in Char type), the standard prelude extends enum, succ and pred to integers (and, in the case of enum, floating point values), cf. The Standard Library.

Two additional functions are provided for integer arithmetic, shl and shr, which are used to shift the bits of an integer value a given number of bit positions to the left and right, respectively:

shl N COUNT: shift N COUNT bits to the left
shr N COUNT: shift N COUNT bits to the right

If the COUNT argument is negative, then the opposite shift by -COUNT bits is performed, i.e., shl X COUNT = shr X (-COUNT).

These operations provide the fastest possible way to multiply an integer with, and divide it by, a power of two. More precisely, if X is an arbitrary and N a nonnegative integer, then shl X N and shr X N are the same as X*pow 2 N and X div pow 2 N, respectively, where pow 2 N denotes the Nth power of 2, as an integer. Note that in contrast to bit shifts on fixed size machine integers, there is no "loss" of most significant bits, because Q integers have arbitrary precision; hence the results obtained with bit shifting are always "arithmetically correct".

The bitwise logical operations provide a means to implement sets of nonnegative integers in a fairly efficient manner. As usual, such "bitsets" are obtained by turning on exactly those bits whose positions are the members of the set. Thus, 0 encodes the empty set, and shl 1 I the set consisting of the single member I. You then employ or as set union, and as set intersection, and not as set complement. Set difference can be implemented by taking the intersection with the complement set.

For convenience, you might wish to define yourself a bit function as follows:

bit                     = shl 1;

Then you can test for membership using an expression like X and bit I which returns nonzero (namely bit I itself) iff I is in the set. You can also build a set from its members by or'ing the corresponding bit values, e.g.: bit 1 or bit 4 or bit 39.

Because Q integers have arbitrary precision, bitsets do not have an a priory size restriction in Q (in fact, they can even be infinite, as negative integers are used to encode the complements of the finite bitsets). However, you should note that a finite bitset (a.k.a. a nonnegative integer) needs space proprotional to the set's largest member, and hence this method will be practical only if the potential set members are within a reasonable range.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.2 Numeric Functions

exp X: exponential function
ln X: natural logarithm
sqrt X: square root
sin X: sine
cos X: cosine
atan X: arcus tangent
atan2 Y X: arcus tangent of Y/X
random: random number
seed N: initialize random number generator

This group includes the usual trigonometric and exponential functions, logarithms and square root. All these functions take both integer and floating point arguments. They return a floating point number. The atan2 function is like atan, but takes two arguments and computes the arcus tangent of their ratio. In difference to atan, it takes into account the signs of both arguments to determine the quadrant of the result.

Besides this, there is a parameterless function random which returns a 32 bit pseudo random integer in the range 0..2^32-1. To obtain a random 32 bit floating point value in the range [0,1], you simply divide random by 0xffffffff. The current implementation uses the "Mersenne Twister", a fast uniform random number generator with a period of 2^19937-1, written by Makoto Matsumoto and Takuji Nishimura. The generator is initialized automatically with a seed taken from the system clock at the time the interpreter starts up. The seed function allows to initialize the generator explicitly with a given nonnegative integer seed; this is useful if it is necessary to reproduce random sequences. Note that only the least significant 31 bits of the given seed value are actually used; these 31 bits are padded with a constant 1 bit, since the generator works best with odd seed values.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.3 String, List and Tuple Functions

This group provides some additional operations on sequences (strings, lists and tuples).

sub Xs I J: extract the subsequence consisting of members I thru J of a string, list or tuple Xs
substr S K L: return substring of length L at position K in S
pos S1 S: position of substring S1 in S

The sub function returns a subsequence (also called a "slice") of a string, list or tuple, given by the zero-based index values I and J. For instance:

==> sub "abcde" 2 3
"cd"

==> sub [a,b,c,d,e] 2 3
[c,d]

==> sub (a,b,c,d,e) 2 3
(c,d)

(The standard library provides an analogous definition of sub on streams, see Streams.)

The sub function handles all combinations of index arguments in a reasonable manner. Thus, if J < I or I >= #Xs then an empty sequence is returned, I = 0 is assumed if I < 0, and the remainder of the sequence is returned if J >= #Xs.

Besides sub, there are two additional functions operating exclusively on strings. The substr function returns a substring of a string with a given length at a given position. This function is provided for backward compatibility; note that substr S K L is equivalent to sub S K (K+L-1). The pos function returns the position of a substring in a string (-1 if there is no occurrence of the given substring). For instance:

==> pos "cd" "abcde"
2

==> substr "abcde" 2 2
"cd"

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.4 Conversion Functions

This group consists of functions for converting between different kinds of objects:

trunc X: truncate floating point to integer value, or return integer value unchanged
round X: round floating point to nearest integer value, or return integer value unchanged
float X: convert integer to floating point number, or return floating point value unchanged
int X: compute the integer part of floating point number, or convert integer value to floating point
frac X: compute the fraction part of floating point number, or return 0.0 for an integer value
hash X: 32 bit hash code of an expression
ord X: ordinal number of enumeration type member (or character)
chr N: character with given ordinal number
list Xs: convert a tuple to a list
tuple Xs: convert a list to a tuple
str X: convert expression to string
strq X: convert quoted expression to string
val S: convert string to expression
valq S: convert string to quoted expression

Note that, as of Q 7.1, the trunc, round, float, int and frac functions work consistently on all numbers, i.e., both integers and floating point values. (With older versions, float would only work on integers and the other functions only on floating point values.)

The ord and chr functions convert between characters and their ordinal numbers in the (Unicode) character set. The ord function can also be used to compute the ordinal number of a member of an arbitrary enumeration type, see Enumeration Types.

The hash function returns a nonnegative 32 bit hash code for an arbitrary expression. The only guarantee is that syntactically equal objects are mapped to the same hash code. This function is useful, e.g., for implementing hashed dictionaries which map arbitrary keys to corresponding values, see Hashed Dictionaries for an application.

The str function converts its argument expression to the corresponding print representation conforming to the Q expression syntax, which is returned as a string; the argument is evaluated as usual. The strq function is similar, but converts a quoted expression (cf. The Quote Operator). For instance:

str (1+1)               ⇒ "2"
strq '(1+1)             ⇒ "1+1"

The val/valq functions are the counterpart of str and strq; they convert a string containing the print representation of an expression back to an expression. In the case of val, the expression is evaluated, while valq returns the parsed expression as it is, protected with the quote operator:

val "1+1"               ⇒ 2
valq "1+1"              ⇒ '(1+1)

These functions require that the contents of their string arguments conform to the Q expression syntax. In case of a syntax error, the built-in rules fail.

All expression conversion routines use the global scope, i.e., the namespace of the main script, for the purpose of converting identifiers, just like the interpreter itself. Thus these functions work consistently in a given main script no matter which module they are invoked from. However, this also means that the precise results may depend on which main script is currently loaded. Note that this only affects the parsing and unparsing of identifiers. For instance, if a function symbol foo in module bar is visible in the main scope then it will be unparsed simply as "foo", otherwise the qualified form "bar::foo" is used.

Also note that, as of version 7.2, Q now provides a hook into the built-in expression pretty-printer so that you can define custom views for any user-defined, built-in or external data type. This is described in Views.

A word of caution: Having a "universal" string representation of Q expressions is very useful to facilitate interaction with the user and for the purpose of "marshalling" expressions so that they can be stored in a file, but you should note that the expression conversion routines are so powerful that they can easily be abused for rather questionable purposes. For instance, the val function gives you a "backdoor" to access any function or variable symbol in a script, even private symbols in other modules. The only way around this would have been to severely restrict the functionality of the conversion routines, which would have made them much less useful. Thus it is in your responsibility to use these functions in an "orderly" manner.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.5 I/O Functions

This group provides functions for handling input/output from/to the terminal or a text file. These functions implement operations with side-effects; the side-effects consist in modifying the terminal display or a file. All I/O functions also take care, automatically and transparently, of the necessary translations of strings between the system and the internal UTF-8 encoding.

fopen NAME MODE: open file NAME in mode MODE
popen CMD MODE: open pipe CMD in mode MODE
fclose F: close a file
read, fread F: read an expression from the terminal or a file
readq, freadq F: read a quoted expression from the terminal or a file
readc, freadc F: read a character from the terminal or a file
reads, freads F: read a string from the terminal or a file
write X, fwrite F X: write an expression to the terminal or a file
writeq X, fwriteq F X: write a quoted expression to the terminal or a file
writec C, fwritec F C: write a character to the terminal or a file
writes S, fwrites F S: write a string to the terminal or a file
eof, feof F: check for end-of-file on the terminal or a file
flush, fflush F: flush output buffer on the terminal or a file

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.5.1 Terminal I/O

Input from the terminal (i.e., the standard input device) is done through the parameterless functions read, readq, readc and reads which read and return, respectively, an expression, a quoted expression, a single character, and a string. Terminal input is line-buffered which means that you must type an entire input line before anything is returned by these functions.

The reads function obtains one line of input, strips off the trailing newline character, and returns the result as a string. For instance (here and in the following, <CR> means that you hit the carriage return key to terminate the input line, rather than actually typing the string "<CR>"):

==> reads
one line of text<CR>
"one line of text"

The readc function allows you to read input character by character; at the end of a line the newline character "\n" is returned:

==> readc
<CR>
"\n"

The read and readq functions read one line from the input, as with reads, and convert the resulting string to an expression, as with val and valq, respectively. For instance:

==> read
1+1<CR>
2

==> readq
1+1<CR>
'(1+1)

The corresponding functions for producing output on the terminal (the standard output device) are named write, writeq, writec and writes. They print an expression, a quoted expression, a character and a string, respectively. These functions take one argument, the object to be printed, and return the empty tuple (). For instance:

==> writec "x"
x()

==> writes "one line of text\n"
one line of text
()

==> write (1+1)
2()

==> writeq '(1+1)
1+1()

(Output operations are invoked solely for their side-effects. However, any Q expression must have a value, and since there is no other meaningful value to return, the write functions return (). In the above examples, this value is output by the interpreter after the printed text, which explains, e.g., the x() in response to writec "x".)

It is common practice to combine these operations by means of the || operator in order to implement dialogs such as the following prompt/input interaction:

==> writes "Input: " || reads
Input: one line of text<CR>
"one line of text"

You will also often encounter several output operations for interpolating data into text fragments:

==> writes "The result is " || writeq '(1+1) || writes ".\n"
The result is 1+1.
()

The eof function allows you to check for end-of-file on the terminal. Actually, this causes another line of input to be read from the terminal if no input is currently available, to see whether the end of the input has been reached. Most operating systems allow you to type a special character to indicate end-of-file, such as the Ctl-D character on the UNIX system:

==> eof
<Ctl-D>
true

The flush function writes any pending output to the terminal. It is rarely necessary to call this function explicitly; see also the discussion of the fflush function in File I/O.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.5.2 File I/O

File input and output is implemented by the functions fread, freadq, freadc, freads, fwrite, fwriteq, fwritec and fwrites. There also is an feof function which allows to check whether the end of a file has been reached, and an fflush function which flushes the output buffer of a file. These operations are analogous to their terminal equivalents, but take an additional first argument, a file object. A file object is a special kind of elementary object in the Q language which is returned by the built-in fopen function.

The fopen function takes two string arguments, the name of the file to be opened (which is the name under which the file is known by the operating system), and the mode in which the file is to be opened. The mode string "r" means to open an existing file for reading, "w" to open a new file for writing (existing files are truncated to zero size), and "a" to create a new or append to an existing file. You can also add the letter "b" at the end of the mode string to indicate that the file should be opened as a binary file. This only has the effect to suppress the LF/CR-LF conversion on MS-DOS/Windows systems, which is essential if you read/write binary data from/to the file. On UNIX systems this flag is ignored.

For instance, a function checking whether a file exists and is accessible for reading can be implemented as follows:

exists NAME             = isfile (fopen NAME "r");

(A suitable definition of the isfile function is provided by the standard library, cf. Type-Checking Predicates.) Files are closed automatically when the corresponding file objects are no longer accessible in a computation. E.g., with the above definition of the exists function, if you invoke this function as exists "myfile", then the file object returned by fopen "myfile" "r" (assuming that the file actually exists) will become inaccessible as soon as it has been processed by the isfile function, at which point it gets closed. Similarly, if you assign a file to a variable in the interpreter,

==> def F = fopen "myfile" "r"

the file will be closed as soon as you undefine the variable:

==> undef F

Occasionally, it might be necessary to close a file explicitly, e.g., a file object might still be accessible, but you want to close it before you do some other processing. In this case you can invoke the fclose function on the file:

==> def F = fopen "myfile" "r"; fclose F
()

After closing a file, the file object still exists, but all further I/O operations on it (including fclose itself) will fail:

==> freads F; fclose F
freads <<File>>
fclose <<File>>

(The notation <<File>> represents a file object, since file objects have no printable representation. This syntax is also used by the expression output and unparsing functions, write, str etc. Of course, such objects cannot be reparsed using read, val, etc.)

The standard input and output devices used by the terminal I/O functions (the readx and writex functions, see Terminal I/O) are actually special instances of the file I/O functions, which read from standard input and write to standard output. The standard input and output devices are implemented by the interpreter as predefined file objects which are assigned to the INPUT and OUTPUT variables, see Command Language. Thus, for instance, reads and writes X are equivalent to freads INPUT and fwrites OUTPUT X, respectively.

As an example for the use of file I/O, here's a tail-recursive function which copies the contents of one file opened in read mode to another file opened in write or append mode:

fcopy F G               = () if feof F;
                        = fwritec G (freadc F) || fcopy F G otherwise;

Note that file objects are indeed modified by input and output operations; at least, the file pointer is moved accordingly. Otherwise the above definitions would not work. This also becomes apparent when manipulating files interactively:

==> def F = fopen "xyz" "r"

==> freads F
"first line in file xyz"

==> freads F
"second line in file xyz"

…

The fflush function is used to write any buffered output to a file. This operation is only needed when the target file must be updated immediately. For instance, the target file may actually be a pipe and it may be necessary to get an immediate response from the program which reads the output at the other end of the pipe (see Pipes). Then you can use the fflush function as follows:

==> fwrites F S
()

==> fflush F // force S to be written to F immediately
()

Note that if the Q interpreter runs interactively, then it automatically flushes the standard output files whenever an evaluation is finished, see Running Compiler and Interpreter. And, of course, buffered output is always flushed when a file is closed. In case you have to flush standard output explicitly, use flush or, equivalently, fflush OUTPUT. (This should only be necessary if the standard output stream has been redirected to a disk file or a pipe.)

Also note that the interpreter does not provide direct support for reading and writing binary files. However, such functionality is provided by the clib standard library module (see Clib).

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.5.3 Pipes

The popen function is used to create a file object connected to a pipe. The file objects constructed with popen allow you to pipe data into an operating system command, and to read the output of such a command from a file. Like fopen, popen takes two string arguments. The first parameter denotes the command to be executed, and the second parameter specifies the mode in which the pipe is to be opened. The mode argument must be either "r" (read from the output of the given command) or "w" (write to the input of the command), and causes a file object open for reading or writing to be returned, respectively. The "b" flag may also be used to open the pipe as a binary file, see File I/O.

Input pipes are typically employed for retrieving information from the host operating system. For instance, on UNIX systems we can use the ls command to obtain a list of filenames matching a given wildcard specification. A corresponding function ls which returns such a list can be implemented as follows:

ls S:String             = freadls (popen ("ls "++S) "r");
freadls F:File          = [] if feof F;
                        = [freads F|freadls F] otherwise;

As an example for an output pipe, the following function more pipes a file through the more program which displays the file page by page:

more F:File             = fcopy F (popen "more" "w");

(The definition of fcopy is as in File I/O.)

Just like ordinary files, pipes are closed automatically when the corresponding file object is no longer accessible, or explicitly by an invocation of the fclose function. Furthermore, the interpreter waits for the command started with popen to finish when closing the pipe.

Output pipes are a convenient means to implement specialized output devices in the Q language. For instance, the standard library script graphics.q writes its graphics output to a file GRAPHICS which is often defined as a pipe to a PostScript previewer like, e.g., Ghostscript:

def GRAPHICS = popen "gs -q -" "w";

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.6 Lambda Abstractions

As of Q 7.1, the special form lambda is now a built-in function which returns a "compiled" function object represented by the new built-in Function type. This offers better performance than the previous implementation in the standard library which was based on the combinatorial calculus. Moreover, the \X.Y syntax can now be used as a convenient shorthand for an application lambda X Y, see Conditional Expressions and Lambdas.

Both arguments of lambda are special. The first argument X denotes an expression to be matched against the actual parameter of the function when it gets applied. The second argument Y is the body of the lambda abstraction. When a lambda function is applied to an argument Z, Z is first matched against the pattern X and the (free) variables in X are bound to the corresponding values in Z. Pattern matching is performed in the same manner as for the left-hand side of an equation; in particular, non-linear patterns (which contain a variable more than once) and the anonymous variable work as expected, as do "call by pattern" (cf. Special Constructors and Streams) and matching against "views" (cf. Views). If the match fails then the application of the lambda function fails, too. Otherwise the body of the lambda is evaluated, with the variables of the pattern replaced with their corresponding values, and the resulting normal form is returned as the value of the lambda application.

As in previous releases, the lambda function needs to know which other special forms perform variable bindings so that these constructs can be expanded recursively. To these ends, such "lambda-like" functions should be declared as members of a type derived from the predefined Lambda type, and the predefined lambdax special form should be overloaded such that, when applied to an expression involving the lambda-like function, it returns a quoted expansion of the construct in terms of lambda. Examples for this can be found in the cond.q standard library module.

Examples

The following simple lambda abstraction can be used to denote a function which squares its argument by multiplying it with itself. (In the following we use the customary \X.Y notation throughout, so this example is written as \X.X*X. But note that this is just syntactic sugar for an application of the lambda function, lambda X (X*X) in this case.)

==> \X.X*X
\X1 . X1*X1

==> _ 5
25

In this example the lambda pattern is just a single variable, which always matches. It is also possible to have pattern-matching lambda abstractions, see below. Also note that lambda objects have a printable representation in Q, although the original variable names are lost when a lambda is printed (more about this in the subsection "Lambdas as First-Class Objects" below).

Multi-argument lambda functions can be created using nested lambdas:

==> \X.\Y.(1-X)*Y
\X1 . \X2 . (1-X1)*X2

==> _ 0.9 0.5
0.05

For convenience, the notation \X1 X2 … . Y is provided as a shorthand for \X1 . \X2 … . Y:

==> \X Y.(1-X)*Y
\X1 . \X2 . (1-X1)*X2

==> _ 0.9 0.5
0.05

Also note that in a nested lambda expression each variable is bound by the innermost lambda in which it occurs:

==> \X.(\X.X*X) (X+1)
\X1 . (\X2 . X2*X2) (X1+1)

==> _ 2
9

Moreover, the lambda variables are always bound statically, as determined by the lexical context. For instance, consider:

==> def F = \X.\Y.(1-X)*Y, G = \Y.(1-X)*Y, H = \X.~G

==> F; H
\X1 . \X2 . (1-X1)*X2
\X1 . \X2 . (1-X)*X2

==> F 0.9 0.5; H 0.9 0.5
0.05
(1-X)*0.5

Note that in the function G the variable X occurs free, and hence the same holds for H. On the other hand, in the function F the variable X is bound by the outermost lambda.

The lambda argument can also be a pattern to be matched against the actual parameter when the function is applied:

==> \(X,Y).(1-X)*Y
\(X1,X2) . (1-X1)*X2

==> _ (0.9,0.5)
0.05

==> \[X|Xs].Xs
\[X1|X2] . X2

==> _ [1,2,3]
[2,3]

If a match fails then the application of the lambda function fails, too:

==> (\[X|Xs].Xs) []
(\[X1|X2] . X2) []

The anonymous variable and nonlinear patterns also work as expected:

==> \[_|Xs].Xs
\[X1|X2] . X2

==> _ [1,2,3]
[2,3]

==> \[X,X|Xs].Xs
\[X1,X1|X2] . X2

==> _ [1,1,2]
[2]

==> (\[X,X|Xs].Xs) [1,2,3]
(\[X1,X1|X2] . X2) [1,2,3]

==> (\[_,_|Xs].Xs) [1,2,3]
[3]

But note that in a multi-argument lambda, the different argument patterns will be matched independently from each other. This is because a lambda abstraction like \X X.X*X is equivalent to \X.\X.X*X, so the X variable in the body is actually bound to the second lambda argument (which belongs to the innermost lambda) and the first argument is never used:

==> \X X.X*X
\X1 . \X2 . X2*X2

==> _ 1 2
4

This is to be seen in contrast to an equational definition like `foo X X = X*X;' in which the entire left-hand side participates in the matching process.

Lambdas can also be used to substitute variables in special forms like the quote operator:

==> (\X.'(X+1)) (2*3)
'(6+1)

But note that while the pattern and body of lambda are special arguments, the actual parameter expected by the lambda function is non-special. Thus the value 2*3 in the example above is evaluated before it is substituted for the X variable. If you want to prevent lambda arguments from being evaluated, you have to protect them with a special form (e.g., a quote), and extract the value for use in the lambda body using an appropriate pattern:

==> (\'X.'(X+1)) '(2*3)
'(2*3+1)

Lambdas as First-Class Objects

While Q's lambda abstractions look superficially like those in languages like Haskell and ML, there is a big difference: Lambdas are just normal expressions in the Q language and can thus can be inspected and manipulated just like any other expression. In other words, lambda is just an ordinary function in Q (albeit a special form) which is implemented on top of the term rewriting machinery that Q is based on. This has some important consequences which are discussed in the following, including some incompatibilities to mainstream FP languages that Haskell and ML programmers should know about.

For efficiency reasons, the value returned by lambda is actually a "compiled" form of the given lambda abstraction, which is represented as an object of the built-in Function type. These objects are computed at runtime and contain a complete description of the original lambda abstraction, except that the original names of variables bound in the lambda pattern are lost in the compilation process; this is why you will see generic variable names like X1, X2, etc. in the printed representation.

We also call these objects lambda functions, to distinguish them from lambda abstractions, i.e., the lambda expressions from which they are computed. These objects have an associated "view", with lambda acting as a "virtual constructor" of the Function type (cf. Views), so that lambda objects can be compared for syntactic equality, printed, and dissected by pattern matching. Therefore the difference between lambda abstractions and the corresponding function objects is mostly transparent to the programmer (except for the variable names); for most practical purposes you should be able to work with lambdas just as if they were actually represented the way that they are written.

For instance, in Q you can create a lambda function and then extract its pattern and body simply as follows:

==> var fac = \N.if N>0 then N*fac (N-1) else 1

==> def \X.Y = fac; X; Y
X1
if X1>0 then X1*fac (X1-1) else 1

This makes it easy to manipulate lambdas in various ways before actually applying and thereby evaluating them, which offers great opportunities for "metaprogramming". For instance, a simple kind of `let' expressions can easily be defined in terms of lambda as follows:

special let C X;
let (A=B) X = (\A.X) B;

More examples of this kind can be found in the standard library. For instance, the standard library defines `case' expressions as well as list and stream comprehensions in terms of lambda.

However, this power comes at a price. Since lambdas are first-class objects, the variables "bound" in lambdas are just ordinary expressions, too. In fact, for the Q interpreter the variables inside a lambda pattern are nothing special at all; on the level of equational definitions they are just free variables! Conversely, from the point of view of lambdas, the variables bound in an equation are "meta variables" whose values are substituted into both the pattern and the body of a lambda abstraction. In fact, we used that to good effect in the above definition of `let'.

This leads to a common pitfall, namely that you might accidentally try to use a variable symbol bound by the left-hand side of an equation (or inside a where clause) as a lambda variable, with predictable but unintended results. For instance, consider the following definition:

foo X = (\X . 2*X+1) (X+1);

In Haskell, an equational definition like the one above (i.e., something like foo x = (\x->2*x+1) (x+1)) is essentially considered equivalent to a corresponding lambda abstraction (\x -> (\x->2*x+1) (x+1) in this case), and hence foo 99 will evaluate to 2*(99+1)+1 = 201.

Not so in Q. Here, the left-hand side X acts as a meta variable whose value is substituted into the lambda abstraction, yielding the result (\99 . 2*99+1) (99+1), which probably was not intended here. To get the correct behaviour in this example you can either replace the lambda variable with a symbol which is not bound by the equation, or use an inline variable declaration to escape the lambda variable, like so:

foo X = (\var X.2*var X+1) (X+1);

Of course you can also just define foo itself as a lambda function:

foo = \X . (\X.2*X+1) (X+1);

While this is a little awkward and might take some time to get used to, there is no way around this if lambda abstractions are implemented as first-class citizens as they are in the Q language. So some discipline is needed to avoid the pitfall sketched out above.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.7 Exception Handling

As already mentioned, the Q language only knows a few "hard" runtime error conditions a.k.a. exceptions such as stack or memory overflow. However, "soft" exceptions can also be generated and handled in any desired manner. The following functions are used to deal with all kinds of exceptions during expression evaluation:

halt: halt evaluation
quit: exit the Q interpreter
break: invoke the debugger
fail: abort the current rule
_FAIL_: abort the current reduction
catch F X: handle an exception
throw X: raise an exception
trap ACT SIG: trap signals

The halt function never returns a result, but raises an exception which normally causes the evaluation process to be aborted immediately. This last resort is commonly used in case of a fatal error condition. The quit function is like halt, but also causes exit from the interpreter and returns you to the operating system shell; this operation is often used interactively to terminate a session with the interpreter. (Note that in a multithreaded script, see POSIX Threads, quit only terminates the program when it is invoked from the main thread. In other threads it just acts like halt.)

If the break flag is on (see Debugging), the break function interrupts an evaluation and invokes the symbolic debugger built into the Q interpreter, just as if the user typed Ctl-C. This operation allows you to set breakpoints in a script. For instance,

foo X           = break || bar X;

causes the debugger to be invoked as soon as the rule is executed. If the break flag is off then this operation has no effect. In any case, the break function returns ().

The fail function is used to abort the current rule, just like a failing qualifier. The difference is that fail can be used anywhere in the right-hand side or qualifier of a rule, and causes the rule to be exited immediately. This allows you to handle complicated error conditions which occur while a rule is already being executed, and also provides an efficient method for implementing backtracking algorithms. For instance, here is a quick solution for the famous "N queens" problem:

queens N        = search N 1 1 [];

search N I J P  = write P || writes "\n" if I>N;
                = search N (I+1) 1 (P++[(I,J)]) || fail if safe (I,J) P;
                = search N I (J+1) P if J<N;
                = () otherwise;

safe (I1,J1) P  = not any (check (I1,J1)) P;

check (I1,J1) (I2,J2)
                = (I1=I2) or else (J1=J2) or else
                  (I1+J1=I2+J2) or else (I1-J1=I2-J2);

This algorithm prints out all valid placements (i.e., lists of (row,column) pairs) of N queens on an N times N board. Note the use of fail in the second equation for the search function, which causes the rule to fail after the current placement has been tried, after which evaluation proceeds with the third rule which places the queen into the next column. This turns the definition into a backtracking algorithm. The safe function verifies that a placement (I1,J1) does not put the queen in check with any of the other queens which have already been placed.

The _FAIL_ function works in a fashion analogous to fail, but makes an entire reduction fail (not just a single equation). This is similar to throwing an exception (see below) in that the evaluation of the current redex is aborted immediately, but does not actually raise any exception; instead, it effectively forces the redex to be a normal form, regardless of how many subsequent equations might still be applicable. This operation is useful if you want to check for certain error conditions before trying any subsequent equations. For instance:

foo X:Num       = _FAIL_ if X<=0;
                = bar X if X>1;
                = bar (1/X) otherwise;

The catch function allows you to handle both "hard exceptions" which are raised when the interpreter encounters one of the runtime error conditions discussed in Error Handling, and "soft exceptions" which are raised by the functions halt and quit already explained above, and the throw function which is discussed below. Moreover, the interpreter also handles certain signals sent to it, e.g., via the kill(2) system call, by raising appropriate exceptions. In the current implementation, by default only the signals SIGINT ("break"), SIGTERM ("terminate") and SIGHUP ("hangup") are handled; the latter two are both treated as termination requests, like a call to the quit function. Signal handlers can also be installed by the running script using the trap function discussed below.

Both arguments of catch are special. The catch function first evaluates its second argument, the target expression and, if all is well, returns the value of that expression. Otherwise, if any exception was raised during the evaluation of the target expression, it evaluates the first argument (the exception handler), applies it to the value of the exception, and returns the result computed by the exception handler.

In the case of a runtime error condition and the exceptions raised with the halt and quit functions, the exception value is of the form syserr N where N is one of the integer error codes listed below:

1: Break: break signal, user typed Ctl-C.
2: Halt: invocation of the halt function, or request to halt evaluation from the debugger.
3: Quit: termination signal, invocation of the quit function, or request to exit the interpreter from the debugger.
4: Memory overflow: the attempt to allocate memory for the stack or a new expression on the heap fails.
5, 6: Stack overflow: the attempt to push an expression on the stack, or the activation of a rule fails because the stack size limit has been reached (see the comments on memory management in Running Compiler and Interpreter). 5 signals an expression stack overflow, 6 an overflow of the rule stack.
7: Symbol table overflow: the attempt to create a new variable symbol fails.
8: Conditional error: the qualifying condition of a rule does not evaluate to a truth value.
9: External function error: an external function (see C Language Interface) signals an error condition.

Let's try this by forcing a stack overflow condition:

==> iter 100000000 (+) 1 
! Stack overflow
>>> iter 100000000 (+) 1
                        ^

==> catch exception (iter 100000000 (+) 1)
exception (syserr 5)

The syserr constructor belongs to the built-in SysException type, which in turn is a subtype of Exception. These types may be thought of as being predefined as follows:

public type Exception;
public type SysException : Exception = const syserr N;

User-defined exceptions can be generated with throw. The throw function raises an exception whose value is given by its single (non-special) argument. For instance:

hd []           = throw '(hd []);
exception X     = writes "Exception: " || write X || writes "\n";

With these definitions we have:

==> catch exception (hd [1..3])
1

==> catch exception (hd [1..0])
Exception: '(hd [])
()

==> hd [1..0]
! Exception
'(hd [])
>>> hd [1..0]
             ^

Note that if an exception raised with throw is not handled with a corresponding catch, then the interpreter aborts the evaluation and prints an error message (and it will also invoke the debugger to report the rule which generated the exception, if break is on).

The catch and throw functions can also be used to implement non-local value returns. For instance, here is a variation of the N queens function which returns the first solution as soon as it has been found during backtracking:

queens1 N       = catch id (search1 N 1 1 []);

search1 N I J P = throw P if I>N;
                = search1 N (I+1) 1 (P++[(I,J)]) || fail if safe (I,J) P;
                = search1 N I (J+1) P if J<N;
                = () otherwise;

Here, the "exception handler" is just the identity function id, as defined by the standard library, see Standard Functions.

Another neat programming trick is the following which makes a rule fail if an exception occurs during evaluation of the right-hand side:

foo X           = catch fail (bar X);

This works because the handler argument, which is special, is evaluated only if an exception is actually encountered during evaluation of the target expression. Otherwise, foo X will just return the value of bar X. The _FAIL_ function can be used in a similar manner.

Lisp and C++ programmers will probably miss an additional parameter which restricts the kind of exceptions handled with catch. You can implement this easily yourself by checking the exception value in your handler function, and "throw on" an unknown exception to the next enclosing catch:

type MyException : Exception = const empty_list;

hd []           = throw empty_list;

myexception X:MyException
                = writes "myexception: " || write X ||
                  writes "\n" || halt;
myexception X   = throw X otherwise;

Note how we implemented the empty_list exception using our own Exception subtype MyException. This method of structuring error exceptions is recommended because it makes it easier to spot the source of an exception. It also enables us to use a type guard (cf. Types) in order to check for specific exception types, rather than having to discriminate over different exception patterns.

Finally, the trap function allows you to have exceptions be generated when a signal arrives. The trap function takes two arguments, ACT and SIG, denoting the action to be performed and the number of the signal to be trapped. It returns the previous action associated with the signal. ACT must be an integer value. If it is positive then the signal raises an exception of the form syserr (-SIG); if it is negative then the signal is completely ignored; and if it is zero then the interpreter reverts to the default handling of the signal. You can also use trap to redefine the default handling of the break and termination signals. To allow signals to be caught reliably, further signals are "blocked" (i.e., they are queued for later delivery) while catch evaluates an exception handler. For instance, here is a little script which sets up some signal handlers and keeps on printing the trapped signals until it either gets terminated with kill -9 or the timed wait with the sleep function times out. (The sleep function is described below. The symbolic signal constants and the getpid and printf function are provided by the clib module, see Clib.)

test            = do (trap 1) [SIGINT,SIGTSTP,SIGTERM,SIGHUP,SIGQUIT] ||
                  printf "I'm running as pid %d, try to kill me!\n" getpid ||
                  loop;

loop            = flush || catch sig (sleep 100);

sig (syserr K)  = printf "Hey, I got signal %d.\n" (-K) || loop;

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

10.8 Miscellaneous Functions

This group consists of some special operations which do not fit into any of the other groups treated above.

version, sysinfo: determine version/system information
which S: determine absolute pathname of the executing script or of another file on Q library path
time: get the system time
sleep X: pause for some time
isspecial X: predicate checking whether argument is a special form
isconst X: predicate checking whether argument is a constant
isfun X, isvar X: predicates checking whether argument is a function or a variable symbol, respectively
isdef X: predicate checking whether the variable X has been assigned a value
flip F: flip arguments of a binary function

The version and sysinfo functions are useful when writing code which depends on a particular version of the Q interpreter or a machine/operating system type. They return a string describing the Q interpreter version and the host system on which it is installed, respectively.

The which function performs a search for the given filename on the Q library path. If the file is found then the absolute pathname is returned; otherwise the function fails. The function can also be invoked with the argument (), in which case the full pathname of the executing main script is returned. This function is most useful if a script has to locate additional data files which are stored along with it, see Running Scripts from the Shell.

Two functions are provided for timing purposes. The time function returns, as a floating point value, the time in seconds since the "epoch" (00:00:00 UTC, January 1, 1970). The sleep function suspends evaluation for the given time (integer or floating point value, again in seconds). Be warned that the actual resolution of these functions depends on the timing routines available on your system, and even if a decent resolution is provided, the accuracy of results will depend on various factors such as system load.

The isspecial, isconst, isfun, isvar and isdef predicates are all special forms checking whether their single argument has a certain property. The isspecial function allows you to determine whether its argument is a special form which will receive its next argument unevaluated. The isconst operation checks whether its argument is a constant, i.e., a constant belonging to any of the built-in types, a variable or function symbol declared with const, or an application whose head element is such a constant. The isfun and isvar predicates are used to check whether an expression is a function symbol (built-in or defined by the loaded script, can also be a constant symbol), or a free variable symbol, respectively. The isdef function can be used to check whether its argument (a variable) has been assigned a value using def.

The flip function exchanges arguments of a (binary) function, as if it was defined by the following equation:

flip F X Y              = F Y X;

This function is used internally to implement operator sections with missing left argument. E.g., (+1) is nothing but syntactic sugar for flip (+) 1. You can apply flip to other, user-defined functions as well; thus, flip foo X denotes the function which maps Y to foo Y X.

In order to handle special forms like, e.g., (or else) correctly, flip automagically adjusts to the argument pattern of its first argument. That is, if the function given as the first argument (which is always evaluated) has a special first (or second) argument, then flip's second (or first) argument will be special as well.

[ << ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

This document was generated by Albert Gräf on February, 23 2008 using texi2html 1.76.