| [Top] | [Contents] | [Index] | [ ? ] |
The Q Programming Language, Version 7.11, 23 February 2008.
Copyright (c) 1992-2008 by Albert Gräf.
This manual has been published in the series Musikinformatik und Medientechnik, Bereich Musikinformatik, Musikwissenschaftliches Institut, Johannes Gutenberg-Universitaet Mainz
55099 Mainz
Germany
ISSN 0941-0309
The Q programming system is distributed under the terms of the GNU General Public License. See the software license accompanying the distribution for details.
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the author.
This document describes the Q programming language and system, version 7.11, 23 February 2008. Written by Albert Gräf, University of Mainz.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Q stands for "equational", so Q, in a nutshell, is a programming language which lets you "program by equations". You specify a system of equations which the interpreter uses as "rewrite rules" to reduce expressions to "normal form". This allows you to formulate your programs in a high-level, concise and declarative style. Q's development started out in the 1990s, motivated by the author's research on term pattern matching [Gräf 1991] and inspired by the pioneering work on equational programming by Michael O'Donnell and others [O'Donnell 1985], as an attempt to show that term rewriting would provide a feasible basis for a practical programming language. I think that this goal has been achieved, and that the present interpreter is efficient and robust enough for practical purposes. Q has been ported to a variety of operating systems, such as BeOS, FreeBSD, Linux, MacOS X, Solaris and Windows. Porting to other modern UNIX/POSIX environments should be a piece of cake. Thus Q can be used on most modern computer systems, and Q scripts written on one system will usually run on all supported platforms.
The Q language supports a rich variety of built-in types, like arbitrary precision integers, floating point numbers (double precision 64 bit), truth values, strings, tuples, lists, streams (a "lazy" list data structure with call-by-need evaluation), lambda functions and files. It also provides primitives for exception handling and multithreaded execution. Q scripts can be broken down into "modules", each with their separate namespace, and Q gives you full control over which symbols (function, variable and type names) are exported by each module. This makes it easy to organize large scripts in a modular fashion. Q also allows you to interface to "external" modules written in the C programming language, which provides a means to access functions in C and C++ libraries and employ C's higher processing speed for time-critical tasks. Conversely, Q scripts can also be executed from C and C++, which allows Q to be used as an embedded language or term rewriting engine in C/C++ applications.
As a practical programming language, Q comes with "batteries included". The standard library, which is mostly written in Q itself, provides rational and complex numbers, a lot of useful tuple, list and stream processing functions (including comprehensions), some common container data structures (dictionaries, sets, etc.), and an interface to the PostScript language. It also includes an extensive system interface which offers services such as binary and C-style formatted I/O, BSD socket I/O, process management, POSIX threads, regular expression matching and internationalization features. Additional extension modules provide interfaces to a number of other third party libraries, which turns Q into a practical tool for a variety of application areas.
In difference to other functional languages, Q is entirely based on the notions of rewrite rules, reductions and irreducible expressions (also known as normal forms) pertaining to the term rewriting calculus. A Q "program", called a script, consists of equations which are treated as rewrite rules and are used to reduce expressions to normal form. The normal form of an expression denotes its value, which can itself be a compound expression. Q has no rigid distinction between "constructor" and "defined" function symbols and it also allows you to evaluate expressions containing "free" variables. Basically, both sides of an equation may involve arbitrary expressions. Therefore Q can also be used as a tool for symbolic expression evaluation.
On the surface, Q looks very much like contemporary functional languages such as Miranda or Haskell. In fact, the syntax of the language has largely been inspired by the first edition of Introduction to Functional Programming by Richard Bird and Philip Wadler. However, Q is an interpreted language with dynamic typing and eager (leftmost-innermost) evaluation, which is more in line with classical functional languages such as Lisp. For the sake of efficiency, Q scripts are first translated into "bytecode" (an intermediate binary format) which is executed on a virtual stack machine. The interpreter automatically optimizes tail recursion, such that "iterative" algorithms can be realized in constant stack space. Besides the built-in I/O operations, the language is free of side-effects; in particular, Q itself does not have mutable variables (although the standard library provides such imperative features for those who need them). Q also provides two novel and (IMHO) interesting features: a notion of special forms which allows to handle both macro-like functions and lazy evaluation in a uniform setting without having to give up the basic eager evaluation strategy; and a notion of type guards which provides a means to cope with hierarchies of abstract data types (similar to the notion of classes with single inheritance in object-oriented languages) in the context of a term rewriting language.
Using Q is supposed to be fairly simple: you throw together some equations, start the interpreter and then type in the expressions you wish to evaluate. All this can be done with a few keystrokes, if you use the Emacs Q mode supplied with the package. A graphical user interface for Windows is also available. Of course, you can also run the Q programming utilities from the command line if you prefer that.
The manual is organized as follows. In Getting Started, we start out with a brief and informal introduction to the main features of the language, and a glimpse of how Q scripts look like. Lexical Matters, describes the lexical part of the Q language. In Scripts and Modules, we discuss how declarations, definitions and equations are put together to form a script, and how to break down larger scripts into a collection of smaller modules which can be managed separately. Declarations, discusses how certain entities like types, variables and function symbols can be declared in a Q script. Expressions, treats the syntax of Q expressions, and describes the built-in operators provided by the Q language. Equations and Expression Evaluation, is about equations and variable definitions, and how they are used in the evaluation process. Types, and Special Forms, describe the facilities provided by the Q language for dealing with abstract data types and deferred evaluation. Built-In Functions, and The Standard Library, discuss the built-in and standard library functions of the Q language. Clib, describes Q's "system module" which provides access to some important functions from the C library. The appendix gives additional information about some aspects of the language and its implementation. Q Language Grammar, contains a summary of the Q language syntax in BNF. Using Q, provides a description of the programming tools included in the Q programming system, C Language Interface, describes Q's interface to the C programming language, and Debugging, is a brief introduction to the symbolic debugger built into the Q interpreter. Finally, Running Scripts in Emacs, discusses how Q scripts can be edited and run using the Emacs editor.
Acknowledgements
This project has become much further reaching and took a lot longer than I anticipated when I started out working on it, and I wouldn't have been able to keep it going without the patience and support of my beloved wife Evelyn and children Sebastian, Janosch, Yannic and Miriam.
Next I have to acknowledge the support and help provided by colleagues at Mainz and Bremen (Germany) and at INRIA Lorraine (France) during the initial phases of this project. In 1992 Dieter Hofbauer invited me to the term rewriting group at INRIA Lorraine to discuss the design of the (then still nascent) Q with his colleagues and friends. During the following years, I also collaborated with Frank Drewes at Bremen (now at Umea, Sweden) who was involved in the initiation of this project, and Klaus Barthelmann at Mainz who provided many comments which helped to improve the early versions of this manual. Frank and Klaus tested various early beta versions of the Q interpreter, and contributed sample scripts as well as substantial parts of the standard Q library.
I'd also like to express my gratitude to the growing community of users of the Q language all over the world, in particular the members of the Q mailing list, for the lively discussions which helped to improve the language and its implementation, and for the contributed Q modules and applications. Special thanks are due to John Cowan for proofreading the manual, for his valuable suggestions concerning design and implementation issues and for his Q-Chicken interface, and to Rob Hubbard for his comprehensive rational number and polynomial modules which are now part of the distribution.
Last but not least, thanks are also due to the developers of ML and Haskell. While these people have not been involved with Q in any way, their work, which has changed the landscape of functional programming, has been a constant source of inspiration for me.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
A new programming language is best conveyed through some examples, and therefore it is common practice to start a language description with an introductory chapter which treats the most essential language features in a fairly informal manner. This is also the purpose of the present chapter. We first show how some of the "standard features" of the Q programming language can be employed for using the Q interpreter effectively as a sophisticated kind of "desktop calculator". The remainder of this section then adresses the question of how you can extend the Q environment by providing your own definitions in Q scripts, and describes the evaluation process and the treatment of runtime errors in the interpreter.
| 2.1 Using the Interpreter | ||
| 2.2 Using the Standard Library | ||
| 2.3 Writing a Script | ||
| 2.4 Definitions | ||
| 2.5 Runtime Errors |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
To begin with, let us show how the Q interpreter is invoked to evaluate
some expressions with the built-in functions provided by the Q
language. Having installed the Q programming system, you can simply
invoke the interpreter from your shell by typing q. The
interpreter will then start up, display its sign-on, and leaves you at
its prompt:
==> |
This indicates that the interpreter is waiting for you to type an expression. Let's start with a simple one:
==> 23 23 |
Sure enough, that's correct. Whenever you type an expression, the interpreter just prints its value. Any integer is a legal value in the Q language, and the common arithmetic operations work on these values as expected. Q integers are "bignums", i.e., their size is only limited by available memory:
==> 16753418726345 * 991726534256718265234 16614809890429729930396098173389730 |
"Real" numbers (more exactly, their approximation using 64 bit double precision floating point numbers) are provided as well. Let's try these:
==> sqrt (16.3805*5)/.05 181.0 |
The sqrt identifier denotes a built-in function which computes
the square root of its argument. (A summary of the built-in functions
can be found in Built-In Functions.)
What happens if you mistype an expression?
==> sqrt (16.3805*5)/,05
! Syntax error
>>> sqrt (16.3805*5)/,05
^
|
As you can see, the interpreter not only lets you know that you typed
something wrong, but also indicates the position of the error. It is
quite easy to go back now and correct the error, using the interpreter's
"command history". With the up and down arrow keys you can cycle
through the expressions you have typed before, edit them, and resubmit a
line by hitting the carriage return key. Also note that what you typed
is stored in a "history file" when you exit the interpreter, which is
reloaded next time the interpreter is invoked. A number of other useful
keyboard commands are provided, see Readline: (readline)Command Line Editing section `Command Line Editing' in The GNU Readline Library. In
particular, you can have the command line editor "complete" function
symbols with the <TAB> key. E.g., if you type in sq and
press the <TAB> key, you will get the sqrt function.
Before we proceed, a few remarks about the syntax of function applications are in order. The first thing you should note is that in Q, like in most other modern functional languages, function application is simply denoted by juxtaposition:
==> sqrt 2 1.4142135623731 |
Multiple arguments are specified in the same fashion:
==> max 5 7 7 |
Parentheses can be used for grouping expressions as usual. In particular, if a function application appears as an argument in another function application then it must be parenthesized as follows:
==> sqrt (sqrt 2) 1.18920711500272 |
Another important point is that operators are in fact just functions in
disguise. You can turn any operator into a prefix function by enclosing
it in parentheses. Thus (+) denotes the function which adds its
two arguments, and X+1 can also be written as (+) X 1; in
fact, the former expression is nothing but "syntactic sugar" for the
latter, see Built-In Operators. You can easily verify this in the
interpreter:
==> (+) X 1 X+1 |
You can also have partial function applications like (*) 2 which
denotes a function which doubles its argument. Moreover, Q supports
so-called operator sections which allow you to specify a binary
operator with only either its left or right operand. For instance,
(1/) denotes the reciprocal and (+1) the "increment by
1" function:
==> (+1) 5 6 ==> (1/) 3 0.333333333333333 |
The interpreter maintains a global variable environment, in which you
can store arbitrary expression values. This provides a convenient means
to define abbreviations for frequently-used expressions and for storing
intermediate results. Variable definitions are done using the def
command. For instance:
==> def X = 16.3805*5 ==> X 81.9025 |
As indicated, variable symbols usually start with a capital letter; an
initial lowercase letter indicates a function symbol. This convention is
valid throughout the Q language. However, it is possible to explicitly
declare an uncapitalized symbol as a variable, using the var
command, and then assign values to it, like so:
==> var f ==> def f = sqrt |
You can also both declare a variable and initialize its value with a
single var command:
==> var f = sqrt ==> f X/.05 181.0 |
Multiple variable declarations and definitions can be entered on a
single line, using commas to separate different definitions in a
def or var command, and you can separate multiple
expressions and commands on the same line with semicolons:
==> def X = 16.3805*5, f = sqrt; X; f X/.05 81.9025 181.0 |
You can also bind several variables at once by using an expression pattern as the left-hand side of a variable definition:
==> def (f, X) = (sqrt, 16.3805*5); f X/.05 181.0 |
(Such pattern-matching definitions only work with def; the
left-hand side of a var declaration must always be a simple
identifier.)
Another useful feature is the built-in "anonymous" variable `_', which is always set to the value of the most recent expression value printed by the interpreter:
==> _ 181.0 ==> 2*_ 362.0 |
Sometimes you would also like the interpreter to "forget" about a
definition. This can be done by means of an undef statement:
==> undef X; X X |
Besides def, undef and var, the interpreter
provides a number of other special commands. The most important command
for beginners certainly is the help command, which displays the
online manual using the GNU info reader. You can also run this command
with a keyword to be searched in the info file. For instance, to find
out about all special commands provided by the interpreter, type the
following:
==> help commands |
(Type q when you are done reading the info file.)
Other useful commands are who which prints a list of the
user-defined variables, and whos which describes the attributes
of a symbol:
==> who
f
==> whos f sqrt
f user-defined variable symbol
= sqrt
sqrt builtin function symbol
sqrt X1
|
You can save user-defined variables and reload them using the
save and load commands:
==> save saving .q_vars ==> def f = 0; f 0 ==> load loading .q_vars ==> f sqrt |
Some statistics about the most recent expression evaluation can be
printed with the stats command:
==> sum [1..123456] 7620753696 ==> stats 0.52 secs, 246916 reductions, 246918 cells |
The stats command displays the (cpu) time needed to complete an
evaluation, the number of "reductions" (a.k.a. basic evaluation steps)
performed by the interpreter, and the number of expression "cells"
needed during the course of the computation. This information is often
useful for profiling purposes.
Other commands allow you to edit and run a script directly from the interpreter, and to inspect and set various internal parameters of the interpreter; see Command Language.
Of course, the Q interpreter can also carry out computations on non-numeric data. In fact, Q provides a fairly rich collection of built-in data types, and makes it easy to define your own. For instance, strings are denoted by character sequences enclosed in double quotes. Strings can be concatenated with the `++' operator and the length of a string can be determined with `#'. Moreover, the character at a given position can be retrieved with the (zero-based) indexing operator `!':
==> "abc"++"xyz"; #"abc"; "abc"!1 "abcxyz" 3 "b" |
The same operations also apply to lists, which are written as sequences of values enclosed in brackets:
==> [a,b,c]++[x,y,z]; #[a,b,c]; [a,b,c]!1 [a,b,c,x,y,z] 3 b |
As in Prolog, lists are actually represented as right-recursive
constructs of the form `[X|Xs]' where X denotes the head
element of the list and Xs its tail, i.e., the list of the
remaining elements. This representation allows a list to be traversed
and manipulated in an efficient manner.
==> def [X|Xs] = [a,b,c]; X; Xs a [b,c] ==> [0|Xs] [0,b,c] |
Another useful list operation is "enumeration", which works with all
so-called "enumeration types" (cf. Enumeration Types), as well
as characters and numbers. Enumerations are constructed with the
enum function, or using the familiar [X..Y] notation which
is in fact only "syntactic sugar" for enum:
==> enum "a" "k" ["a","b","c","d","e","f","g","h","i","j","k"] ==> ["a".."k"] ["a","b","c","d","e","f","g","h","i","j","k"] ==> [0..9] [0,1,2,3,4,5,6,7,8,9] ==> [0.1,0.2..1.0] [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0] |
Tuples are sequences of expressions enclosed in parentheses, which are Q's equivalent of "vectors" or "arrays" in other languages. Like lists, tuples can be concatenated, indexed and enumerated, but they use an internal representation optimized for efficient storage and constant-time indexing:
==> (a,b,c)++(x,y,z); #(a,b,c); (a,b,c)!1 (a,b,c,x,y,z) 3 b ==> (0..9) (0,1,2,3,4,5,6,7,8,9) |
Q also offers built-in operations to convert between lists and tuples:
==> tuple [a,b,c] (a,b,c) ==> list _ [a,b,c] |
Q provides yet another list-like data structure, namely "lazy" lists a.k.a. streams, which are written using curly braces instead of brackets. In difference to lists, streams only produce their elements "on demand", when they are needed in the course of a computation. This makes it possible to represent large and even infinite sequences of values in an efficient manner. You can see this somewhat unusual behaviour if you try streams interactively in the interpreter. For instance, let's try to concatenate two streams:
==> {a,b,c}++{x,y,z}
{a|{b,c}++{x,y,z}}
|
You probably noticed that the tail of the resulting stream,
{b,c}++{x,y,z}, has not been evaluated yet. But that is all
right, because it will be evaluated as soon as we need it:
==> _!4 y |
This lazy way of doing things becomes especially important when we work with infinite streams. We will have more to say about this in the following section.
It is quite instructive to keep on playing a bit like this, to get a feeling
of what the Q interpreter can do. You might also wish to consult
Built-In Functions, which discusses the built-in functions, and
Using Q, which shows how the interpreter is invoked and what additional
capabilities it offers. To exit the interpreter when you are finished,
invoke the built-in quit function:
==> quit |
This function does not return a value, but immediately returns you to the operating system's command shell.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The standard library is a collection of useful functions and data
structures which are not provided as built-ins, but are
implemented by scripts which are mostly written in Q itself. Most of
these scripts are included in the "prelude script" prelude.q,
which is always loaded by the interpreter, so that the standard library
functions are always available. See The Standard Library, for an
overview of these operations. You can check which standard library
scripts a.k.a. "modules" are actually loaded with the interpreter's
modules command. With the standard prelude loaded, this command
shows the following:
==> modules assert clib* complex cond error math prelude rational sort stdlib stream string tuple typec |
As you can see, there are quite a few modules already loaded, and you can use the functions provided by these modules just like any of the built-in operations. The standard library provides a lot of additional functions operating on numbers, strings and lists. For instance, you can take the sum of a list of (integer or floating point) numbers simply as follows:
==> sum [1..5] 15 |
In fact, the library provides a rather general operation, foldl,
which iterates a binary function over a list, starting from a given initial
value. Using the foldl function, the above example can also be
written as follows:
==> foldl (+) 0 [1..5] 15 |
(There also is a foldr function which works analogously, but
combines list members from right to left rather than from left to
right.)
Generalizing the notion of simple enumerations of numbers like
[1..5], the standard library also provides the general-purpose
list-generating function while. For instance, we can generate the
list of all powers of 2 in the range [1..1000] as follows:
==> while (<=1000) (2*) 1 [1,2,4,8,16,32,64,128,256,512] |
(Recall from the previous section that (2*) is the doubling
function. And (<=1000) is a predicate which checks its argument
to be less than or equal to 1000.)
If we want to generate a list of a given size, we can use iter
instead. So, for instance, we might compute the value of a finite
geometric series as follows:
==> sum (iter 4 (/3) 1) 1.48148148148148 |
The map function allows you to apply a function to every member of a
list. For instance, the following expression doubles each value in the list
[1..5]:
==> map (2*) [1..5] [2,4,6,8,10] |
Lists can also be filtered with a given predicate:
==> filter (>=3) [1..5] [3,4,5] |
The scanl operation allows you to compute all the sums of initial
segments of a list (or accumulate any other binary operation over a
list):
==> scanl (+) 0 [1..5] [0,1,3,6,10,15] |
(Like foldl, scanl also has a sibling called scanr
which collects results from right to left rather than left to right,
starting at the end of the list.)
You can partition a list into an initial part with a given length and
the remaining part of the list with take and drop:
==> take 3 [1..5]; drop 3 [1..5] [1,2,3] [4,5] |
The takewhile and dropwhile functions have a similar
effect, but partition their input list according to a given predicate:
==> takewhile (<=3) [1..5]; dropwhile (<=3) [1..5] [1,2,3] [4,5] |
Another useful list operation is zip which collects pairs of
corresponding elements in two input lists:
==> zip [1..5] ["a".."e"] [(1,"a"),(2,"b"),(3,"c"),(4,"d"),(5,"e")] |
The effect of zip can be undone with unzip which returns a
pair of lists:
==> unzip _ ([1,2,3,4,5],["a","b","c","d","e"]) |
The zipwith function is a generalized version of zip which
combines corresponding members from two lists using a given binary
function:
==> zipwith (*) [1..5] [1..5] [1,4,9,16,25] |
All the operations described above - and many others - are provided by
the stdlib.q script. It is instructive to take a look at this
script and see how the operations are defined.
In addition, the standard library provides yet another list generating
function listof which implements so-called "list
comprehensions". These allow you to describe list values in much the
same way as sets are commonly specified in mathematics. For instance,
you can generate a list of pairs (I,J) with 1<=J<I<=5 as
follows:
==> listof (I,J) (I in [1..5], J in [1..I-1]) [(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)] |
The language provides syntactic sugar for list comprehensions so that the above example can also be written as follows:
==> [(I,J) : I in [1..5], J in [1..I-1]] [(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)] |
The same kind of construct also works with tuples:
==> ((I,J) : I in [1..5], J in [1..I-1]) ((2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)) |
We also have a random number generator, which is implemented by the
built-in random function. Here is how we can generate a list of 5
pseudo random 32 bit integers:
==> [random : I in [1..5]] [1960913167,1769592841,3443410988,2545648850,536988551] |
To get random floating point values in the range [0,1] instead, we
simply divide the results of random by 0xffffffff:
==> map (/0xffffffff) _ [0.456560674928259,0.41201544027124,0.801731596887515,0.592705060400233, 0.125027389993199] |
Lists can be sorted using quicksort (this one comes from sort.q):
==> qsort (<) _ [0.125027389993199,0.41201544027124,0.456560674928259,0.592705060400233, 0.801731596887515] |
As already mentioned, the Q language also provides a "lazy" list data
structure called "streams". Streams are like lists but can actually be
infinite because their elements are only produced "on demand". The
standard library includes a module stream.q which implements a
lot of useful stream operations. Most list operations carry over to
streams accordingly. For instance, if we want to create a geometric
series like the one generated with iter above, but we do not know
how many elements will be needed in advance, we can employ the stream
generation function iterate:
==> iterate (/3) 1
{1|iterate (/3) ((/3) 1)}
|
The {|} stream constructor works in the same way as the
[|] list constructor, but is "special" in that it does
not evaluate its arguments, i.e., the head and the tail of the
stream. (Otherwise the call to iterate would never terminate.)
To get all members of the series we can apply the scanl function:
==> def S = scanl (+) 0 _; S
{0|scanl (+) (0+1) (iterate (/3) ((/3) 1))}
|
Now we can extract any number of initial values of the series using the
take operation, and convert the resulting stream to an ordinary
list with the list function. For instance, if we are interested
in the first five values of the series, we proceed as follows:
==> list (take 5 S) [0,1,1.33333333333333,1.44444444444444,1.48148148148148] |
Note that the stream S is really infinite; if we want, we can
extract any value in the series:
==> S!9999 1.5 |
Let's see how many iterations are actually required to reach the limit
1.5 with an error of at most 1e-15:
==> #takewhile (>1e-15) (map (1.5-) S) 32 |
This means that the sum of the first 31 series terms is needed to get an
accuracy of 15 digits (which is the best that we can hope for with 64
bit floating point values). We can readily verify this using
iter:
==> sum (iter 31 (/3) 1) 1.5 |
The standard library also implements stream enumerations and
comprehensions which work like the corresponding list and tuple
constructs but are evaluated in a lazy fashion, and the interpreter
provides syntactic sugar for these. In particular, the notation
{X..} or {X,Y..} can be used to specify an infinite
arithmetic sequence. For instance, here is another way to define the
infinite geometric series from above:
==> def S = scanl (+) 0 {1/3^N : N in {0..}}
==> list (take 5 S)
[0,1.0,1.33333333333333,1.44444444444444,1.48148148148148]
|
It is worth noting here that streams are just one simple example of a lazy data structure, which happen to play an important role in many useful algorithms so that it makes sense to provide them as a builtin. The Q language makes it rather easy to define your own lazy data structures using so-called special forms. In fact, special forms provide a uniform means to define both lazy data constructors and macro-like functions which take their arguments unevaluated, employing "call by name" parameter passing. This will be discussed in detail in Special Forms.
Special forms are an integrated part of Q which adds a great amount of
expressive power to the language. In particular, they allow specialized
constructs such as a "short-circuit" conditional expressions to be
defined in the standard library, just like "ordinary" functions,
rather than having to provide them as builtins. One example is the
conditional expression if X then Y else Z which is nothing but
syntactic sugar for the standard library function ifelse:
==> ifelse (5>0) "positive" "negative" "positive" ==> if 5>0 then "positive" else "negative" "positive" |
Another special form which is useful in many situations is lambda
(as of Q 7.1, this one is actually provided as a builtin), which allows
you to create little "anonymous" functions on the fly. The customary
notation \X.Y is provided as a shorthand for lambda X
Y. These constructs are also called lambda abstractions, or
simply lambdas. For instance, the following interpreter command
defines the well-known factorial function using a lambda abstraction and
assigns the resulting function object to the variable fac:
==> var fac = \N.if N>0 then N*fac (N-1) else 1 ==> fac \X1 . if X1>0 then X1*fac (X1-1) else 1 ==> map fac [1..10] [1,2,6,24,120,720,5040,40320,362880,3628800] |
Lambdas taking multiple arguments can be created like this:
==> \X Y.(1-X)*Y \X1 . \X2 . (1-X1)*X2 ==> _ 0.9 0.5 0.05 |
The lambda arguments can also be patterns to be matched against the actual parameters. For instance:
==> \(X,Y).(1-X)*Y \(X1,X2) . (1-X1)*X2 ==> _ (0.9,0.5) 0.05 |
Besides string and list processing functions and general utility
functions of the kind sketched out above, the standard library also
includes a collection of efficient "container" data structures which
are useful in many applications. For instance, so-called hashed
dictionaries (also known as "hashes" or "associative arrays") are
implemented by the hdict.q script. The following expressions show
how to create a dictionary object from a list, and how to access this
dictionary using the index (!), keys and vals
operations. (Note that here we have to load the hdict.q module
explicitly, as it is not automatically included via the prelude. We
could also import the stdtypes.q module instead, to get hold of
all the container types.)
==> import hdict ==> def D = hdict [(foo,99),(bar,-1),(gnu,foo)]; D!gnu foo ==> keys D [foo,bar,gnu] ==> vals D [99,-1,foo] |
This completes our little tour through the standard library. To find out more, please refer to The Standard Library, or take a look at the scripts themselves.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Now that we have taken the first big hurdle and are confident that we can actually get the Q interpreter to evaluate us some expressions, let us try to define a new function for use in the interpreter. That is, we are going to write our first own Q "program" or "script".
In the Q language, there are actually two different ways to create new functions. As we have already seen in the previous sections, little "anonymous" functions are often computed on the fly by deriving them from existing functions using partial applications or lambda abstractions. The other way, which is often more convenient when the functions to be defined get more complicated, is the definition of named functions by the means of "equations", which works in a way similar to algebraic definitions in mathematics. We will discuss how to do this in the following.
Having exited the Q interpreter, invoke your favourite text editor and enter the following simple script which implements the factorial function:
fac N = N*fac(N-1) if N>0;
= 1 otherwise;
|
(Note that in order to define functions equationally, you really have to write a script file. For technical reasons there currently is no way to enter an equational definition directly in the interpreter.)
Save the script in the file fac.q and then restart the
interpreter as follows (here and in the following `$' denotes the
command shell prompt):
$ q fac.q |
You can also edit the script from within the interpreter (using vi or
the editor named by the EDITOR environment variable) and then restart
the interpreter with the new script, as follows:
==> edit fac.q ==> run fac.q |
In any case, now the interpreter should know about the definition of
fac and we can use it like any of the built-in operations:
==> map fac [1..10] [1,2,6,24,120,720,5040,40320,362880,3628800] |
For instance, what is the number of 30-element subsets of a 100-element set?
==> fac 100 div (fac 30*fac 70) 29372339821610944823963760 |
As another example, let us write down Newton's algorithm for computing
roots of a function. Type in the following script and save it to the
file newton.q:
newton DX DY F = until (satis DY F) (improve DX F); satis DY F X = abs (F X) < DY; improve DX F X = X - F X / derive DX F X; derive DX F X = (F (X+DX) - F X) / DX; |
Restart the interpreter with the newton.q script and try the
following. Note that the target function here is \Y.Y^3-X which
becomes zero when Y equals the cube root of X.
==> var eps = .0001, cubrt = \X.newton eps eps (\Y.Y^3-X) X ==> cubrt 8 2.00000000344216 |
Well, this is not that precise, but we can do better:
==> def eps = 1e-12 ==> cubrt 8 2.0 |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
As mentioned in the previous section, in the Q language function definitions take the form of equations which specify how a given expression pattern, the left-hand side of the equation, is transformed into a new expression, the right-hand side of the equation. In this section we elaborate on this some more to give you an idea about how these equational definitions work. Our explanations will necessarily be a bit sketchy at this point, but the concepts we briefly touch on in this section will be explained in much more detail later.
The left-hand side of an equation may introduce "variables" (denoted
by capitalized identifiers, as in Prolog) which play the role of formal
parameters in conventional programming languages. For instance, the
following equation defines a function sqr which squares its
(numeric) argument:
sqr X = X*X; |
An equation may also include a condition part, as in the definition of the factorial function from the previous section:
fac N = N*fac(N-1) if N>0;
= 1 otherwise;
|
As indicated above, several equations for the same left-hand side can be
factored to the left, omitting repetition of the left-hand side.
Furthermore, the otherwise keyword may be used to denote the
"default" alternative.
The left-hand side of an equation may actually be an arbitrary compound
expression pattern to be matched in the expression to be evaluated. For
instance, as we have already seen, the expressions [] and
[X|Xs] denote, respectively, the empty list and a list starting
with head element X and continuing with a list Xs of
remaining elements, just as in Prolog. We can use these patterns to
define a function add which adds up the members of a list as
follows:
add [] = 0; add [X|Xs] = X+add Xs; |
Thus no explicit operations for "extracting" the members from a compound data object such as a list are required; the components are simply retrieved by "pattern matching". This works in variable definitions as well. For instance, in the interpreter you can bind variables to the head element and the tail of a list using a single definition as follows:
==> def [X|Xs] = [1,2,3]; X; Xs 1 [2,3] |
Pattern matching also works if the arguments take the form of function applications. For instance, the following definition implements an insertion operation on binary trees:
insert nil Y = bin Y nil nil;
insert (bin X T1 T2) Y = bin Y T1 T2 if X=Y;
= bin X (insert T1 Y) T2 if X>Y;
= bin X T1 (insert T2 Y) if X<Y;
|
Note the symbols nil and bin which act as "constructors"
for the binary tree data structure in this example. (Q actually has no
rigid distinction between "function applications" and "data
constructors". This is unnecessary since function applications can be
"values" in their own right, as discussed below.)
The Q interpreter treats equations as "rewrite rules" which are used to reduce expressions to "normal forms". Evaluation normally uses the standard "leftmost-innermost" rule, also known as "applicative order" or "call by value" evaluation. Using this strategy, expressions are evaluated from left to right, innermost expressions first; thus the arguments of a function application (and also the function object itself) are evaluated before the function is applied to its arguments.
An expression is in normal form if it cannot be evaluated any further by applying matching equations (including some predefined rules which are used to implement the built-in operations, see Equations and Expression Evaluation). A normal form expression simply stands for itself, and constitutes a "value" in the Q language. The built-in objects of the Q language (such as numbers and strings) are always in normal form, but also compound objects like lists (given that their elements are in normal form) and any other symbol or function application which does not match a built-in or user-defined equation.
This means that Q is essentially an "exception-free" language. That is, an "error condition" such as "wrong argument type" does not raise an exception by default, but simply causes the offending expression to be returned "as is". For instance:
==> hd [] hd [] ==> sin (X+1) sin (X+1) |
You may be disconcerted by the fact that expressions like hd []
and sin (X+1) actually denote legal values in the Q
language. However, this is one of Q's key features as a term rewriting
programming language, and it adds a great amount of flexibility to the
language. Most importantly, it allows expressions, even expressions
involving variables, to be evaluated in a symbolic fashion. For
instance, you might add rules for algebraic identities and symbolic
differentiation to your script and have the interpreter simplify
expressions according to these rules. On the other hand, the Q language
also allows you to refine the definition of any built-in or user-defined
operation and thus it is a simple matter to add an "error rule" like
the following to your script when needed:
hd [] = error "hd: empty list"; |
(The error function is defined in the standard library script
error.q, see Diagnostics and Error Messages.) In order to
implement more elaborate error handling, you can also use Q's built-in
throw and catch functions to raise and respond to
exceptions on certain error conditions. This is discussed in
Exception Handling.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
There are a few error conditions which may cause the Q interpreter to
abort the evaluation with a runtime error message. A runtime error
arises when the interpreter runs out of memory or stack space, when the
user aborts an evaluation with the interrupt key (usually Ctl-C),
and when the condition part of an equation does not evaluate to a truth
value (true or false). (All these error conditions can
also be handled by the executing script, see Exception Handling.)
You can use the break on command to tell the interpreter that it
should fire up the debugger when one of the latter two conditions
arises. After Ctl-C, you can then resume evaluation in the
debugger, see Debugging. This is not possible if the interpreter
stumbles across an invalid condition part; in this case the debugger is
invoked merely to report the equation which caused the error, and to
inspect the call chain of the offending rule. Hitting the carriage
return key will return you to the interpreter's evaluation loop.
For instance, reload the fac.q script from Writing a Script:
fac N = N*fac(N-1) if N>0;
= 1 otherwise;
|
Now enable the debugger with the break command and type some
"garbage" like fac fac:
==> break on; fac fac ! Error in conditional 0> fac.q, line 1: fac fac ==> fac*fac (fac-1) if fac>0 (type ? for help) : |
What happened? Well, the debugger informs us that the error occurred when the interpreter tried to apply the first equation,
fac N = N*fac(N-1) if N>0; |
to the expression fac fac. In fact this is no big surprise since
we cannot expect the interpreter to know how to compare a symbol,
fac, with a number, 0, which it tried when evaluating the
condition part of the above equation. So let us return to the
interpreter's prompt by hitting the carriage return key:
: <CR> ==> |
The interpreter now waits for us to type the next expression. (For a summary
of debugger commands please refer to Debugging. You can also
type ? or help while in the debugger to obtain a list of
available commands.)
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The vocabulary of the Q programming language consists of notations for identifiers, operators, integers, floating point numbers, character strings, comments, a few reserved words which may not be used as identifiers, and some special punctuation symbols which are used as delimiters.
| 3.1 Whitespace and Comments | ||
| 3.2 Identifiers and Reserved Words | ||
| 3.3 Operator Symbols | ||
| 3.4 Numbers | ||
| 3.5 Strings | ||
| 3.6 Unicode Support |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Whitespace (blanks, tabs, newlines, form feeds) serves as a delimiter between adjacent symbols, but is otherwise ignored. Comments are treated like whitespace:
/* This is a comment ... */ |
Note that these comments cannot be nested. C++-style line-oriented comments are also supported:
// C++-style comment ... |
Furthermore, lines beginning with the #! symbol denote a special
type of comment which may be processed by the operating system's command
shell and the Q programming tools. On UNIX systems, this (odd) feature
allows you to execute Q scripts directly from the shell (by specifying
the q program as a command language processor) and to include
compiler and interpreter command line options in a script (see
Running Scripts from the Shell).
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Identifiers are denoted by the usual sequences of letters (including
`_') and digits, beginning with a letter. Upper- and lowercase is
distinct. In the Q language, identifiers are used to denote type,
function and variable symbols. Type identifiers may start
with either an uppercase or lowercase letter (the convention, however,
is to use an initial uppercase letter). The case of the first letter
matters, however, for function and variable symbols. As in Prolog, a
capitalized identifier (such as X, Xmax and XMAX)
indicates a variable symbol, whereas identifiers starting with a
lowercase letter denote function symbols (unless they are declared as
"free" variables, see below). Unlike in Prolog, the underscore
`_' counts as a lowercase letter, hence _MAX is a
function symbol, not a variable. (The idea behind this is that it allows
you to get a function symbol which appears to start with an uppercase
letter by stropping it with an initial `_'.) However, as an
exception to the general rule, the identifier `_' does
denote a variable symbol, the so-called anonymous variable. The
same rules also apply to symbols created interactively in the
interpreter.
Variables actually come in two flavours: bound and free variables, i.e., variables which also occur on the left-hand side of an equation, and variables which only occur on the right-hand side and/or in the condition part of an equation. Identifiers may also be declared as free variables; see Declarations. In this case, they may also start with a lowercase letter.
Type, function and free variable identifiers may also be qualified
with a module identifier prefix (cf. Scripts and Modules),
to specifically denote a symbol of the given module. Such a qualified
identifier takes the form module-id::identifier; no
whitespace or comments are allowed between the module name, `::'
symbol and the function or variable identifier.
Formally, the syntax of identifiers is described by the following grammatical rules:
identifier : unqualified-identifier
| qualified-identifier
qualified-identifier : module-identifier '::'
unqualified-identifier
unqualified-identifier : variable-identifier
| function-identifier
| type-identifier
module-identifier : letter {letter|digitsym}
type-identifier : letter {letter|digitsym}
variable-identifier : uppercase-letter {letter|digitsym}
| '_'
function-identifier : lowercase-letter {letter|digitsym}
letter : uppercase-letter|lowercase-letter
uppercase-letter : 'A'|...|'Z'|uppercase unicode letter
lowercase-letter : 'a'|...|'z'|'_'|lowercase unicode letter
digitsym : '0'|...|'9'|unicode digit
|
(Please refer to Q Language Grammar, for a description of the BNF grammar notation used throughout this document.)
The following symbols are reserved words of the Q language and may not be used as identifiers:
as const def else extern from if import include otherwise private public special then type undef var virtual where |
In addition, the following identifiers are predeclared as operator symbols (see Operator Symbols) and cannot be used as normal identifiers either:
and div mod not or |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Operator symbols may either take the form of function identifiers or they may be sequences of punctuation characters (excluding `_' which serves as a letter in the Q language). Like function and free variable symbols, they may be qualified with a module identifier prefix:
op : unary-op|binary-op
unary-op : opsym
binary-op : opsym|'and' 'then'|'or' 'else'
opsym : unqualified-opsym
| qualified-opsym
qualified-opsym : module-identifier '::'
unqualified-opsym
unqualified-opsym : function-identifier
| punctsym {punctsym}
punctsym : unicode punctuation character
|
In either case, operator symbols must be declared explicitly before they can be used, cf. Declarations. The declaration determines the precedence and "fixity" of the operator (i.e., whether it acts as a unary prefix or a binary infix operator; see User-Defined Operators, for more information on this).
As already mentioned, the following identifiers are predefined as built-in operators:
and div mod not or |
As indicated by the syntax rules, the logical connectives `and' and `or' can also be combined with the keywords `then' and `else' to form the "short-circuit" connectives `and then' and `or else'.
The punctuation symbols predefined as built-in operators are the following:
` ' ~ & . || < > = <= >= <> == ++ + - * / ^ ! # $ |
Most of these can actually be redeclared by the programmer for his own purposes. However, there are a few symbols, called "soft delimiters", which play a special role in the syntax of the Q language (some of these are also used as operator symbols):
~ . .. : | = == - \ @ |
The soft delimiters may occur inside user-defined operators, but if they are used as separate lexemes then they are treated like reserved words and thus they may not be declared as operator symbols (except for the purpose of making aliases, cf. Declarations). The same applies to the reserved keywords (see Identifiers and Reserved Words).
Moreover, the following special symbols serve as "hard delimiters" in the Q language which always separate lexemes and thus may not occur inside operator symbols at all:
" , ; :: ( ) [ ] { }
|
The same applies to whitespace and other non-printable characters, as well as the comment delimiters (`//' and `/*' as well as initial `#!' on a script line).
Symbols consisting of punctuation are generally parsed using the "longest possible lexeme" a.k.a. "maximal munch" rule. Here, the "longest possible lexeme" refers to the longest prefix of the input such that the sequence of punctuation characters either forms a valid (i.e., declared) operator symbol, or one of the reserved and special delimiter symbols listed above. Thus, e.g., `..#' will usually be parsed as `.. #', i.e., a reserved `..' symbol followed by a `#' operator. This holds unless the entire sequence `..#' has already been declared as an operator in the scope where it is used.
The only exception to the "maximal munch" rule occurs inside declarations where a new operator symbol is being introduced. In this case the symbol extends up to the next hard delimiter symbol (usually `)' or `;') or whitespace/non-printable character in the input. For instance, the following Q code snippet declares a new binary operator symbol `+~%' (please refer to Declarations, for an explanation of the declaration syntax):
public (+~%) X Y; |
Another special case arises with conglomerates of three or more
consecutive `:' symbols. In this case the initial `::' is
always treated as the designation of a qualified (operator) symbol. (In
the unlikely case that you ever need to specify a "guarded variable"
of the form X : ::Type with a qualified type in the built-in
namespace, the space between the `:' token and the following
`::' qualification designator is mandatory.)
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Signed decimal numeric constants are denoted by sequences of decimal digits (only the standard digits 0..9 in the ASCII character set may be used here) and may contain a decimal point and/or a scaling factor. Integers can also be denoted in octal or hexadecimal, using the same syntax as in C:
number : ['-'] unsigned-number
unsigned-number : '0' octdigitseq
| '0x' hexdigitseq
| '0X' hexdigitseq
| digitseq ['.' [digitseq]] [scalefact]
| [digitseq] '.' digitseq [scalefact]
digitseq : digit {digit}
octdigitseq : octdigit {octdigit}
hexdigitseq : hexdigit {hexdigit}
scalefact : 'E' ['-'] digitseq
| 'e' ['-'] digitseq
digit : '0'|...|'9'
octdigit : '0'|...|'7'
hexdigit : '0'|...|'9'|'a'|...|'f'|'A'|...|'F'
|
Simple digit sequences without decimal point and scaling factor are treated as integers; if the sequence starts with `0' or `0x'/`0X' then it denotes an integer in octal or hexadecimal base, respectively. Other numbers denote (decimal) floating point values. If a decimal point is present, it must be preceded or followed by at least one digit. Both the scaling factor and the number itself may be prefixed with a minus sign. (Syntactically, the minus sign in front of a number is interpreted as unary minus, cf. Expressions. However, if unary minus occurs in front of a number, it is interpreted as a part of the number and is taken to denote a negative value. See the remarks concerning unary minus in Expressions.) Some examples:
0 -187326 0.0 -.05 3.1415e3 -1E-10 0177 0xaf -0XFFFF |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
String constants are written as character sequences enclosed in double quotes:
string : '"' {char} '"'
char : any character but newline and |
To include newlines, double quotes and other (non-printable) characters in a string, the following escape sequences may be used:
\n newline \r carriage return \t tab \b backspace \f form feed \" double quote \\ backslash |
Furthermore, a character may also be denoted in the form \N,
where N is the character number in decimal, hexadecimal or octal
(using the same syntax as for unsigned integer values). Note that the
character number may consist of an arbitrary number of digits; the
resulting value will be taken modulo the size of the local character
set, which is 256 in the case of ASCII-only systems and 0x110000 if the
interpreter has been built with Unicode support (see Unicode Support). Optionally, the character number may also be enclosed in
parentheses; this makes it possible to specify a string in which a
character escape is immediately followed by another character which
happens to be a valid digit, as in "\(123)4".
As of version 7.11 and later, the interpreter also supports symbolic character escapes of the form `\&name;', where name is any of the XML single character entity names specified in the "XML Entity definitions for Characters", see http://www.w3.org/TR/xml-entity-names/. Note that, at the time of this writing, this is still a W3C working draft, so the supported entity names may be subject to change until the final specification comes out; the currently supported entities are described in the draft from 14 December 2007, see http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/. Also note that multi-character entities are not supported in this implementation.
A string may also be continued across lines by putting the \
character immediately before the end of the line, which causes the
following newline character to be ignored.
As of Q 7.0, it is a syntax error if a character escape is not recognized as either a numeric escape or one of the special escapes listed above.
Some examples:
"" empty string "A" single character string "\27" ASCII escape character (ASCII 27) "\033" same in octal "\0x1b" same in hexadecimal "\(0x1b)c" ASCII escape followed by a literal `c' character "Gr\äf" German umlaut, using XML entity escape "a string" multiple character string "a \"quoted\" string" include double quotes "a line\n" include newline "a very \ continue across line end long line\n" |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Since version 7.0 the Q interpreter has full Unicode support. This means that identifiers may actually contain arbitrary alphanumeric letter and digit symbols from the entire Unicode character set. (All non-uppercase alphabetical letters are considered as lowercase letters which may begin a function identifier.) Likewise, operator symbols may consist of arbitrary punctuation characters in the Unicode character set. (We refrain from giving any concrete examples here, to keep this document 7-bit clean.)
Moreover, all string values are internally represented using the UTF-8
encoding, which allows you to represent all characters in the Unicode
character set, while retaining backward compatibility with 7 bit
ASCII. String literals in the source script will be translated from the
system encoding to UTF-8 automatically and transparently, and the
\N notation may specify any valid Unicode point. For this to
work, the interpreter must have been built with Unicode support enabled
(which is the default).
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The basic compilation unit in the Q language is the script, which is simply a (possibly empty) sequence of declarations and definitions in a single source file:
script : {declaration|definition}
|
In order to make a given set of definitions available for use in the interpreter, you can just collect the relevant declarations, variable definitions and equations in a script which you submit to the interpreter for execution. The interpreter in turn invokes the compiler to translate the script into a bytecode file which is loaded by the interpreter (see Using Q). A script may also be empty, in which case it does not provide any definitions at all.
If you put all your definitions into a single script, that's all there is to it. However, you will often want to break down a larger script into a collection of smaller units which can be managed separately and which are linked together by the compiler into a single bytecode file. To these ends the Q language allows you to put your definitions into several separate script files, which are also called modules. To gain access to the function, variable and type symbols provided by another module, you must then use an import or include declaration, using the following syntax:
declaration : unqualified-import
| qualified-import
unqualified-import : 'import' module-spec {',' module-spec} ';'
| 'include' module-spec {',' module-spec} ';'
qualified-import : 'from' module-spec 'import' [symbol-specs] ';'
| 'from' module-spec 'include' [symbol-specs] ';'
module-spec : module-name ['as' unqualified-identifier]
module-name : unqualified-identifier
| string
symbol-specs : symbol-spec {',' symbol-spec}
symbol-spec : unqualified-identifier ['as' unqualified-identifier]
| unqualified-opsym ['as' unqualified-opsym]
|
As of version 7.8, Q supports both unqualified and qualified import clauses. The former allows you to quickly import an entire collection of modules, while the latter gives you precise control over which symbols are to be imported from which modules. We describe each of these in turn.
| 4.1 Unqualified Imports | ||
| 4.2 Qualified Imports | ||
| 4.3 Implicit Imports and the Prelude | ||
| 4.4 The Global Namespace |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Easiest first, let as take a look at unqualified imports. The following
declaration imports two modules foo and bar:
import foo, bar; |
If you put such declarations into your "main script" and then run the script with the interpreter, the imported modules will be loaded as well.
To determine which functions, variables and types are actually available to be imported into another script, the Q language allows you to declare symbols either as "public" or "private", see Declarations. Outside a module, only the public symbols are accessible when the module is imported in another module.
Normally import is not "transitive", i.e., importing a module
foo does not automatically give you access to the symbols
imported by foo, but only to the public symbols declared in
foo itself. To work around this, instead of import you
can also use an include declaration which causes all imports and
includes of the included module to be "reexported". For instance, let
us consider the case that module foo includes module bar:
include bar; |
Then by importing module foo you also gain access to the public
symbols of module bar (and, recursively, to the modules included
by bar, etc.). This provides a means to group different modules
together in one larger "umbrella" module. For instance, the standard
prelude script (cf. The Standard Library) simply includes most of
the other standard library modules.
In the Q language, each module has its own separate namespace. This
means that two different modules, say foo1 and foo2, may
both provide their own public function symbol named foo. If both
foo1 and foo2 are imported in the same script, you can
distinguish between the two by using a qualified identifier, using
the module name as a prefix, e.g., foo1::foo or
foo2::foo. (Writing simply foo in this case will produce
an error message because the reference is ambiguous.)
Import and include declarations can occur anywhere in a script (and will
be active from this point on up to the end of the script file), but it
is common (and recommended) practice to put them near the beginning of
the script. As in most other programming languages, module dependencies
must always be acyclic. If the compiler finds a cyclic chain of
import and include declarations, such as a module
importing itself or a module foo importing a module bar
which in turn imports foo again, it produces an error message.
Another caveat: When using unqualified import clauses, it is much too
easy to accidentally "reuse" an imported symbol because of a missing
local symbol declaration. For instance, if you import a module
foo which happens to export a symbol bar, and then you
later define a function bar in your script, your definition will
use the imported bar symbol. This is perfectly legal in Q, and
may be intended, but if it's not then you have to explicitly declare a
new local bar symbol after the import clause,
cf. Declarations. If you forget this, you will errorneously
redefine the existing bar function instead. One way to avoid this
is to always use qualified imports, as described below. Another
possibility is to run the interpreter with the --pedantic option,
in which case it will warn you about symbols from unqualified imports
which are referred to with unqualified identifiers, see Running Compiler and Interpreter, for details.
As indicated in the syntax rules, the names of the modules to be imported can either be specified using an unqualified identifier, or a string denoting a full or relative pathname, e.g.:
import "/home/ag/q/stuff.q"; |
In this case, you can also omit the `.q' suffix, it will be
supplied automatically when necessary. Moreover, the module identifier
is automatically the basename of the script filename (i.e., stuff
in the above example); therefore, it is a good idea to choose basenames
which are legal Q identifiers so that you can use them in qualified
symbols.
If no absolute path is specified, the interpreter locates script files using its "search path", which usually contains the current directory and other system-dependent or user-defined locations where "library" scripts are kept, see Using Q, for details.
The `as' keyword can be used to explicitly specify an unambiguous module name. This is useful, in particular, if the basename of the module is not a valid identifier, and for the purpose of resolving name clashes. For instance:
import "my/stdlib.q" as mystdlib; |
Simply importing "my/stdlib.q" under its own name in this case
would produce an error message, because the name of the module collides
with the standard library module of the same name, and the compiler
enforces that all module names are unique.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
A qualified import takes the form:
from foo import foo, BAR; |
This specifically imports the symbols foo and BAR from
module foo, and nothing else. What this actually does is to
"redeclare" the imported symbols in the namespace of the importing
module, as explained in Cross-Checking of Declarations and Aliases, so that they look like private symbols of the importing
module. Thus you can refer to them, e.g., simply as foo or with
the qualified name gnats::foo, where gnats is the name of
the importing module. Note that foo::foo will not
work in this case, because a qualified import does not bring the
module itself into scope. (However, as pointed out below you can use
both a qualified and an unqualified import in concert to make
both gnats::foo and foo::foo work.)
A note on terminology: In Q parlance, the terms qualified import and unqualified import apply to the import clauses themselves; a qualified import clause is called "qualified" simply because it restricts the set of imported symbols. Unqualified and qualified identifiers can be used with both styles of import clauses alike.
As with unqualified import, there is a second kind of qualified import clause which also reexports the imported symbols, by making them public members of the importing module:
from foo include foo, BAR; |
For convenience, you can also write:
from module import; |
or
from module include; |
without listing any symbols to just import or include all symbols of the given module qualified. (It goes without saying that this should be used with care.) This is roughly equivalent to `import module' or `include module', respectively, but the imported symbols are made available as members of the namespace of the client, not the imported module.
If this sounds a bit confusing, here is an example which should clarify
the differences between unqualified and qualified imports. Let's assume
the following three modules A, B and C:
/* A: */ import B; /* B: */ include C; /* C: */ public foo; |
Using this setup, the symbol foo is available as either just
`foo' or as `C::foo' in all of A, B and
C. Now suppose we change module B to use a qualified
`include' instead:
/* B: */ from C include foo; /* or just: from C include; */ |
C's namespace hasn't changed, of course, but in both A and
B the symbol foo is now available as `foo' or
`B::foo', but not as `C::foo' anymore. However, you can
also combine unqualified and qualified imports, like this:
/* B: */ include C; from C include foo; |
Now the symbol foo is available as either `foo',
`B::foo' or `C::foo' in B and C (and the
compiler will recognize it as the same symbol no matter which notation
you use).
Finally, note that since symbols imported using a qualified clause
become members of the importing namespace, they must not collide with
other symbols declared either locally or in another qualified import
clause. Thus, if you have two modules foo1 and foo2 which
both export their own (local) symbol foo, the following will
provoke an error message from the compiler because of the clash between
the two different foo symbols:
from foo1 import foo; from foo2 import foo; |
In such cases you must either use unqualified import, or employ an `as' clause to rename the symbols while importing them. E.g.:
from foo1 import foo as foo1; from foo2 import foo as foo2; |
Note that this is only a problem if the imported symbols are really distinct. If the symbols are just different "aliases" of the same symbol defined elsewhere, then you can just import them under the same name into the same scope without any problems.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Actually, there is yet another kind of import, namely the implicit
import of the prelude.q script at the beginning of each script,
which in turn includes most standard library modules (see The Standard Library) and makes them readily available in your program,
without having to use an explicit import declaration. When looking up an
unqualified symbol in a given script, the compiler first searches for a
symbol defined in that script, then for a symbol in an imported or
included module, then for a symbol defined by the prelude and its
includes, and finally for a built-in symbol.
You can instruct the compiler to inhibit the default import of the
prelude, by using the --no-prelude option, see Using Q. Moreover, you can also override the standard prelude with your own,
if it occurs on the search path before the standard prelude, see
Setting up your Environment.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The namespace available in the interpreter is always that of the main
script, i.e., the script given on the command line. This namespace also
includes the built-in namespace which contains all the built-in
symbols, as well as the implicit imports, i.e., the prelude and all its
includes (unless the --no-prelude option was used). The built-in
namespace can be accessed explicitly by using an empty module qualifier,
as in ::sin. Qualified prelude symbols use the corresponding
module name as the qualifier (e.g., stdlib::cat), just as with
explicit imports.
The interpreter also allows you to dynamically import additional modules
in the global scope using the import command, see Command Language. Moreover, to facilitate testing and debugging, in the
interpreter it is possible to gain access to all public and
private symbols of the program (also in modules not directly imported in
the main script) using qualified identifiers.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The Q language allows you to declare function and (free) variable
symbols explicitly, by means of the syntactic constructs discussed in
this chapter. Symbol declarations are mostly optional; if you introduce
a new symbol without declaring it, the compiler declares it for
you. However, you will sometimes wish to ensure that a new symbol is
created to override a symbol of the same name from an imported module,
or you want to attach special attributes to a symbol, and then you have
to use an explicit declaration. Explicit declarations are also needed to
introduce new operator and type symbols. Moreover, while implicit
declarations are often convenient when you want to get something done
quickly, for larger projects you might prefer to play it safe and ensure
that each function symbol actually has a proper explicit declaration. In
such cases you can run the compiler with the --pedantic option
(see Running Compiler and Interpreter) to warn you about
undeclared occurrences of an identifier.
| 5.1 Declaration Syntax | ||
| 5.2 Operator Declarations | ||
| 5.3 Cross-Checking of Declarations and Aliases | ||
| 5.4 Type Declarations |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Syntactically, symbol declarations take the following form:
declaration : prefix headers ';'
| [scope] 'type' unqualified-identifier
[':' identifier] ['=' sections] ';'
| [scope] 'extern' 'type' unqualified-identifier
[':' identifier] ['=' sections] ';'
| [scope] 'type' qualified-identifier
['as' unqualified-identifier] ';'
| [scope] 'type' unqualified-identifier
'==' identifier ';'
prefix : scope
| [scope] modifier {modifier}
scope : 'private'|'public'
modifier : 'const'|'special'|'extern'|'var'|'virtual'
headers : header {',' header}
header : unqualified-identifier '=' expression
| unqualified-identifier
{['~'] variable-identifier}
| qualified-identifier
{['~'] variable-identifier}
['as' unqualified-identifier]
| '(' unqualified-opsym ')'
{['~'] variable-identifier}
['@' precedence]
| '(' qualified-opsym ')'
{['~'] variable-identifier}
['@' precedence]
['as' unqualified-opsym]
precedence : unsigned-number
| '(' operator ')'
sections : section {'|' section}
section : [prefix] headers
|
For instance, the following are all valid symbol declarations:
public foo; private extern bar X; public special lambda X Y; special cond::ifelse ~P X Y as myifelse; public (--) Xs Ys; private (::++) Xs Ys as concat; const red, green, blue; var FOO, BAR; var const e = exp 1; type Day = const sun, mon, tue, wed, thu, fri, sat; public type BinTree = virtual bintree Xs | private const nil, bin X T1 T2; |
The keywords private and public specify the scope of
a symbol. Public symbols are accessible outside a script, and can
be imported by other modules (see Scripts and Modules). If the
keywords private and public are omitted, the scope
defaults to private.
The special keyword serves to declare special forms, which
are described in Special Forms. Function symbols declared with
const introduce "constant" or "constructor" symbols; the
compiler enforces that expressions created with such symbols are not
redefined in an equation (cf. Equations). The built-in constants
true, false, [] and () are already
predeclared as const. The virtual keyword serves to
declare "virtual constructors" which can be used to define concrete
representations (so-called "views") of abstract data types, see
Views. Non-const function symbols can also be declared as
extern, meaning that the corresponding function is actually
implemented by a corresponding "external" module, see C Language Interface.
The var keyword allows you to declare "free" variable symbols,
which can be assigned a value by means of a variable definition,
see Free Variables. If you use such a declaration, the variable
symbol may also start with a lowercase letter (if it has not already
been declared as a function symbol); note that without such a
declaration, an identifier starting with a lowercase letter will
implicitly be declared as a function symbol. (The idea behind this is
that you can make a variable look like a function symbol, if the
variable is actually supposed to be used for function values.)
Variable symbols can also be declared as const; in this case the
variable can only be assigned to once and always refers to the
same value once it has been defined. The built-in variables
INPUT, OUTPUT, ERROR and ARGS are
predeclared as const, see Command Language. A variable
declaration may also contain an "initializer" of the form = X,
where X is an expression. This has the effect of declaring and
defining the variable in a single step.
As indicated, the scope-modifier prefix of a declaration is followed by
a comma-separated list of headers. The first identifier in each
header states the symbol to be declared. In function symbol declarations
the function identifier may be followed by a list of variable
identifiers which are used to denote the arguments of the declared
function symbol. The variables are effectively treated as comments; only
their number (called the arity of a function symbol) is recorded
to check the consistency of different declarations of the same symbol,
and to specify the number of "special" arguments in a special
declaration. In special declarations variables may also be
prefixed with `~' to declare them as "non-special" arguments; see
Special Forms.
The first declaration of an unqualified identifier or operator symbol in
a module always introduces a new symbol in the module's namespace. By
default (if no explicit declaration precedes the first use of a new,
unqualified symbol), the compiler automatically declares it as a
private nullary function or variable symbol, depending on the
case of the initial letter of the identifier (as already mentioned in
Lexical Matters, capitalized identifiers are interpreted as
variable symbols, others as function symbols).
| [ |