[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The notion of expressions is an intrinsic part of most, if not all, programming languages. However, in most programming languages expressions are merely treated as descriptions of computations to be performed in order to obtain the value of an expression, which usually belongs to some basic type provided by the language. In contrast, in the Q language expressions are objects in their own right which, as we shall see, are manipulated according to computational rules specified by the programmer.
As usual, expressions are built from certain kinds of basic objects. In the Q language, these are integers, floating point numbers, strings, function and variable symbols. These objects are combined into larger, compound expressions by means of applications, tuples, lists and streams. For convenience, the Q language also has prefix, infix and mixfix operator symbols for the most common operations; these are actually treated as "syntactic sugar" for applications. Syntactic sugar is also provided for list, tuple and stream enumerations and comprehensions. The syntax of all these entities is described by the following grammatical rules:
expression : identifier | 'var' unqualified-identifier | variable-identifier ':' identifier | number | string | expression expression | unary-op expression | expression binary-op expression | 'if' expression 'then' expression ['else' expression] | '\' expression { expression } '.' expression | '(' [element-list|enumeration|comprehension] ')' | '[' [element-list|enumeration|comprehension] ']' | '{' [element-list|enumeration|comprehension] '}' | '(' op ')' | '(' expression binary-op ')' | '(' binary-op expression ')' element-list : expression-list [',' | ';' | '|' expression] enumeration : expression-list '..' [expression] comprehension : expression ':' expression-list expression-list : expression {',' expression} | expression {';' expression} |
6.1 Constants and Variables | ||
6.2 Applications | ||
6.3 Lists, Streams and Tuples | ||
6.4 Built-In Operators | ||
6.5 User-Defined Operators |
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The nonterminals identifier
, op
, unary-op
,
binary-op
, number
and string
occuring in the syntax
rules above refer to the lexical entities introduced in Lexical Matters. Note that the Q language actually distinguishes two different
types of identifiers (function and variable identifiers), two different
kinds of operator symbols (unary and binary) and two different types of
numeric quantities (integers and floating point numbers). Integers are
implemented as "bignums" using the GNU multiprecision package (GMP);
thus the size of an integer value is only limited by available
memory. Floating point values are implemented using 64 bit (i.e., double
precision) floating point numbers; on most machines, these should
provide nonzero absolute values ranging from 1.7E-308 to 1.7E308 and a
precision of 15 decimal digits.
String constants are character sequences internally represented as character
arrays permitting constant-time access to the individual characters of a
string. There is no a priori length restriction for string constants. In the
current implementation, the null character \0
is reserved as a string
terminator and must not be used as an ordinary string character.
Furthermore, the Q language provides five other built-in constants which
denote the truth values (true
and false
), and the empty
list, stream and tuple ([]
, {}
and ()
, to be
discussed in Lists, Streams and Tuples).
A variable symbol can be "bound" or "free", depending on the context
in which it occurs. We say that a variable is bound in an equation
if it occurs on the left-hand side of the equation. Otherwise, the
variable is a free variable. In this case, the variable may denote
a value (introduced with def
), or simply stands for itself. In
any case, a variable is a legal atomic expression which may stand
wherever a constant is allowed.
On the left-hand side of equations, it is possible to declare
that a variable is of a given type by using the notation:
variable:type
. This requires that you have previously
declared type as a type identifier, see Declarations. When a
type identifier is specified with a variable, the variable will only
match values belonging to the given type, cf. Type Guards.
On the right-hand side of equations, an unqualified identifier
can also be prefixed with the keyword var
to indicate that the
identifier is to be treated as a free variable symbol. Such inline
variable declarations can be used to escape free variables (if the same
identifier is also introduced as a bound variable on the left-hand side
of the equation) and to declare free variable symbols on the fly, see
Local Variables, for details.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Application is probably the most important single construct in the Q
language. It allows you to apply one object (the "function") to
another one (the "argument"). This construct is used to denote
"function calls" such as sqrt 2
as well as "constructor
terms" such as bin X T1 T2
which encode tree-like data
structures. Also, as we will see in Built-In Operators, operators
like +
, -
, etc. are merely syntactic sugar for
applications.
As in other contemporary functional languages, application is a binary
operation written simply as juxtaposition: X Y
denotes the
application of X
to Y
, both of which may be arbitrary
expressions themselves. Application associates to the left; the
construct X Y1 … Yn
is equivalent to (…((X Y1)
Y2) …) Yn
, and denotes the application of X
to n arguments
Y1
, …, Yn
. This style of writing function
applications is commonly referred to as currying, after the
American logician H.B. Curry. We will have to say more about this
shortly.
Here are some valid examples of applications:
sqrt 2 sqrt (Y+1) foo X (bar (Y-1)) Z |
Note that since application is left-associative, nested applications in
arguments must be set off with parentheses. For instance, foo (bar X)
applies foo
to bar X
, whereas foo bar X
applies
foo
to two arguments bar
and X
.
Since currying is so ubiquitous in functional programming, you should be well familiar with it, so let us briefly explain what it is, and what it is good for.
Functions of multiple arguments can generally be defined in two different ways:
max
, which computes the maximum of two values, as follows:
max (X,Y) = X if X>Y; = Y otherwise; |
max
function:
max X Y = X if X>Y; = Y otherwise; |
Here max X Y
is to be read as (max X) Y
, meaning that by
applying max
to the first argument X
, we obtain another
function max X
, which, when applied to the second argument
Y
, yields the maximum of X
and Y
. (Incidentally,
this curried form is also the way that the max
function is
actually defined in the standard library. In fact, most built-in and
standard library functions use the curried form as it is more flexible.)
The Q language supports both kinds of notations. Choosing between the
two is more than just a matter of taste. Besides saving parentheses,
curried functions provide a cheap way to derive new functions through
"partial applications" of a multi-argument function. For instance,
given the curried definition of max
from above, max 0
can
be used to denote the function computing the "nonnegative part" of its
argument (which is the argument itself if it is nonnegative, and zero
otherwise). This does not work with the uncurried definition since that
definition requires us to specify both arguments of max
in one
chunk; instead, we would have to start defining the derived function
from scratch.
Uncurried definitions also have their merits. In particular, they allow
us to define variadic functions, i.e., functions which can handle
a varying number of arguments. For instance, the following definition
enables us to apply the max
function to both pairs and triples:
max (X,Y) = X if X>=Y; = Y otherwise; max (X,Y,Z) = max (X,max (Y,Z)) |
In fact, in the Q language it is possible to define generic functions
which apply to any number of arguments specified as a tuple. For
instance, the following version of max
handles tuples of any size
at least 2:
max (X,Y) = X if X>=Y; = Y otherwise; max (X,Y|Zs) = max (X,max (Y|Zs)); |
(In the above example, the notation (…|Zs)
is used to
denote a tuple with "tail" Zs
. This is explained in the
following section.)
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The Q language provides three closely related general constructs for representing sequences of objects: lists, streams and tuples.
We start out with a discussion of the list data structure. The
constant []
denotes the empty list. In general, a list consisting
of elements X1
, X2
, …, Xn
is denoted
[X1,X2,…,Xn]
. For instance, [a,b,c]
consists of
three elements (symbols) a
, b
and c
. In contrast to
statically typed languages like Haskell and ML, list members may have
different types; e.g., [a,10,"a string"]
is a perfectly
well-formed list consisting of a symbol, a number and a string. It is
also possible to have nested lists, as in [a,[b,c]]
which
consists of two elements, the symbol a
and the list [b,c]
.
You can also have a trailing comma in a list, as in [a,b,c,]
.
The trailing comma is ignored, so the above is exactly the same as
[a,b,c]
. This makes it possible to format a list as follows,
which lets you add new items at the end more easily:
def ITEMS = [ "This is the first item", "This is the second item", … "This is the last item", ]; |
As in Prolog, lists are actually represented in a right-recursive
fashion using a binary constructor [|]
which takes as its
arguments the head element of the list and the list of the
remaining elements, called the tail of the list. Thus, e.g.,
[a,b,c]
is simply a shorthand notation for [a|[b,c]]
,
which denotes a list with head a
and tail [b,c]
(which is
in turn a shorthand for [b|[c]]
, etc.). You can also mix both
styles of notation; for instance, [a,b|[c,d]]
is another way to
represent the 4-element list [a,b,c,d]
.
Note that [a|[b,c]]
is different from [a,[b,c]]
: the
former denotes a three-element list, while the latter is a two-element
list whose second member happens to be a list itself. Also note that the
[|]
constructor can in fact be applied to any pair of values (the
second value does not have to be a list); e.g., [a|b]
is a
perfectly well-formed expression (although the built-in length, indexing
and concatenation operations described in String, List and Tuple Operators, will fail on such values).
Q also provides so-called enumerations to construct lists from a
range of values. These usually take the form
[X1,X2,…,Xn..Y]
which is just syntactic sugar for the
application enum [X1,X2,…,Xn] Y
involving the built-in
enum
function, see Enumeration Types, for
details. Syntactic sugar is also provided for list comprehensions,
which are discussed in Conditionals and Comprehensions. Analogous
constructs are also provided for streams and tuples.
Streams look exactly like lists except that they are written with
curly braces. The difference between lists and streams is that the head
and tail of a stream are not evaluated until their values are actually
required during the course of a computation. For instance, you will find
that if you enter an expression like {1+1,2+2,3+3}
in the
interpreter, then the value of that expression will be just the
expression itself, because the embedded applications of the `+'
operator are not evaluated:
==> {1+1,2+2,3+3} {1+1,2+2,3+3} |
In contrast, if you enter the same sequence as a list, its elements will be evaluated immediately:
==> [1+1,2+2,3+3] [2,4,6] |
A stream element will be evaluated automatically when it is accessed, e.g., using the index operator:
==> {1+1,2+2,3+3}!1 4 |
This is also known as lazy or non-strict evaluation: the evaluation of the stream elements is deferred until they are actually needed. In fact, the entire tail of a stream may be deferred, which makes it possible to create infinite sequences whose members are produced "on the fly" while traversing the sequence.
Streams are implemented in terms of so-called "special forms" which
will be introduced in Special Forms. We therefore postpone a
closer description of this data structure until Special Constructors and Streams. Also note that the Q language itself only
defines the stream constructors; most other stream functions are
actually implemented by the standard library script stream.q
, see
Streams.
Tuples also work in much the same fashion as lists: The constant
()
denotes the empty tuple, and a tuple consisting of elements
X1
, X2
, …, Xn
is written as
(X1,X2,…,Xn)
, which is equivalent to
(X1|(X2|…|(Xn|())))
, where the notation (X|Xs)
denotes a tuple consisting of a first element X
and the tuple
Xs
of the remaining elements. A trailing comma may follow the
last tuple element, so that (a,b,c,)
is the same as
(a,b,c)
. The trailing comma is optional in all cases except one:
As in the Python programming language, 1-tuples are indicated with a
trailing comma, to distinguish them from ordinary parenthesized
expressions. Hence (a,)
denotes the 1-tuple with the single
member a
, while (a)
just denotes a
itself.
Note that in contrast to languages like ML and Haskell, tuples are not
actually needed to represent sequences of values with different types,
as Q's list data structure already allows you to do that. Also, Q's
tuples are a lot more flexible than in ML and Haskell, as they can be
handled mostly like list values. The big difference between lists and
tuples in the Q language is that, while lists are always represented as
recursive data objects using a binary constructor symbol (just the way
that they are written), tuples are actually implemented as "vectors"
which are stored as contiguous sequences in memory. (This only works for
"well-formed" tuples. If the tail Xs
of a tuple (X|Xs)
is not a tuple, then this value can only be represented using an
explicit application of the tuple constructor.) Therefore tuples
normally use up much less memory than lists of the same size, and they
also allow constant-time access to their members. The size of a tuple
can be determined in constant time as well. In contrast, the same
operations, when applied to a list, require time proportional to the
size of the list. On the other hand, lists are more efficient when
accessing the tail of a list and when a new element is prepended to a
list, which can both be done in constant time. Here a tuple needs time
proportional to its size, since the member sequence of the original
tuple must be copied when accessing its tail or when prepending an
element.
These tradeoffs should be carefully considered when deciding whether to
implement a given sequence as a list or a tuple. Tuples are usually the
best choice for implementing fixed sequences requiring fast random
access to its individual members, whereas lists provide the most
efficient representation if the elements are traversed and manipulated
sequentially. If necessary, you can easily convert between these two
representations using the built-in functions list
and
tuple
, see Conversion Functions.
Lists, streams and tuples are frequently combined to create nested
structures like a tuple of lists and/or streams, a list (or stream) of
lists, or a list (or stream, or tuple) of tuples. The latter construct
(sequences of tuples) is so common that Q provides special support for
it in the form of the following "grouping" syntax. The notation
[X1, X2, …; Y1, Y2, …; …]
is a shorthand for
[(X1, X2, …), (Y1, Y2, …), …]
. Note that there
must be at least one element in each group (but the number of elements
in different groups may vary), and there must be at least one group (a
single group is denoted with a trailing `;'). Thus, for instance,
[A,B;C,D;X,Y,Z]
is the same as [(A,B),(C,D),(X,Y,Z)]
, and
[X,Y,Z;]
is the same as [(X,Y,Z)]
. The same notation also
applies to streams and tuples.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Besides the forms of expressions already discussed above, the Q language also provides a collection of prefix, infix and mixfix operator symbols for common arithmetic, logical, relational and other operations. A complete list of these symbols is given below, in order of decreasing precedence:
' ` ~ &
quotation operators (unary)
.
function composition
^ !
exponentiation and subscript
- # not
prefix operators (unary)
* / div mod and and-then
multiplication operators
++ + - or or-else
addition operators
< > = <= >= <> ==
relational operators
$
infix application operator
if-then-else
conditional expressions
||
sequence operator
\X.Y
lambda abstractions
Most of these symbols have their usual meaning; a closer description
follows below. All binary operators are left-associative, with the
exception of ^
, !
and $
which associate to the
right, and the relational operators which are
nonassociative. Application takes precedence over all these operations
except the quotation operators; hence sqrt X^3
denotes
(sqrt X)^3
and not sqrt (X^3)
. The quotation
operators have the highest possible precedence, and hence bind stronger
than even applications. Parentheses are used to group expressions and
overwrite default precedences and associativity as usual. C programmers
will also note that the logical operators have the same "wrong"
precedence as in Pascal. Thus you should make sure that you always
parenthesize relational expressions when combining them with logical
connectives.
You should also note that unary minus must be parenthesized when
appearing in an argument of a function application. For instance,
although foo X -Y
is a perfectly well-formed expression, it
denotes (foo X) - Y
and not foo X (-Y)
as you might
have intended by the spacing which is however ignored by the Q
compiler. As already noted in Lexical Matters, unary minus in
front of a number is special; it causes values such as -2
to be
interpreted as negative numbers rather than denoting an explicit
application of the unary minus operator (an explicit application of
unary minus can be denoted using the built-in minus
symbol; see
below). The rules for parenthesizing negative numbers are the same as
those for unary minus; you have to write foo X (-2)
instead of
foo X -2
(which denotes (foo X) - 2
).
In the Q language, expressions involving prefix and infix operators are
merely syntactic sugar for applications. By enclosing such an operator in
parentheses, you can turn it into an ordinary prefix function. For
instance, X+Y
is exactly the same as (+) X Y
, and
not X
actually denotes the application (not) X
. The
built-in operators simply provide a more convenient way for applying
these operations to their arguments, which resembles common mathematical
notation. Moreover, you can also turn a binary operator into a unary
function by specifying either the left or the right operand. E.g.,
(1/)
denotes the reciprocal and (*2)
the doubling
function. Such constructs are commonly called operator
sections. Inside a section, the usual precedence and associativity
rules apply. For instance, the X+3
subterm in (*(X+3))
must be parenthesized because *
has a higher precedence
than +
, and hence the partial expression (*X+3)
is not
well-formed.
The -
operator plays a somewhat awkward role in the syntax, since
it is used to denote both unary and binary minus. Q adopts the
convention that the notation (-)
always denotes binary
minus; unary minus may be denoted by the built-in minus
function. Thus the expression minus X
applies unary minus to
X
. Note that this construct always denotes an explicit
application of the unary minus operation. For instance, minus 5
denotes the application of unary minus to the integer 5
, while
-5
is a negative integer. Also note that the construct
(-X)
is not an operator section, but a parenthesized
expression involving unary minus. The easiest way to construct an
operator section which subtracts a given value from its argument is to
formulate the function using the addition operator as in (+(-X))
.
Another special case are the mixfix operators if-then-else
(conditional expression) and \X.Y
(lambda abstraction). These are
just syntactic sugar for the standard library functions
ifelse
/when
and the built-in function lambda
, so no
special syntax is needed to turn them into prefix functions.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The '
(quote), `
(backquote), ~
(tilde) and
&
(ampersand) operators are used to deal with so-called
special forms. The quote operator quotes an expression as a
literal; it is a constructor symbol and hence becomes part of the quoted
expression. The backquote and tilde operators are used to "splice" and
"force" subterms in an expression. The ampersand operator causes
"special arguments" to be memoized. We postpone a discussion of these
operations until Special Forms.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The `.' operator denotes function composition: (F.G) X = F (G
X)
. For instance, to double a value and then add 1 to it, you can use
an expression like ((+1).(*2)) X
. Note the parentheses around the
composition - as application binds stronger than composition, F.G
X
is interpreted as F.(G X)
rather than (F.G) X
. If a
composition involves applications with integer arguments, extra
whitespace may be needed to distinguish between the `.' operator
and the decimal point; e.g., F 5 . G
denotes a composition of the
functions F 5
and G
, whereas F 5.G
is an
application of the function F
to two arguments, the floating
point value 5.
and the function G
. Also note that the
composition operator is associative, since (F.G.H) X = ((F.G).H) X
= F (G (H X)) = (F.(G.H)) X
.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The operators +
, -
, *
, /
denote the sum, the
difference, the product and the quotient of two numeric values,
respectively. As already noted, -
is also used as unary
minus. The operators div
and mod
denote integer quotient
and modulo, respectively; they only apply to integers. The ^
operator denotes exponentiation, as defined by X^Y = exp (ln
X*Y)
; it always returns a floating point value. (If X<0
then
X^Y
is defined only if Y
is an integer. 0^0
is left
undefined as well, so if you would like to have that 0^0 = 1
then
you must add corresponding equations yourself. Also note that the
complex.q
standard library module extends the built-in definition
of the exponentiation operator to handle the case that X<0
with
general exponent; see Complex Numbers.)
The argument and corresponding result types of these operations are
summarized in the following table (Int
denotes integer,
Float
floating point, and Real
integer or floating point
values):
+ - *
Int
Int
-> Int
Int
Float
-> Float
Float
Int
-> Float
Float
Float
-> Float
/ ^
Real
Real
-> Float
div mod
Int
Int
-> Int
- (unary)
Int
-> Int
Float
-> Float
Note that, as of Q 7.2, all floating point operations properly deal with
IEEE floating point infinities and NaN ("not a number")
values. Consequently, division by zero now yields one of the special
values inf
, -inf
or nan
, depending on the value of
the first operand. (Of course, this will only work as described on
systems with a proper IEEE floating point implementation. The
interpreter makes no attempt to emulate IEEE floating point values on
systems which do not provide them natively.)
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The operators <
, >
, =
, <=
, >=
,
<>
are binary predicates meaning "less", "greater",
"equal", "less or equal", "greater or equal" and "not equal",
respectively. The built-in definition of these operations only applies
to numbers, strings and truth values. All relational operators return
truth values (true
, false
) as results. Strings are
compared lexicographically, on the basis of the character encoding
(which, as of Q 7.0, usually is Unicode). Truth values are ordered by
false < true
.
If you would like to compare other types of values than the basic
objects mentioned above, normally you will have to provide suitable
definitions yourself. For instance, you might wish to extend the
equality operation to other built-in and user-defined data structures
such as lists, trees, etc., by "overloading" the =
operator
accordingly. The following equations implement an equality predicate on
lists (the parentheses on the left-hand side are necessary to prevent
the equality operation to be interpreted as the equal sign separating
left-hand and right-hand side of an equation):
([] = []) = true; ([] = [Y|Ys]) = false; ([X|Xs] = []) = false; ([X|Xs] = [Y|Ys]) = (X=Y) and then (Xs=Ys); |
Rules for other comparison operators (<>
, <=
, etc.) could
be added in the same fashion. Actually, the standard library provides
corresponding definitions; see The Standard Library.
A special case arises with types consisting of only nullary constructor
symbols declared with const
, so-called enumeration types,
see Types. The values of such a type can always be compared with
the relational operators, using the order in which they are listed in
the type declaration. (A special case of this is the order of the
built-in truth values.) For instance, assume that the Day
type is
declared as follows:
type Day = const sun, mon, tue, wed, thu, fri, sat; |
Then the listed constants will be ordered as sun < mon < … <
sat
.
Besides the equality predicate, the Q language also provides a notion of "syntactic" equality, implemented by the `==' operator, which applies to all kinds of expressions; see Non-Linear Equations, for details.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The logical operations not
, and
, or
denote logical
negation, conjunction and disjunction, respectively. These operators
take truth values as their arguments. They are defined in a
straightforward manner:
not true = false; not false = true; true and true = true; true and false = false; false and true = false; false and false = false; true or true = true; true or false = true; false or true = true; false or false = false; |
Like most other programming languages, Q also has logical connectives for
the short-circuit evaluation of logical expressions, which are
denoted by the operators and then
and or else
. These
operations are actually implemented as "special forms" which evaluate
their arguments from left to right only as far as required to determine
the value of the expression (cf. Special Forms). They are defined
by the following built-in equations:
true and then X = X; false and then X = false; false or else X = X; true or else X = true; |
The first operand is always evaluated. Depending on its value, the
second operand may not be evaluated at all. For instance, if X
evaluates to false
, then X and then Y
immediately
reduces to false
, without ever having to evaluate the second
argument Y
. On the other hand, if X
evaluates to
true
, then Y
is evaluated and returned as the value of the
expression. The or else
operation works analogously.
One reason for using short-circuit evaluation is efficiency: prevent unnecessary computations when an initial part of a logical expression already determines the value of the entire expression. Furthermore, short-circuit evaluation is sometimes essential to check a condition before evaluating an expression depending on the validity of this condition. For instance:
(X <> 0) and then (foo (1/X) > 0) |
You should also note that, according to the above definitions, X
and then Y
and X or else Y
are always reduced if X
is a truth value, even if the second operand Y
does not
evaluate to a truth value. This may look a bit weird at first, but it is
in fact the most reasonable way to implement short-circuit evaluation in
the Q language, since Q uses dynamic typing, and hence the type of an
expression is only known after it has been evaluated. In fact,
this "feature" can be quite useful at times. For instance, you can
also use and then
to write down a simple kind of conditional
expression like
check X and then bar X |
where bar X
is the value you wish to compute, but only if the
condition check X
holds.
The Q interpreter also uses the operators not
, and
and
or
to denote bitwise logical operations on integers. Thus,
not X
denotes the one's complement of an integer X
, and
X and Y
and X or Y
the bitwise logical conjunction and
disjunction of integers X
and Y
, respectively. These
operations behave as if negative integers were represented in two's
complement (although GMP actually uses a sign-magnitude
representation). This means that for each integer X
, -X =
not X + 1
, or, equivalently, not X = -X-1
. For instance:
==> 17 and not 13 16 ==> 17 or not 13 -13 ==> not _ 12 |
These results become clear when considering that 17
has bits 0
(=1) and 4 (=16) on (and all other bits off), while not 13
has
bits 0 (=1), 2 (=4) and 3 (=8) off (and all other bits on). Thus,
writing these numbers as unsigned bit sequences with most significant
bits first, where ...1
denotes an infinite sequence of leading
1's, we have:
17 and not 13 = 10001 and not 1101 = 10001 and ...10010 = 10000 = 16 17 or not 13 = 10001 or not 1101 = 10001 or ...10010 = ...10011 (= not 1100 = not 12) = -13 |
Together with the built-in bit shift operations, the bitwise logical operations can also be used to implement "bitsets", i.e., sets of nonnegative integers represented by bit patterns. See Arithmetic Functions, for details.
In case you are wondering about "exclusive or:" While this operator is not provided as a builtin, you can easily define it yourself as follows:
public (xor) X Y @ 3; X xor Y = (X or Y) and not (X and Y); |
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The ++
operator denotes concatenation, #
is the length or
size operator, and the subscript operator !
is used for
indexing. These operations work consistently on strings, lists and
tuples. For instance:
"abc"++"xy" ⇒ "abcxy" [a,b,c]++[x,y] ⇒ [a,b,c,x,y] (a,b,c)++(x,y) ⇒ (a,b,c,x,y) #"abc" ⇒ 3 #[a,b,c] ⇒ 3 #(a,b,c) ⇒ 3 "abc"!1 ⇒ "b" [a,b,c]!1 ⇒ b (a,b,c)!1 ⇒ b |
Indexing with the subscript operator starts at zero, so that X!0
and X!(#X-1)
denote the first and last member of a string, list
or tuple, respectively. Also note that since !
is
right-associative, double subscripts may have to be parenthesized. For
instance, compare (X!I)!J
and X!I!J
=X!(I!J)
.
All list and tuple operators check that their first argument is a
well-formed list or tuple value. However, the second argument of the
++
operator may in fact be any value. List concatenation just
replaces the empty sublist marking the end of its first list argument
with the second argument, as if it was defined by the following
equations:
[]++Ys = Ys; [X|Xs]++Ys = [X|Xs++Ys]; |
Hence we have that, e.g., []++1 ⇒ 1
and [1,2]++3
⇒ [1,2|3]
, which may look odd, but is consistent with the above
equations. Tuple concatenation works in an analogous manner.
All operators discussed in this section are also available for streams
(cf. Lists, Streams and Tuples). However, the corresponding
definitions are not built into the language but are provided in the
standard library script stream.q
, see Streams.
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The application operator `$' and the sequence operator `||'
have the lowest precedence among the prefix and infix operators. The
$
operator is right-associative and binds stronger than the
||
operator, which is left-associative.
The infix application operator provides an alternative way to denote
function application, i.e., F $ X ⇒ F X
. This operator
comes in handy when you want to pass around function application as a
function parameter. Since this operator has very low precedence and is
right-associative, it also provides a convenient way to write down
"cascading function applications," such as foo $ bar $ X+Y = foo
(bar (X+Y))
, which can save a lot of parentheses and make an expression
more readable.
The sequence operator lets you evaluate sequences of expressions in a given order. The value of the sequence is given by the rightmost expression. That is,
X || Y ⇒ Y. |
The sole purpose of this construct is to allow operations with side-effects (such as I/O functions) to be carried out sequentially. A typical example is
writes "Input: " || reads |
which prints a prompt on the terminal and then reads in a string. (The
built-in functions writes
and reads
are described in
Built-In Functions.)
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
As of version 7.1, Q also provides special syntax for two other kinds of
constructs which are commonly encountered in functional programs:
conditional expressions and lambda abstractions. A conditional
expression takes the form if X then Y else Z
where X
,
Y
and Z
are arbitrary expressions. It returns the value of
Y
if the value of X
is true
, and the value of
Z
if the value of X
is false
. The `else' part
can also be omitted, in which case ()
is returned if the value of
X
is false
. Like the built-in logical connectives
and then
and or else
, conditional expressions are special
forms which are evaluated in short-circuit mode; thus, if X
evaluates to false
, then Y
will never be evaluated.
Lambda abstractions can be denoted using the customary notation
\X.Y
where X
is an expression denoting a pattern to be
matched against the argument of the lambda function, and Y
is
another expression, the lambda body which is to be evaluated when the
lambda function is applied to an argument. Note that if the argument
pattern X
is not an atomic expression (i.e., not a basic
expression, list, stream or tuple) then it must be parenthesized; the
lambda body Y
never has to be parenthesized, however, because of
the low precedence of the lambda construct. Multi-argument lambdas can
be written using the notation \X1 X2 … . Y
, which is just a
shorthand for \X1 . \X2 . … . Y
. Lambdas are special forms;
neither the parameter patterns nor the body are evaluated, rather they
are "compiled" (at runtime) to a special kind of function object which
can be applied to the actual arguments in an efficient manner, see
Lambda Abstractions, for details.
Like the other operators, both if-then-else
and the \X.Y
construct are merely syntactic sugar for function applications. The
if-then-else
operator is actually implemented by the standard
library function ifelse
(or the when
function if the
`else' part is omitted), see Conditionals and Comprehensions. The \X.Y
construct translates to an application
of the built-in lambda
function, see Lambda Abstractions.
Note that the \X.Y
construct has the lowest possible precedence,
even lower than the sequencing operator `||', so it always has to
be parenthesized unless it forms a toplevel expression or the body of
another lambda construct. The if-then-else
operator binds
stronger than `||', but weaker than `$', and "dangling
else
" parts are assumed to belong to the most recent
if-then
. This makes imperative-style code like the following
behave as expected:
test = \X. writes "The value is " || if X > 0 then writes "positive.\n" else if X < 0 then writes "negative.\n" else writes "zero.\n"; |
Also note that if-then-else
needs to be parenthesized unless it
occurs as a toplevel or inside a lambda, sequence or another conditional
expression:
test = \X. writes $ "The value is " ++ (if X > 0 then "positive" else if X < 0 then "negative" else "zero") ++ ".\n"; |
[ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
As of version 6.2, the Q language also lets you you define your own prefix and infix operator symbols, using a declaration like the following (see also Declarations):
public (myop) X Y; |
This effectively turns myop
into a keyword which can be used as
an infix operator: X myop Y
. User-defined operators work exactly
like the built-in ones, in that you can turn them into a prefix function
by enclosing them in parentheses, as in (myop)
, and that you can
build sections from them which leave the left or right operand
unspecified, such as (X myop)
and (myop X)
.
By default, the new operator has the same precedence and associativity as the relational operators (which are at precedence level 2, and are non-associative). You can also explicitly specify a precedence level, like so:
public (myop) X Y @ 2; |
Alternatively, you may denote the precedence level by listing an existing operator symbol. The following is equivalent to the above declaration:
public (myop) X Y @ (<); |
As of Q 7.7, the operator symbol doesn't have to be an identifier, it can also be a sequence of punctuation characters as discussed in Operator Symbols. For instance:
public (===) X Y @ (=); |
As a real-world example, here is the definition of an xor
(exclusive-or) operator from Logical and Bit Operators. It uses
the third precedence level, which is occupied by the addition operators
(which also includes the built-in or
operator):
public (xor) X Y @ 3; X xor Y = (X or Y) and not (X and Y); |
In general, a quick glance at the operator table in the preceding
subsection reveals that valid precedence level numbers for the prefix
and infix operators go from 0 (lowest precedence, binary sequence
operator) to 9 (highest precedence, unary quotation operators), with 8
denoting the precedence of function application. (Note that there are no
numbered precedence levels for the if-then-else
and \X.Y
constructs because there is no way to declare such mixfix operators in
Q.) The complete list of built-in prefix and infix operators, cast as a
sequence of valid Q declarations, is shown below.
public (::||) X Y @0, (::$) X Y @1; public (::<) X Y @2, (::<=) X Y @2, (::>) X Y @2, (::>=) X Y @2, (::=) X Y @2, (::<>) X Y @2; public special (::==) X Y @2; public (::++) X Y @3, (::+) X Y @3, (::-) X Y @3, (::or) X Y @3; public special (::or else) ~X Y @3; public (::*) X Y @4, (::/) X Y @4, (::div) X Y @4, (::mod) X Y @4, (::and) X Y @4; public special (::and then) ~X Y @4; public (::#) X @5, (::not) X @5; public (::^) X Y @6, (::!) X Y @6; public (::.) X Y @7; public special const (::') X @9; public (::`) X @9, (::~) X @9, (::&) X @9; |
Note that precedence level 8 may not be used for user-defined operators. Other than that, all precedence levels are permitted, but the fixity (prefix/infix), the number of operands (unary/binary) and (unlike in ML and Haskell) also the associativity of operators (non-/left-/right-associative) are all determined by Q's system of built-in operators, as described in Built-In Operators.
While user-defined operators are often useful for extending Q's built-in operator system and to implement "domain-specific languages", the excessive use of obscure operator symbols, especially the use of multi-character punctuation looking like "line noise", may make your programs hard to read and is considered bad style.
As explained in Operator Symbols, some built-in operator symbols, such as `-', `=' and `.' are actually part of Q's syntax and need special treatment. These symbols may therefore not be redeclared as user-defined operators. The same applies to special symbols like `|', `..', `\', `@' and the keywords of the language. In general, you should also refrain from redeclaring other built-in operators since this will most likely be confusing for other programmers reading your code. Most of the time that you want to define, say, a `+' operation on a given data structure, extending the built-in `+' operator should work just fine; there's usually no need to declare a new `+' operator for that purpose.
You should also be aware that declaring new operator symbols actually amounts to changing the syntax of the Q language in small and sometimes subtle ways, both at compile- and runtime. The amount of havoc you can wreak on the syntax is somewhat limited by the fact that Q does not allow you to create your own precedence levels; all the available levels and the corresponding fixities of operators are prescribed by Q's expression syntax. Nevertheless, designing your own operator systems takes careful planning. In particular, since operator symbols are parsed using the maximal munch rule (see Operator Symbols), it is a bad idea to declare binary operator symbols like `--' or `<-' which end in a valid prefix of a unary operator (`-' in this case). Such ambiguities can break existing code and produce bugs which are hard to find. When in doubt, just use symbols from the extended Unicode character set; there are plenty of nice mathematical and other punctuation symbols in Unicode which are very useful for programming purposes.
[ << ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This document was generated by Albert Gräf on February, 23 2008 using texi2html 1.76.