6. Expressions

The notion of expressions is an intrinsic part of most, if not all, programming languages. However, in most programming languages expressions are merely treated as descriptions of computations to be performed in order to obtain the value of an expression, which usually belongs to some basic type provided by the language. In contrast, in the Q language expressions are objects in their own right which, as we shall see, are manipulated according to computational rules specified by the programmer.

As usual, expressions are built from certain kinds of basic objects. In the Q language, these are integers, floating point numbers, strings, function and variable symbols. These objects are combined into larger, compound expressions by means of applications, tuples, lists and streams. For convenience, the Q language also has prefix, infix and mixfix operator symbols for the most common operations; these are actually treated as "syntactic sugar" for applications. Syntactic sugar is also provided for list, tuple and stream enumerations and comprehensions. The syntax of all these entities is described by the following grammatical rules:

expression              : identifier
                        | 'var' unqualified-identifier
                        | variable-identifier ':' identifier
                        | number
                        | string
                        | expression expression
                        | unary-op expression
                        | expression binary-op expression
                        | 'if' expression 'then' expression
                          ['else' expression]
                        | '\' expression { expression } '.' expression
                        | '(' [element-list|enumeration|comprehension] ')'
                        | '[' [element-list|enumeration|comprehension] ']'
                        | '{' [element-list|enumeration|comprehension] '}'
                        | '(' op ')'
                        | '(' expression binary-op ')'
                        | '(' binary-op expression ')'

element-list            : expression-list [',' | ';' | '|' expression]

enumeration             : expression-list '..' [expression]

comprehension           : expression ':' expression-list

expression-list         : expression {',' expression}
                        | expression {';' expression}

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.1 Constants and Variables

The nonterminals identifier, op, unary-op, binary-op, number and string occuring in the syntax rules above refer to the lexical entities introduced in Lexical Matters. Note that the Q language actually distinguishes two different types of identifiers (function and variable identifiers), two different kinds of operator symbols (unary and binary) and two different types of numeric quantities (integers and floating point numbers). Integers are implemented as "bignums" using the GNU multiprecision package (GMP); thus the size of an integer value is only limited by available memory. Floating point values are implemented using 64 bit (i.e., double precision) floating point numbers; on most machines, these should provide nonzero absolute values ranging from 1.7E-308 to 1.7E308 and a precision of 15 decimal digits.

String constants are character sequences internally represented as character arrays permitting constant-time access to the individual characters of a string. There is no a priori length restriction for string constants. In the current implementation, the null character \0 is reserved as a string terminator and must not be used as an ordinary string character.

Furthermore, the Q language provides five other built-in constants which denote the truth values (true and false), and the empty list, stream and tuple ([], {} and (), to be discussed in Lists, Streams and Tuples).

A variable symbol can be "bound" or "free", depending on the context in which it occurs. We say that a variable is bound in an equation if it occurs on the left-hand side of the equation. Otherwise, the variable is a free variable. In this case, the variable may denote a value (introduced with def), or simply stands for itself. In any case, a variable is a legal atomic expression which may stand wherever a constant is allowed.

On the left-hand side of equations, it is possible to declare that a variable is of a given type by using the notation: variable:type. This requires that you have previously declared type as a type identifier, see Declarations. When a type identifier is specified with a variable, the variable will only match values belonging to the given type, cf. Type Guards.

On the right-hand side of equations, an unqualified identifier can also be prefixed with the keyword var to indicate that the identifier is to be treated as a free variable symbol. Such inline variable declarations can be used to escape free variables (if the same identifier is also introduced as a bound variable on the left-hand side of the equation) and to declare free variable symbols on the fly, see Local Variables, for details.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.2 Applications

Application is probably the most important single construct in the Q language. It allows you to apply one object (the "function") to another one (the "argument"). This construct is used to denote "function calls" such as sqrt 2 as well as "constructor terms" such as bin X T1 T2 which encode tree-like data structures. Also, as we will see in Built-In Operators, operators like +, -, etc. are merely syntactic sugar for applications.

As in other contemporary functional languages, application is a binary operation written simply as juxtaposition: X Y denotes the application of X to Y, both of which may be arbitrary expressions themselves. Application associates to the left; the construct X Y1 … Yn is equivalent to (…((X Y1) Y2) …) Yn, and denotes the application of X to n arguments Y1, …, Yn. This style of writing function applications is commonly referred to as currying, after the American logician H.B. Curry. We will have to say more about this shortly.

Here are some valid examples of applications:

sqrt 2
sqrt (Y+1)
foo X (bar (Y-1)) Z

Note that since application is left-associative, nested applications in arguments must be set off with parentheses. For instance, foo (bar X) applies foo to bar X, whereas foo bar X applies foo to two arguments bar and X.

Since currying is so ubiquitous in functional programming, you should be well familiar with it, so let us briefly explain what it is, and what it is good for.

Functions of multiple arguments can generally be defined in two different ways:

As a function taking a single, structured argument (usually a tuple). This is the traditional, "uncurried" method one commonly encounters in mathematics. For instance, we might define the function max, which computes the maximum of two values, as follows:
max (X,Y) = X if X>Y; = Y otherwise;
Recursively, as a function which, when given the first argument, yields another function of the remaining arguments. Such functions are called curried. For instance, here is a curried definition for the max function:
max X Y = X if X>Y; = Y otherwise;
Here max X Y is to be read as (max X) Y, meaning that by applying max to the first argument X, we obtain another function max X, which, when applied to the second argument Y, yields the maximum of X and Y. (Incidentally, this curried form is also the way that the max function is actually defined in the standard library. In fact, most built-in and standard library functions use the curried form as it is more flexible.)

The Q language supports both kinds of notations. Choosing between the two is more than just a matter of taste. Besides saving parentheses, curried functions provide a cheap way to derive new functions through "partial applications" of a multi-argument function. For instance, given the curried definition of max from above, max 0 can be used to denote the function computing the "nonnegative part" of its argument (which is the argument itself if it is nonnegative, and zero otherwise). This does not work with the uncurried definition since that definition requires us to specify both arguments of max in one chunk; instead, we would have to start defining the derived function from scratch.

Uncurried definitions also have their merits. In particular, they allow us to define variadic functions, i.e., functions which can handle a varying number of arguments. For instance, the following definition enables us to apply the max function to both pairs and triples:

max (X,Y)               = X if X>=Y;
                        = Y otherwise;
max (X,Y,Z)             = max (X,max (Y,Z))

In fact, in the Q language it is possible to define generic functions which apply to any number of arguments specified as a tuple. For instance, the following version of max handles tuples of any size at least 2:

max (X,Y)               = X if X>=Y;
                        = Y otherwise;
max (X,Y|Zs)            = max (X,max (Y|Zs));

(In the above example, the notation (…|Zs) is used to denote a tuple with "tail" Zs. This is explained in the following section.)

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.3 Lists, Streams and Tuples

The Q language provides three closely related general constructs for representing sequences of objects: lists, streams and tuples.

We start out with a discussion of the list data structure. The constant [] denotes the empty list. In general, a list consisting of elements X1, X2, …, Xn is denoted [X1,X2,…,Xn]. For instance, [a,b,c] consists of three elements (symbols) a, b and c. In contrast to statically typed languages like Haskell and ML, list members may have different types; e.g., [a,10,"a string"] is a perfectly well-formed list consisting of a symbol, a number and a string. It is also possible to have nested lists, as in [a,[b,c]] which consists of two elements, the symbol a and the list [b,c].

You can also have a trailing comma in a list, as in [a,b,c,]. The trailing comma is ignored, so the above is exactly the same as [a,b,c]. This makes it possible to format a list as follows, which lets you add new items at the end more easily:

def ITEMS = [
              "This is the first item",
              "This is the second item",
              …
              "This is the last item",
            ];

As in Prolog, lists are actually represented in a right-recursive fashion using a binary constructor [|] which takes as its arguments the head element of the list and the list of the remaining elements, called the tail of the list. Thus, e.g., [a,b,c] is simply a shorthand notation for [a|[b,c]], which denotes a list with head a and tail [b,c] (which is in turn a shorthand for [b|[c]], etc.). You can also mix both styles of notation; for instance, [a,b|[c,d]] is another way to represent the 4-element list [a,b,c,d].

Note that [a|[b,c]] is different from [a,[b,c]]: the former denotes a three-element list, while the latter is a two-element list whose second member happens to be a list itself. Also note that the [|] constructor can in fact be applied to any pair of values (the second value does not have to be a list); e.g., [a|b] is a perfectly well-formed expression (although the built-in length, indexing and concatenation operations described in String, List and Tuple Operators, will fail on such values).

Q also provides so-called enumerations to construct lists from a range of values. These usually take the form [X1,X2,…,Xn..Y] which is just syntactic sugar for the application enum [X1,X2,…,Xn] Y involving the built-in enum function, see Enumeration Types, for details. Syntactic sugar is also provided for list comprehensions, which are discussed in Conditionals and Comprehensions. Analogous constructs are also provided for streams and tuples.

Streams look exactly like lists except that they are written with curly braces. The difference between lists and streams is that the head and tail of a stream are not evaluated until their values are actually required during the course of a computation. For instance, you will find that if you enter an expression like {1+1,2+2,3+3} in the interpreter, then the value of that expression will be just the expression itself, because the embedded applications of the `+' operator are not evaluated:

==> {1+1,2+2,3+3}
{1+1,2+2,3+3}

In contrast, if you enter the same sequence as a list, its elements will be evaluated immediately:

==> [1+1,2+2,3+3]
[2,4,6]

A stream element will be evaluated automatically when it is accessed, e.g., using the index operator:

==> {1+1,2+2,3+3}!1
4

This is also known as lazy or non-strict evaluation: the evaluation of the stream elements is deferred until they are actually needed. In fact, the entire tail of a stream may be deferred, which makes it possible to create infinite sequences whose members are produced "on the fly" while traversing the sequence.

Streams are implemented in terms of so-called "special forms" which will be introduced in Special Forms. We therefore postpone a closer description of this data structure until Special Constructors and Streams. Also note that the Q language itself only defines the stream constructors; most other stream functions are actually implemented by the standard library script stream.q, see Streams.

Tuples also work in much the same fashion as lists: The constant () denotes the empty tuple, and a tuple consisting of elements X1, X2, …, Xn is written as (X1,X2,…,Xn), which is equivalent to (X1|(X2|…|(Xn|()))), where the notation (X|Xs) denotes a tuple consisting of a first element X and the tuple Xs of the remaining elements. A trailing comma may follow the last tuple element, so that (a,b,c,) is the same as (a,b,c). The trailing comma is optional in all cases except one: As in the Python programming language, 1-tuples are indicated with a trailing comma, to distinguish them from ordinary parenthesized expressions. Hence (a,) denotes the 1-tuple with the single member a, while (a) just denotes a itself.

Note that in contrast to languages like ML and Haskell, tuples are not actually needed to represent sequences of values with different types, as Q's list data structure already allows you to do that. Also, Q's tuples are a lot more flexible than in ML and Haskell, as they can be handled mostly like list values. The big difference between lists and tuples in the Q language is that, while lists are always represented as recursive data objects using a binary constructor symbol (just the way that they are written), tuples are actually implemented as "vectors" which are stored as contiguous sequences in memory. (This only works for "well-formed" tuples. If the tail Xs of a tuple (X|Xs) is not a tuple, then this value can only be represented using an explicit application of the tuple constructor.) Therefore tuples normally use up much less memory than lists of the same size, and they also allow constant-time access to their members. The size of a tuple can be determined in constant time as well. In contrast, the same operations, when applied to a list, require time proportional to the size of the list. On the other hand, lists are more efficient when accessing the tail of a list and when a new element is prepended to a list, which can both be done in constant time. Here a tuple needs time proportional to its size, since the member sequence of the original tuple must be copied when accessing its tail or when prepending an element.

These tradeoffs should be carefully considered when deciding whether to implement a given sequence as a list or a tuple. Tuples are usually the best choice for implementing fixed sequences requiring fast random access to its individual members, whereas lists provide the most efficient representation if the elements are traversed and manipulated sequentially. If necessary, you can easily convert between these two representations using the built-in functions list and tuple, see Conversion Functions.

Lists, streams and tuples are frequently combined to create nested structures like a tuple of lists and/or streams, a list (or stream) of lists, or a list (or stream, or tuple) of tuples. The latter construct (sequences of tuples) is so common that Q provides special support for it in the form of the following "grouping" syntax. The notation [X1, X2, …; Y1, Y2, …; …] is a shorthand for [(X1, X2, …), (Y1, Y2, …), …]. Note that there must be at least one element in each group (but the number of elements in different groups may vary), and there must be at least one group (a single group is denoted with a trailing `;'). Thus, for instance, [A,B;C,D;X,Y,Z] is the same as [(A,B),(C,D),(X,Y,Z)], and [X,Y,Z;] is the same as [(X,Y,Z)]. The same notation also applies to streams and tuples.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4 Built-In Operators

Besides the forms of expressions already discussed above, the Q language also provides a collection of prefix, infix and mixfix operator symbols for common arithmetic, logical, relational and other operations. A complete list of these symbols is given below, in order of decreasing precedence:

' ` ~ &: quotation operators (unary)
.: function composition
^ !: exponentiation and subscript
- # not: prefix operators (unary)
* / div mod and and-then: multiplication operators
++ + - or or-else: addition operators
< > = <= >= <> ==: relational operators
$: infix application operator
if-then-else: conditional expressions
||: sequence operator
\X.Y: lambda abstractions

Most of these symbols have their usual meaning; a closer description follows below. All binary operators are left-associative, with the exception of ^, ! and $ which associate to the right, and the relational operators which are nonassociative. Application takes precedence over all these operations except the quotation operators; hence sqrt X^3 denotes (sqrt X)^3 and not sqrt (X^3). The quotation operators have the highest possible precedence, and hence bind stronger than even applications. Parentheses are used to group expressions and overwrite default precedences and associativity as usual. C programmers will also note that the logical operators have the same "wrong" precedence as in Pascal. Thus you should make sure that you always parenthesize relational expressions when combining them with logical connectives.

You should also note that unary minus must be parenthesized when appearing in an argument of a function application. For instance, although foo X -Y is a perfectly well-formed expression, it denotes (foo X) - Y and not foo X (-Y) as you might have intended by the spacing which is however ignored by the Q compiler. As already noted in Lexical Matters, unary minus in front of a number is special; it causes values such as -2 to be interpreted as negative numbers rather than denoting an explicit application of the unary minus operator (an explicit application of unary minus can be denoted using the built-in minus symbol; see below). The rules for parenthesizing negative numbers are the same as those for unary minus; you have to write foo X (-2) instead of foo X -2 (which denotes (foo X) - 2).

In the Q language, expressions involving prefix and infix operators are merely syntactic sugar for applications. By enclosing such an operator in parentheses, you can turn it into an ordinary prefix function. For instance, X+Y is exactly the same as (+) X Y, and not X actually denotes the application (not) X. The built-in operators simply provide a more convenient way for applying these operations to their arguments, which resembles common mathematical notation. Moreover, you can also turn a binary operator into a unary function by specifying either the left or the right operand. E.g., (1/) denotes the reciprocal and (*2) the doubling function. Such constructs are commonly called operator sections. Inside a section, the usual precedence and associativity rules apply. For instance, the X+3 subterm in (*(X+3)) must be parenthesized because * has a higher precedence than +, and hence the partial expression (*X+3) is not well-formed.

The - operator plays a somewhat awkward role in the syntax, since it is used to denote both unary and binary minus. Q adopts the convention that the notation (-) always denotes binary minus; unary minus may be denoted by the built-in minus function. Thus the expression minus X applies unary minus to X. Note that this construct always denotes an explicit application of the unary minus operation. For instance, minus 5 denotes the application of unary minus to the integer 5, while -5 is a negative integer. Also note that the construct (-X) is not an operator section, but a parenthesized expression involving unary minus. The easiest way to construct an operator section which subtracts a given value from its argument is to formulate the function using the addition operator as in (+(-X)).

Another special case are the mixfix operators if-then-else (conditional expression) and \X.Y (lambda abstraction). These are just syntactic sugar for the standard library functions ifelse/when and the built-in function lambda, so no special syntax is needed to turn them into prefix functions.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.1 Quotation Operators

The ' (quote), ` (backquote), ~ (tilde) and & (ampersand) operators are used to deal with so-called special forms. The quote operator quotes an expression as a literal; it is a constructor symbol and hence becomes part of the quoted expression. The backquote and tilde operators are used to "splice" and "force" subterms in an expression. The ampersand operator causes "special arguments" to be memoized. We postpone a discussion of these operations until Special Forms.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.2 Function Composition

The `.' operator denotes function composition: (F.G) X = F (G X). For instance, to double a value and then add 1 to it, you can use an expression like ((+1).(*2)) X. Note the parentheses around the composition - as application binds stronger than composition, F.G X is interpreted as F.(G X) rather than (F.G) X. If a composition involves applications with integer arguments, extra whitespace may be needed to distinguish between the `.' operator and the decimal point; e.g., F 5 . G denotes a composition of the functions F 5 and G, whereas F 5.G is an application of the function F to two arguments, the floating point value 5. and the function G. Also note that the composition operator is associative, since (F.G.H) X = ((F.G).H) X = F (G (H X)) = (F.(G.H)) X.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.3 Arithmetic Operators

The operators +, -, *, / denote the sum, the difference, the product and the quotient of two numeric values, respectively. As already noted, - is also used as unary minus. The operators div and mod denote integer quotient and modulo, respectively; they only apply to integers. The ^ operator denotes exponentiation, as defined by X^Y = exp (ln X*Y); it always returns a floating point value. (If X<0 then X^Y is defined only if Y is an integer. 0^0 is left undefined as well, so if you would like to have that 0^0 = 1 then you must add corresponding equations yourself. Also note that the complex.q standard library module extends the built-in definition of the exponentiation operator to handle the case that X<0 with general exponent; see Complex Numbers.)

The argument and corresponding result types of these operations are summarized in the following table (Int denotes integer, Float floating point, and Real integer or floating point values):

+ - *: Int Int -> Int
Int Float -> Float
Float Int -> Float
Float Float -> Float
/ ^: Real Real -> Float
div mod: Int Int -> Int
- (unary): Int -> Int
Float -> Float

Note that, as of Q 7.2, all floating point operations properly deal with IEEE floating point infinities and NaN ("not a number") values. Consequently, division by zero now yields one of the special values inf, -inf or nan, depending on the value of the first operand. (Of course, this will only work as described on systems with a proper IEEE floating point implementation. The interpreter makes no attempt to emulate IEEE floating point values on systems which do not provide them natively.)

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.4 Relational Operators

The operators <, >, =, <=, >=, <> are binary predicates meaning "less", "greater", "equal", "less or equal", "greater or equal" and "not equal", respectively. The built-in definition of these operations only applies to numbers, strings and truth values. All relational operators return truth values (true, false) as results. Strings are compared lexicographically, on the basis of the character encoding (which, as of Q 7.0, usually is Unicode). Truth values are ordered by false < true.

If you would like to compare other types of values than the basic objects mentioned above, normally you will have to provide suitable definitions yourself. For instance, you might wish to extend the equality operation to other built-in and user-defined data structures such as lists, trees, etc., by "overloading" the = operator accordingly. The following equations implement an equality predicate on lists (the parentheses on the left-hand side are necessary to prevent the equality operation to be interpreted as the equal sign separating left-hand and right-hand side of an equation):

([] = [])               = true;
([] = [Y|Ys])           = false;
([X|Xs] = [])           = false;
([X|Xs] = [Y|Ys])       = (X=Y) and then (Xs=Ys);

Rules for other comparison operators (<>, <=, etc.) could be added in the same fashion. Actually, the standard library provides corresponding definitions; see The Standard Library.

A special case arises with types consisting of only nullary constructor symbols declared with const, so-called enumeration types, see Types. The values of such a type can always be compared with the relational operators, using the order in which they are listed in the type declaration. (A special case of this is the order of the built-in truth values.) For instance, assume that the Day type is declared as follows:

type Day = const sun, mon, tue, wed, thu, fri, sat;

Then the listed constants will be ordered as sun < mon < … < sat.

Besides the equality predicate, the Q language also provides a notion of "syntactic" equality, implemented by the `==' operator, which applies to all kinds of expressions; see Non-Linear Equations, for details.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.5 Logical and Bit Operators

The logical operations not, and, or denote logical negation, conjunction and disjunction, respectively. These operators take truth values as their arguments. They are defined in a straightforward manner:

not true                = false;
not false               = true;

true and true           = true;
true and false          = false;
false and true          = false;
false and false         = false;

true or true            = true;
true or false           = true;
false or true           = true;
false or false          = false;

Like most other programming languages, Q also has logical connectives for the short-circuit evaluation of logical expressions, which are denoted by the operators and then and or else. These operations are actually implemented as "special forms" which evaluate their arguments from left to right only as far as required to determine the value of the expression (cf. Special Forms). They are defined by the following built-in equations:

true and then X         = X;
false and then X        = false;

false or else X         = X;
true or else X          = true;

The first operand is always evaluated. Depending on its value, the second operand may not be evaluated at all. For instance, if X evaluates to false, then X and then Y immediately reduces to false, without ever having to evaluate the second argument Y. On the other hand, if X evaluates to true, then Y is evaluated and returned as the value of the expression. The or else operation works analogously.

One reason for using short-circuit evaluation is efficiency: prevent unnecessary computations when an initial part of a logical expression already determines the value of the entire expression. Furthermore, short-circuit evaluation is sometimes essential to check a condition before evaluating an expression depending on the validity of this condition. For instance:

(X <> 0) and then (foo (1/X) > 0)

You should also note that, according to the above definitions, X and then Y and X or else Y are always reduced if X is a truth value, even if the second operand Y does not evaluate to a truth value. This may look a bit weird at first, but it is in fact the most reasonable way to implement short-circuit evaluation in the Q language, since Q uses dynamic typing, and hence the type of an expression is only known after it has been evaluated. In fact, this "feature" can be quite useful at times. For instance, you can also use and then to write down a simple kind of conditional expression like

check X and then bar X

where bar X is the value you wish to compute, but only if the condition check X holds.

The Q interpreter also uses the operators not, and and or to denote bitwise logical operations on integers. Thus, not X denotes the one's complement of an integer X, and X and Y and X or Y the bitwise logical conjunction and disjunction of integers X and Y, respectively. These operations behave as if negative integers were represented in two's complement (although GMP actually uses a sign-magnitude representation). This means that for each integer X, -X = not X + 1, or, equivalently, not X = -X-1. For instance:

==> 17 and not 13
16

==> 17 or not 13
-13

==> not _
12

These results become clear when considering that 17 has bits 0 (=1) and 4 (=16) on (and all other bits off), while not 13 has bits 0 (=1), 2 (=4) and 3 (=8) off (and all other bits on). Thus, writing these numbers as unsigned bit sequences with most significant bits first, where ...1 denotes an infinite sequence of leading 1's, we have:

17 and not 13 =
10001 and not 1101 = 10001 and ...10010 = 10000
= 16

17 or not 13 =
10001 or not 1101 = 10001 or ...10010 = ...10011 (= not 1100 = not 12)
= -13

Together with the built-in bit shift operations, the bitwise logical operations can also be used to implement "bitsets", i.e., sets of nonnegative integers represented by bit patterns. See Arithmetic Functions, for details.

In case you are wondering about "exclusive or:" While this operator is not provided as a builtin, you can easily define it yourself as follows:

public (xor) X Y @ 3;
X xor Y                 = (X or Y) and not (X and Y);

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.6 String, List and Tuple Operators

The ++ operator denotes concatenation, # is the length or size operator, and the subscript operator ! is used for indexing. These operations work consistently on strings, lists and tuples. For instance:

"abc"++"xy"             ⇒ "abcxy"
[a,b,c]++[x,y]          ⇒ [a,b,c,x,y]
(a,b,c)++(x,y)          ⇒ (a,b,c,x,y)

#"abc"                  ⇒ 3
#[a,b,c]                ⇒ 3
#(a,b,c)                ⇒ 3

"abc"!1                 ⇒ "b"
[a,b,c]!1               ⇒ b
(a,b,c)!1               ⇒ b

Indexing with the subscript operator starts at zero, so that X!0 and X!(#X-1) denote the first and last member of a string, list or tuple, respectively. Also note that since ! is right-associative, double subscripts may have to be parenthesized. For instance, compare (X!I)!J and X!I!J=X!(I!J).

All list and tuple operators check that their first argument is a well-formed list or tuple value. However, the second argument of the ++ operator may in fact be any value. List concatenation just replaces the empty sublist marking the end of its first list argument with the second argument, as if it was defined by the following equations:

[]++Ys                  = Ys;
[X|Xs]++Ys              = [X|Xs++Ys];

Hence we have that, e.g., []++1 ⇒ 1 and [1,2]++3 ⇒ [1,2|3], which may look odd, but is consistent with the above equations. Tuple concatenation works in an analogous manner.

All operators discussed in this section are also available for streams (cf. Lists, Streams and Tuples). However, the corresponding definitions are not built into the language but are provided in the standard library script stream.q, see Streams.

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.7 Application and Sequence Operators

The application operator `$' and the sequence operator `||' have the lowest precedence among the prefix and infix operators. The $ operator is right-associative and binds stronger than the || operator, which is left-associative.

The infix application operator provides an alternative way to denote function application, i.e., F $ X ⇒ F X. This operator comes in handy when you want to pass around function application as a function parameter. Since this operator has very low precedence and is right-associative, it also provides a convenient way to write down "cascading function applications," such as foo $ bar $ X+Y = foo (bar (X+Y)), which can save a lot of parentheses and make an expression more readable.

The sequence operator lets you evaluate sequences of expressions in a given order. The value of the sequence is given by the rightmost expression. That is,

X || Y                  ⇒ Y.

The sole purpose of this construct is to allow operations with side-effects (such as I/O functions) to be carried out sequentially. A typical example is

writes "Input: " || reads

which prints a prompt on the terminal and then reads in a string. (The built-in functions writes and reads are described in Built-In Functions.)

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.4.8 Conditional Expressions and Lambdas

As of version 7.1, Q also provides special syntax for two other kinds of constructs which are commonly encountered in functional programs: conditional expressions and lambda abstractions. A conditional expression takes the form if X then Y else Z where X, Y and Z are arbitrary expressions. It returns the value of Y if the value of X is true, and the value of Z if the value of X is false. The `else' part can also be omitted, in which case () is returned if the value of X is false. Like the built-in logical connectives and then and or else, conditional expressions are special forms which are evaluated in short-circuit mode; thus, if X evaluates to false, then Y will never be evaluated.

Lambda abstractions can be denoted using the customary notation \X.Y where X is an expression denoting a pattern to be matched against the argument of the lambda function, and Y is another expression, the lambda body which is to be evaluated when the lambda function is applied to an argument. Note that if the argument pattern X is not an atomic expression (i.e., not a basic expression, list, stream or tuple) then it must be parenthesized; the lambda body Y never has to be parenthesized, however, because of the low precedence of the lambda construct. Multi-argument lambdas can be written using the notation \X1 X2 … . Y, which is just a shorthand for \X1 . \X2 . … . Y. Lambdas are special forms; neither the parameter patterns nor the body are evaluated, rather they are "compiled" (at runtime) to a special kind of function object which can be applied to the actual arguments in an efficient manner, see Lambda Abstractions, for details.

Like the other operators, both if-then-else and the \X.Y construct are merely syntactic sugar for function applications. The if-then-else operator is actually implemented by the standard library function ifelse (or the when function if the `else' part is omitted), see Conditionals and Comprehensions. The \X.Y construct translates to an application of the built-in lambda function, see Lambda Abstractions.

Note that the \X.Y construct has the lowest possible precedence, even lower than the sequencing operator `||', so it always has to be parenthesized unless it forms a toplevel expression or the body of another lambda construct. The if-then-else operator binds stronger than `||', but weaker than `$', and "dangling else" parts are assumed to belong to the most recent if-then. This makes imperative-style code like the following behave as expected:

test = \X.
       writes "The value is " ||
       if X > 0 then
         writes "positive.\n"
       else if X < 0 then
         writes "negative.\n"
       else
         writes "zero.\n";

Also note that if-then-else needs to be parenthesized unless it occurs as a toplevel or inside a lambda, sequence or another conditional expression:

test = \X. writes $ "The value is " ++
       (if X > 0 then
          "positive"
        else if X < 0 then
          "negative"
        else
          "zero") ++ ".\n";

[ < ]

[ > ]

[ << ]

[ Up ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

6.5 User-Defined Operators

As of version 6.2, the Q language also lets you you define your own prefix and infix operator symbols, using a declaration like the following (see also Declarations):

public (myop) X Y;

This effectively turns myop into a keyword which can be used as an infix operator: X myop Y. User-defined operators work exactly like the built-in ones, in that you can turn them into a prefix function by enclosing them in parentheses, as in (myop), and that you can build sections from them which leave the left or right operand unspecified, such as (X myop) and (myop X).

By default, the new operator has the same precedence and associativity as the relational operators (which are at precedence level 2, and are non-associative). You can also explicitly specify a precedence level, like so:

public (myop) X Y @ 2;

Alternatively, you may denote the precedence level by listing an existing operator symbol. The following is equivalent to the above declaration:

public (myop) X Y @ (<);

As of Q 7.7, the operator symbol doesn't have to be an identifier, it can also be a sequence of punctuation characters as discussed in Operator Symbols. For instance:

public (===) X Y @ (=);

As a real-world example, here is the definition of an xor (exclusive-or) operator from Logical and Bit Operators. It uses the third precedence level, which is occupied by the addition operators (which also includes the built-in or operator):

public (xor) X Y @ 3;
X xor Y                 = (X or Y) and not (X and Y);

In general, a quick glance at the operator table in the preceding subsection reveals that valid precedence level numbers for the prefix and infix operators go from 0 (lowest precedence, binary sequence operator) to 9 (highest precedence, unary quotation operators), with 8 denoting the precedence of function application. (Note that there are no numbered precedence levels for the if-then-else and \X.Y constructs because there is no way to declare such mixfix operators in Q.) The complete list of built-in prefix and infix operators, cast as a sequence of valid Q declarations, is shown below.

public (::||) X Y @0, (::$) X Y @1;
public (::<) X Y @2, (::<=) X Y @2, (::>) X Y @2, (::>=) X Y @2,
  (::=) X Y @2, (::<>) X Y @2;
public special (::==) X Y @2;
public (::++) X Y @3, (::+) X Y @3, (::-) X Y @3,
  (::or) X Y @3;
public special (::or else) ~X Y @3;
public (::*) X Y @4, (::/) X Y @4, (::div) X Y @4, (::mod) X Y @4,
  (::and) X Y @4;
public special (::and then) ~X Y @4;
public (::#) X @5, (::not) X @5;
public (::^) X Y @6, (::!) X Y @6;
public (::.) X Y @7;
public special const (::') X @9;
public (::`) X @9, (::~) X @9, (::&) X @9;

Note that precedence level 8 may not be used for user-defined operators. Other than that, all precedence levels are permitted, but the fixity (prefix/infix), the number of operands (unary/binary) and (unlike in ML and Haskell) also the associativity of operators (non-/left-/right-associative) are all determined by Q's system of built-in operators, as described in Built-In Operators.

Caveats

While user-defined operators are often useful for extending Q's built-in operator system and to implement "domain-specific languages", the excessive use of obscure operator symbols, especially the use of multi-character punctuation looking like "line noise", may make your programs hard to read and is considered bad style.

As explained in Operator Symbols, some built-in operator symbols, such as `-', `=' and `.' are actually part of Q's syntax and need special treatment. These symbols may therefore not be redeclared as user-defined operators. The same applies to special symbols like `|', `..', `\', `@' and the keywords of the language. In general, you should also refrain from redeclaring other built-in operators since this will most likely be confusing for other programmers reading your code. Most of the time that you want to define, say, a `+' operation on a given data structure, extending the built-in `+' operator should work just fine; there's usually no need to declare a new `+' operator for that purpose.

You should also be aware that declaring new operator symbols actually amounts to changing the syntax of the Q language in small and sometimes subtle ways, both at compile- and runtime. The amount of havoc you can wreak on the syntax is somewhat limited by the fact that Q does not allow you to create your own precedence levels; all the available levels and the corresponding fixities of operators are prescribed by Q's expression syntax. Nevertheless, designing your own operator systems takes careful planning. In particular, since operator symbols are parsed using the maximal munch rule (see Operator Symbols), it is a bad idea to declare binary operator symbols like `--' or `<-' which end in a valid prefix of a unary operator (`-' in this case). Such ambiguities can break existing code and produce bugs which are hard to find. When in doubt, just use symbols from the extended Unicode character set; there are plenty of nice mathematical and other punctuation symbols in Unicode which are very useful for programming purposes.

[ << ]

[ >> ]

[Top]

[Contents]

[Index]

[ ? ]

This document was generated by Albert Gräf on February, 23 2008 using texi2html 1.76.