[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8. Types

Q uses dynamic typing like, e.g., Lisp or Smalltalk, as opposed to statically typed languages such as Pascal, Eiffel or Haskell. In languages with static typing, a variable or function parameter can only hold a value which matches its prescribed type (which can be a "polymorphic" type in languages like Eiffel and Haskell, but still the type of the value is restricted). In dynamically typed languages, however, the actual type of a variable or function parameter value is not known in advance. Consequently, in Q it is only possible to distinguish different types of objects - such as search trees, queues, arrays and the like - by selecting an appropriate set of constructor symbols for each type of object. This chapter discusses Q's notion of type guards which allows you to make the assignment of constructors to a type explicit, and to use this information in order to restrict the applicability of equations to objects of a given type.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8.1 Using Type Guards

As in any programming language, the careful design of application-dependent data structures is one of the main concerns when developing Q scripts which go beyond simple numeric, string and list processing. As a typical example for a non-list data structure which plays a prominent role in many Q scripts, let us consider binary search trees, which are a convenient means to implement sets, bags, dictionaries and the like. We will use this data structure as a running example throughout this chapter.

A typical choice of constructors for binary trees is the following:

 
public const nil, bin X T1 T2;

To make explicit the fact that nil and bin belong to the binary tree type, instead of the above declaration we use a type declaration like the following, which, besides declaring the constructor symbols, also introduces the type symbol BinTree:

 
public type BinTree = const nil, bin X T1 T2;

This declaration tells the interpreter that each expression of the form nil or bin X T1 T2 should be considered as a member of the BinTree type. This is also known as an algebraic type; the type declaration is essentially a signature of constructor symbols which are used to create the members of the type. The type symbol can then be used as a guard on variables on the left-hand side of equations and variable definitions to ensure that a variable is only matched to objects of the given type, see Type Guards. E.g., the following rule employs such a type guard in order to restrict the argument of the foo function to BinTree objects:

 
foo T:BinTree           = ...;

This makes it possible to avoid explicit argument patterns like

 
foo nil                 = ...;
foo (bin X T1 T2)       = ...;

in cases in which the component values are not actually required. This can simplify matters a lot, in particular if multiple arguments have to be matched to a given type. Also, it is more efficient than checking the type of an object in the qualifier part of a rule by using a user-defined predicate, since the interpreter can use the type information right away in the pattern matching process.

Another important reason for preferring type guards over explicit argument patterns is the issue of "information hiding". With the former definition of the foo function above we do not make any explicit reference to the constructor patterns making up the BinTree data type. This makes it possible to treat BinTree as an abstract data type (ADT) which hides the underlying implementation details (in particular the constructors), while still being able to verify that the proper kind of object is supplied as an argument. Any access to objects of the ADT will then be implemented by referring to the appropriate operations supplied by the type. In fact, we can make the constructors private symbols which are only accessible to the script which implements the BinTree type:

 
public type BinTree = private const nil, bin X T1 T2;

As a concrete example, let us assume the standard search tree operations insert T X and delete T X, which insert an element X into a tree T, and remove it from the tree, respectively. These operations can be implemented as follows (see [Bird/Wadler 1988]):

 
public insert T X, delete T X;
private join T1 T2, init T, last T;

insert nil Y            = bin Y nil nil;
insert (bin X T1 T2) Y  = bin X (insert T1 Y) T2 if X>Y;
                        = bin X T1 (insert T2 Y) if X<Y;
                        = bin Y T1 T2 if X=Y;

delete nil Y            = nil;
delete (bin X T1 T2) Y  = bin X (delete T1 Y) T2 if X>Y;
                        = bin X T1 (delete T2 Y) if X<Y;
                        = join T1 T2 if X=Y;
                        
join nil T2             = T2;
join T1 T2              = bin (last T1) (init T1) T2 otherwise;

init (bin X T1 nil)     = T1;
init (bin X T1 T2)      = bin X T1 (init T2) otherwise;

last (bin X T1 nil)     = X;
last (bin X T1 T2)      = last T2 otherwise;

(Note that the last and init operations return the last element of a binary tree, and a binary tree with the last element removed, respectively. The join, init and last functions are for internal use only, and can hence be implemented as private functions.)

Furthermore, we assume the following function bintree which constructs a binary tree from a list, and the function members which returns the list of elements stored in a tree (in ascending order):

 
public bintree Xs;
bintree Xs:List         = foldl insert nil Xs;

public members T;
members nil             = [];
members (bin X T1 T2)   = members T1 ++ [X|members T2];

(The definition of bintree employs the standard foldl operation, see The Standard Library.) We can use the interface operations of the BinTree ADT in order to implement the functions union and diff which add and remove all members of a tree to/from another tree:

 
public union T1 T2; // add elements of T2 to T1
union T1:BinTree T2:BinTree
                        = foldl insert T1 (members T2);

public diff T1 T2; // remove elements of T2 from T1
diff T1:BinTree T2:BinTree
                        = foldl delete T1 (members T2);

Observe that no explicit reference is made to the BinTree constructors; we only employ the public "interface" functions insert, delete and members of the BinTree ADT. This allows us to change the realization of the data structure without having to rewrite the definitions of union and diff. Still, the definitions of union and diff are "safe"; the BinTree type guard ensures that the arguments passed to union and diff are indeed BinTree objects capable of carrying out the requested operations.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8.2 Built-In Types

Type guards are also the only way for verifying that the actual value of a variable belongs to one of the built-in types of integers, floating point numbers, strings, files and lambda functions, since there is no way for writing out all "constructors" for these kinds of objects - there are infinitely many (at least in theory). For this purpose, the type symbols Int, Float, String, File and Function are predefined. There also is a type named Char which denotes the single-character strings. Technically, Char is a subtype of the String type; more about that in Sub- and Supertypes. Moreover, Char is also treated as an enumeration type, see Enumeration Types, below.

Beginning with version 7.2, Q also offers a more elaborate "numeric tower" which makes it easier to integrate user-defined number types. There is a type Real which is a subtype of the Num type and the supertype of both Int and Float. Moreover, the standard library defines the complex number type Complex which is a subtype of Num, as well as the rational number type Rational which is a subtype of Real, see The Standard Library.

The built-in List type matches all list expressions of the form [] or [X|Xs]. This type is used to restrict the applicability of an equation to list arguments. For instance, the following equations define a function reverse which reverses a list:

 
reverse Xs:List         = foldl push [] Xs;
push Xs:List X          = [X|Xs];

The Stream and Tuple types are completely analogous: they match streams and tuples of arbitrary sizes, i.e., expressions of the form {} and {X|Xs} or () and (X|Xs), respectively.

The predefined Bool type is a means to refer to objects which are truth values. Its built-in definition is as follows:

 
public type Bool = const false, true;

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8.3 Enumeration Types

Types like the built-in Bool type, which only consist of nullary const symbols, are also called enumeration types. You can easily define such types yourself, e.g.:

 
type Day = const sun, mon, tue, wed, thu, fri, sat;

The Q language provides special support for enumeration types (including the built-in Bool type, and also the Char type), by means of the following operations:

Note that while there is no direct operation for converting ordinal numbers back to the corresponding members of a given type, you can easily accomplish this using arithmetic:

 
==> ord thu
4

==> sun+4
thu

Enumeration type arithmetic also provides a quick way to check whether two enumeration type members belong to the same type:

 
==> isint (tue-thu)
true

Here we employed the standard library function isint to determine whether the ordinal difference of the two values is actually an integer. This works because the `-' operator will fail if the two constants belong to different enumeration types.

Using "enumerations" (which are described in detail below), you can also list all enumeration type members starting at a given value as follows:

 
==> [sun..]
[sun,mon,tue,wed,thu,fri,sat]

Let us now take a more in-depth look at enumerations. The central operation to create an enumeration is the built-in enum function, which lists all members of an enumeration type in a given range:

 
==> enum mon fri
[mon,tue,wed,thu,fri]

The enum_from function works like enum, but only takes a single argument and lists all members of the type starting at the given value:

 
==> enum_from mon
[mon,tue,wed,thu,fri,sat]

Note that the default "step size" is 1. An arbitrary step size can be indicated by invoking enum or enum_from with two initial members in the first argument as follows:

 
==> enum [sun,tue] sat
[sun,tue,thu,sat]

==> enum [sat,fri] sun
[sat,fri,thu,wed,tue,mon,sun]

==> enum_from [sun,tue]
[sun,tue,thu,sat]

==> enum_from [sat,fri]
[sat,fri,thu,wed,tue,mon,sun]

The notation [X1,X2,…,Xn..Y] is provided as syntactic sugar for enum [X1,X2,…,Xn] Y, so you can also write the following:

 
==> [mon..fri]
[mon,tue,wed,thu,fri]

==> [sun,tue..sat]
[sun,tue,thu,sat]

==> [sat,fri..sun]
[sat,fri,thu,wed,tue,mon,sun]

Likewise, [X1,X2,…,Xn..] is the same as enum_from [X1,X2,…,Xn]:

 
==> [mon..]
[mon,tue,wed,thu,fri,sat]

==> [sun,tue..]
[sun,tue,thu,sat]

==> [sat,fri..]
[sat,fri,thu,wed,tue,mon,sun]

We have already mentioned that the built-in Char type is also an enumeration type, which consists of all the characters in the Unicode character set. Thus arithmetic on characters works as expected:

 
==> "a"+5
"f"

==> "z"-2
"x"

==> "8"-"0"
8

The enum operation also applies to characters accordingly:

 
==> enum "a" "k"
["a","b","c","d","e","f","g","h","i","j","k"]

==> ["a".."k"]
["a","b","c","d","e","f","g","h","i","j","k"]

==> ["k","j".."a"]
["k","j","i","h","g","f","e","d","c","b","a"]

The enum_from operation works with Char, too, but this is usually rather useless because the Unicode character set is fairly large and thus it is rather inefficient to construct a complete list of all characters starting at a given value. The right way to work with such huge character enumerations is to take advantage of lazy evaluation and use a stream enumeration instead, see below.

The standard prelude (see The Standard Library) extends succ, pred and enum (but not enum_from, for obvious reasons) to integers, which effectively turns the integers into an (albeit infinite) enumeration type:

 
==> succ 0
1

==> pred 0
-1

==> [0..9]
[0,1,2,3,4,5,6,7,8,9]

==> [0,2..9]
[0,2,4,6,8]

==> [9,8..0]
[9,8,7,6,5,4,3,2,1,0]

The standard prelude also implements enum (but not succ and pred) on floating point values:

 
==> [0.1,0.2..1.0]
[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]

Note that it is also possible to invoke enum or enum_from with more than two members in the first argument. That is, you can construct enumerations with more than two initial members. Neither the interpreter nor the library provides any definitions for these, so it is up to you to introduce your own interpretations of such constructs when you need them. For instance:

 
[X1,X2,X3..Y] = while (<=Y) (F*) X1 if (F>1) and then (Z3=X3)
                  where F:Int = X2 div X1, Z3:Int = X2*F;

This definition employs the standard library function while to construct an increasing geometric integer series if it detects a constant integer ratio between the initial three members:

 
==> [2,4,8..128]
[2,4,8,16,32,64,128]

Tuple enumerations work in exactly the same fashion as list enumerations. The corresponding operations are called tupleenum and tupleenum_from. The same kind of syntactic sugar is provided, the only difference being that ordinary parentheses are used instead of brackets. Example:

 
==> (9,8..0)
(9,8,7,6,5,4,3,2,1,0)

The standard prelude also provides enumerations for lazy lists a.k.a. streams, cf. Streams. The streamenum and streamenum_from functions work like enum and enum_from, respectively, but construct streams instead of lists. In the case of streamenum_from, the produced stream may be infinite. The Q language provides the same kind of syntactic sugar for stream enumerations as for lists, the only difference is that curly braces are used instead of brackets. For instance, {1..9} denotes a stream consisting of a finite arithmetic sequence, while {0..} is the infinite stream of all nonnegative integers.

Because of lazy evaluation, streamenum_from can often be used to work around the inefficiencies of enum_from when you have to deal with huge enumerations of characters. For instance, here is how we can use a stream enumeration to count the number of all alphabetic Unicode characters:

 
==> #filter isalpha {chr 1..}
90342

This will take a little while, but works o.k. because we never actually construct the complete list of all Unicode characters, as would be the case when doing the same with a list enumeration.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8.4 Sub- and Supertypes

The Q programming language also provides a subtype concept similar to the notion of single inheritance in object-oriented programming languages such as Smalltalk. For instance, we can modify our declaration of the BinTree type (cf. Using Type Guards) in order to make BinTree a subtype of the supertype SearchTree as follows:

 
public type BinTree : SearchTree = private const nil, bin X T1 T2;

Quite often, supertypes are abstract types which do not provide their own set of constructor symbols, but are simply used to factor out common operations shared among several "concrete" types. For instance, the SearchTree type might have been declared simply as follows:

 
public type SearchTree;

Now variables of type SearchTree will also match objects of its subtype BinTree, as well as of any other subtype of SearchTree. We can turn the union and diff functions from Using Type Guards, into operations on the SearchTree type as follows:

 
union T1:SearchTree T2:SearchTree
                        = foldl insert T1 (members T2);

diff T1:SearchTree T2:SearchTree
                        = foldl delete T1 (members T2);

As the next step, we might introduce another type AVLTree providing the same interface operations insert, delete and members as the BinTree type, but implementing these operations in terms of AVL trees rather than simple binary trees. (AVL trees are a variant of binary search trees in which the trees are kept balanced, and thus logarithmic insertion and deletion times can be guaranteed.) If we make AVLTree another subtype of SearchTree, then the union and diff operations can be applied to AVLTree objects just as well as to BinTree's. In fact, the operations will even work if we mix argument types, e.g., provide a BinTree as the first argument of union and an AVLTree as the second! By these means, it is possible to define polymorphic operations which are applicable to several different types sharing the same (subset of) interface functions.

For the sake of concreteness, here is an implementation of the AVLTree type. The shl, shr, rol and ror functions implement the common AVL tree rotation operations which are used to keep the tree balanced; see [Bird/Wadler 1988] for details.

 
/* H denotes the height of a nonempty AVL tree */

public type AVLTree : SearchTree = private const anil, abin H X T1 T2;

public avltree Xs;

private mknode X T1 T2;
private join T1 T2, init T, last T, height T, slope T, rebal T;
private rol T, ror T, shl T, shr T;

avltree Xs:List         = foldl insert anil Xs;

members anil            = [];
members (abin H X T1 T2)
                        = members T1 ++ [X|members T2];

insert anil Y           = abin 1 Y anil anil;
insert (abin H X T1 T2) Y
                        = rebal (mknode X (insert T1 Y)) T2 if X>Y;
                        = rebal (mknode X T1 (insert T2 Y)) if X<Y;
                        = abin H Y T1 T2 if X=Y;

delete anil Y           = anil;
delete (abin H X T1 T2) Y
                        = rebal (mknode X (delete T1 Y) T2) if X>Y;
                        = rebal (mknode X T1 (delete T2 Y)) if X<Y;
                        = join T1 T2 if X=Y;

join anil T2            = T2;
join T1 T2              = rebal (mknode (last T1) (init T1) T2) otherwise;

init (abin H X T1 anil) = T1;
init (abin H X T1 T2)   = rebal (mknode X T1 (init T2)) otherwise;

last (abin H X T1 anil) = X;
last (abin H X T1 T2)   = last T2 otherwise;

/* mknode constructs a tree node, computing the height value */

mknode X T1 T2          = abin (max (height T1) (height T2) +1) X T1 T2;

/* height and slope compute the height and slope (difference between
   heights of the left and the right subtree), respectively */

height anil             = 0;
height (abin H T1 T2)   = H;

slope anil              = 0;
slope (abin H X T1 T2)  = height T1 - height T2;

/* rebal rebalances after single insertions and deletions */

rebal T                 = shl T if slope T = -2;
                        = shr T if slope T = 2;
                        = T otherwise;

/* rotation operations */

rol (abin H1 X1 T1 (abin H2 X2 T2 T3))
                        = mknode X2 (mknode X1 T1 T2) T3;
                        
ror (abin H1 X1 (abin H2 X2 T1 T2) T3)
                        = mknode X2 T1 (mknode X1 T2 T3);

shl (abin H X T1 T2)    = rol (mknode X T1 (ror T2)) if slope T2 = 1;
                        = rol (abin H X T1 T2) otherwise;

shr (abin H X T1 T2)    = ror (mknode X T1 (ror T2)) if slope T2 = -1;
                        = ror (abin H X T1 T2) otherwise;

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

8.5 Views

As of version 7.7, Q provides a simple but effective implementation of views, a device originally proposed by Wadler [Wadler 1987] which provides a means to equip abstract data types with concrete representations in terms of "virtual constructors", which can be used for pattern matching just like "real" constructors are used when dealing with concrete algebraic data types. In the Q language, views are also used for deciding syntactic equality of external types (see Non-Linear Equations) and, last but not least, for pretty-printing purposes.

The first ingredient of a view is a definition for the built-in special form view which returns a representation of a data object, usually in terms of a public "construction" function for objects of the type. For instance, given the BinTree and AVLTree types from the previous section, we might provide the following rules:

 
view T:BinTree          = '(bintree Xs) where Xs = members T;
view T:AVLTree          = '(avltree Xs) where Xs = members T;

Here we used the construction functions bintree and avltree to build a representation of binary and AVL trees in terms of their member lists. After adding these rules to the program in Sub- and Supertypes, the representation defined by these rules will be used whenever an object of these types is printed in the interpreter (as well as by the built-in str and write functions). For instance:

 
==> def T1 = avltree [17,5,26,5], T2 = bintree [8,17]

==> union T1 T2; diff T1 T2
avltree [5,5,8,17,17,26]
avltree [5,5,26]

Two things are worth noting here. First, the view function always has to return a quoted expression. A description of the quote constructor `'' can be found in Special Forms; essentially, it just protects its argument from being evaluated. The need for this becomes apparent when considering that views are just expressions which, when evaluated, would construct a value of the type for which the view is being defined; thus the expression returned by view has to be treated as a literal, and the `'' operator takes care of that. Second, if you take a look at the example above, you will notice that we defined the views in such a manner that they would actually reconstruct the original values if you would evaluate them from the command line. It goes without saying that this behaviour is highly recommended, although the interpreter does not enforce this condition.

The second ingredient of a view definition is the declaration of one or more virtual constructors for the type. (In fact this is only needed if you also want to be able to use the view for pattern matching.) In our example, it suffices to turn the construction functions used in the view, bintree and avltree, into virtual constructors of the BinTree and AVLTree types, respectively. To these ends, we remove the public declarations of bintree and avltree and modify the type declarations as follows:

 
public type BinTree : SearchTree = virtual bintree Xs
                                 | private const nil, bin X T1 T2;
public type AVLTree : SearchTree = virtual avltree Xs
                                 | private const anil, abin H X T1 T2;

This lets us use the virtual constructors in pattern-matching definitions just as if they were real constructors. The views of these objects will be constructed on the fly, as they are required during pattern matching. Note that since the virtual constructors, as before, are implemented as public functions, this will even work outside the module where the data types are defined. E.g.:

 
==> def T1 = avltree [17,5,26,5], T2 = bintree [8,17]

==> diff T1 T2
avltree [5,5,26]

==> def avltree [X,Y,Z] = diff T1 T2; (X,Y,Z)
(5,5,26)

Of course, we can also use these "virtual" patterns in equations:

 
test (avltree [X|_])    = X>0;

And they also work in lambdas:

 
==> (\(avltree [X|_]).X) (union T1 T2)
5

It is also possible to define views in a recursive fashion so that, e.g., the tree members are made available one at a time. For instance, we might want to create a list-like view which delivers the tree members in ascending order. Taking the BinTree data type as an example, this can be achieved as follows:

 
public type BinTree : SearchTree = virtual empty, cons X T
                                 | private const nil, bin X T1 T2;

empty                   = nil;
cons X T:BinTree        = insert T X;

private first T;

view nil                = 'empty;
view T:BinTree          = '(cons X T)
                            where X = first T, T = delete T X;

first (bin X nil _)     = X;
first (bin X T1 T2)     = first T1 otherwise;

public bintree Xs;

// ...

Note that the first function returns the smallest member in the tree. The view is defined in terms of two virtual constructors empty (which returns the empty tree) and cons (which takes a value and a tree as parameters and inserts the value into the tree). With these definitions we now have:

 
==> bintree [5,1,9,3]
cons 1 (cons 3 (cons 5 (cons 9 empty)))

==> def (cons X (cons Y _)) = _; (X,Y)
(1,3)

The main advantage of this approach is that, at least for the purpose of pattern matching, the view only needs to be constructed as far as required to extract the desired number of tree elements. In contrast, employing bintree as the virtual constructor, as in our first example above, is much more inefficient since the entire list of members will be computed whenever the view is needed in pattern matching.

Another important point worth mentioning here is that a virtual constructor does not really belong to the type, even though it is declared as one of its members. More precisely, only "proper" objects which were constructed with the real constructors of the type will ever match the corresponding type guard. Nevertheless, it is possible to have definitions like the following, which first try a couple of explicit virtual constructor patterns and then fall back to a guarded variable:

 
type Foo = virtual foo X, goo X | const bar X, baz X;

view (bar X)            = '(foo X);
view (baz X)            = '(goo X);

foo X                   = bar X;
goo X                   = baz X;

test (foo X)            = X;
test X:Goo              = X otherwise;

Note that in this case, if the default alternative is taken (i.e., the last equation for test), the constructed view is returned (after being evaluated) rather than the original expression. Of course this will lead to unexpected results if the view, when evaluated, does not reconstruct the original expression, hence our recommendation that Y ⇒ X if view X ⇒ 'Y.

Another word of caution: Definitions mixing patterns with virtual and nonvirtual constructors of a given type will not work in the current implementation. To avoid backtracking during pattern matching, a view will always be constructed for each parameter which is matched against at least one virtual constructor; in such a case the nonvirtual patterns are effectively disabled. The compiler will warn you about such situations when it is invoked with the -w option (see Using Q). If you get this "overlap between virtual and nonvirtual constructors" warning it means that you should rewrite your definition so that it uses either real or virtual constructors, but not both at the same time. That should be easy enough to do; usually you will use the real constructors inside the module which defines the data type, and virtual constructors otherwise.

Finally, note that views can also be used in exactly the same fashion with external types. The only difference here is that an external type does not have any real constructors and thus the view must be created using the operations provided for the type. For instance, consider an external type Bar which represents pairs of integer values constructed with an external function bar (cf. Writing a Module):

 
public extern type Bar = virtual extern bar I J;

Assume that we have another extern function bartuple which returns the contents of a Bar object as a pair of two integer values:

 
public extern bartuple B;

Now a suitable view for Bar can be defined as follows:

 
view B:Bar              = '(bar I J) where (I,J) = bartuple B;

[ << ] [ >> ]           [Top] [Contents] [Index] [ ? ]

This document was generated by Albert Gräf on February, 23 2008 using texi2html 1.76.