Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/translationofrel936daws 
 
.j/u,o 7 
 
 \ ^J^UIUCDCS-R-78-936 
 
 77^4 
 
 v 
 
 6 
 
 UILU-ENG 78 1729 
 
 A TRANSLATION OF RELATIONAL DATABASE 
 QUERY LANGUAGES INTO SIBYL 
 
 by 
 
 David W. Dawson 
 
 July 1978 
 
A Translation of Relational Database 
 Query Languages into SIBYL 
 
 by 
 
 David W. Dawson 
 
 Department of Computer Science 
 
 University of Illinois 
 
 Urbana, Illinois 
 
 July 1978 
 
for Mary and Richard 
 
CONTENTS 
 INTRODUCTION 
 
 VL 
 
 SQUARE 
 
 The metalanguage 2 
 SIBYL 5 
 Relational databases 6 
 The three languages: comparisons 7 
 Example database 8 
 
 lLPHA 
 
 10 
 
 Data 
 
 Basic Forms 
 
 Major Productions 
 
 Procedures 
 
 References 
 
 11 
 12 
 13 
 26 
 30 
 
 
 31 
 
 Data 
 
 Basic Forms 
 
 Major Productions 
 
 Procedures 
 
 References 
 
 33 
 34 
 36 
 47 
 51 
 
 .E 
 
 52 
 
 Data 
 
 Basic Forms 
 
 Set Expressions 
 
 Statements 
 
 Procedures 
 
 References 
 
 53 
 54 
 56 
 61 
 67 
 68 
 
 .USIONS 
 
 70 
 
 The Languages 
 Translation 
 SIBYL 
 Comments 
 
 70 
 72 
 74 
 75 
 
 BIBLIOGRAPHY 77 
 
INTRODUCTION 
 
 In the past few years about a dozen query languages have 
 been designed for use with relational data bases . This 
 report formally defines three of them — ALPHA [6] , VL [13] 
 and SQUARE [3] — by giving syntax-directed translations 
 into SIBYL, a language designed in part to define the semantics 
 of programming languages . The definitions of both syntax and 
 semantics are precise and formal, and in the presence of a 
 SIBYL interpreter can be used to implement the languages . 
 
 To fully understand the translations presented in the 
 next three chapters, the reader should understand relational 
 databases and have a working knowledge of SIBYL. (For the 
 latter, see [9] and [10] . ) Most readers will not wish to be 
 so thorough; by reading the Introduction and Conclusion they 
 can get a good impression of the contents of this report, 
 its purpose, and its findings. By skimming the definitions 
 and examining their complexity, the reader can investigate 
 the relative merits of the query languages and of the utility 
 of this definition scheme . 
 
 Five parts make up the remainder of this introduction. 
 They describe the metalanguage used to define the syntax- 
 directed translation, the SIBYL language, relational data- 
 bases, a broad comparison of the languages, and the example 
 database used in illustrating the query languages. 
 
The Metalanguage . 
 
 Although the metalanguage is supplemented by examples 
 and commentary, and although a few translations are given 
 informally by English descriptions, still the metalanguage 
 describes the language rigorously. Our metalanguage consists 
 of extended BNF productions, together with translations of 
 the right sides of productions into SIBYL code. Thus syntax 
 and semantics are both distinguished and appropriately 
 connected. 
 
 For example, suppose we want to define a tiny subset of 
 LISP (the PLUS operator) in terms of FORTRAN. We do this by 
 writing the syntax-directed translation schema 
 
 * Sum ■*■ ( PLUS num, nunu)^ ** ^num, + nun^ 
 
 * Num ■+ ( 0|1 ) + 
 
 Note that the two productions on the left form a context-free 
 grammar for our LISP subset. In particular, (PLUS 10 Oil) 
 is a vald sentence, which is translated to (10 + 022) . This 
 defines a translation function T, which is applied as 
 follows: 
 
 T ( ( PLUS 10 Oil) ) = ( T (10) + T (011) ) 
 
 = ( 10 + 011 ) 
 Object language sumbols, both of the query languages and 
 of the SIBYL code, are underlined (e.g., #, GET , if R. protected 
 = T ) . Normally query language letters are upper case and SIBYL 
 is lower case, but this is by no means inflexible, as the last 
 example above shows. 
 
Blanks are ignored; it is assumed that they are used 
 when necessary and are otherwise non-significant except 
 inside identifiers, and primitive values. 
 
 Nonterminals of the grammar, like other metasymbols, :re 
 never underlined, and are always written in lower case, in 
 possibly abbreviated form when they occur in the right side 
 of productions. On the left side of productions, they are 
 written out in full, with the first letter only capitalized, 
 and often preceded by the standard abbreviation. The pro- 
 duction symbol itself is '-*■'. For example, the VL non- 
 terminal 'Relational table condition' is abbreviated 'rtcond 1 
 On the right side of a production it appears as -*■ ... rtcond 
 while on the left it appears as ... rtcond: Relational table 
 condition -+■ . . . . 
 
 Several metasymbols (never underlined) make the grammar 
 more concise and readable. 
 
 -- denotes alternative productions, a -> b c is 
 equivalent to the two productions a -*■ b and a ■> c. 
 (Lower precedence than concatenation.) 
 ( ) -- groups forms together. 
 
 * -- or more occurrences of the preceding structure. 
 
 + -- 1 or more occurrences of the preceding structure. 
 
 ? -- or 1 occurrences of the preceding structure, 
 subscripts -- denote different occurrences of a nonterminal 
 
 which must be distinguished in the translation. 
 Our grammar is therefore quite similar to one popular variant 
 of extended BNF. 
 
Where formal definitions are not given (but could be 
 given if necessary) , English explanations are enclosed in 
 angle brackets. Example: 
 
 str: String -*• <standard string in quotes > 
 
 Finally, every production is preceded by '*', often fol- 
 lowed by an index number which helps to classify the produc- 
 tions . For example, the seven basic classes of sentence in a 
 language would be preceded by *1 . , *2. t ..., *1 . , while *2.1, 
 *2.2 and *2.3 would denote nonterminals used mainly in *2 . 
 
 An example of a simple production is the SQUARE 
 variable list : 
 
 * 1.1 varlist : Variable list ■*■ var (^ var)* 
 
 If X, Y and GEORGE are 'var's, then the following are 
 
 'varlist 's : 
 
 X 
 
 Y, GEORGE, X 
 
 X , X 
 
 Translations into SIBYL always follow their syntax 
 productions, and are always preceded by '**' . Nonterminals 
 are to be further translated, according to the appropriate 
 rules . The semantic rule corresponding to the SQUARE example 
 above is 
 
 ** (_ var (j_ var) * j)_ 
 In many places, as here, the exact correspondence between 
 symbols (the two instances of 'var' here) is made clear to 
 the reader by the context. (This informality is not serious.) 
 
 How do we translate the three examples above? The rule 
 for 'var' is 
 
* var: Variable -* freevar compvar 
 
 ** freevar (dom compvar @ -*• d; 
 
 ^compvar x d , ( ) ) 
 
 I 
 
 If X is in fact a 'compvar', then the first example (X) 
 translates to 
 
 (( dom X @ - d; 
 
 (X, d, ( ) ) 
 
 ) , ) 
 
 If the SIBYL translation is identical in form to the 
 right side of the production, then it is omitted. 
 
 To aid in understanding, many SIBYL constructions have 
 been grouped into procedures and listed at the ends of 
 chapters . 
 
 A good reference for syntax-directed translation is 
 Aho and Ullman [l] . It should be noted that our translation 
 scheme is not a simple syntax directedttranslation in the 
 sense of [1] . 
 
 SIBYL. 
 
 SIBYL is an experimental programming language with the 
 usual control structures (conditionals, loops, case state- 
 ments, procedures) and data structures. These include 
 
 numbers 5 -35 56.32 
 
 character strings 'ABC ' OLD MC " DONALD ' 
 lists (1, 2, 3) (A,B,3) ('A', 'B',) (5,) 
 records (A:5, B:6) (NAME: 'SMITHY', AGE: 15) 
 
 Since lists and records can contain lists and records 
 
 as values, complex data structures can be formed. A rich 
 
set of operators is provided, including arithmetic and logical 
 operators and operators for concatenating and decomposing 
 strings, lists, ana 1 records. SIBYL is designed to have 
 enough power to permit unusually compact definitions, and 
 to simulate the environment and structures of all other 
 programming languages . 
 
 One purpose of this report is to see how well SIBYL per- 
 forms its intended primary function of defining other 
 languages . 
 
 For detailed descriptions of SIBYL, see [9] and [10] . 
 
 Relational Databases. 
 
 A relational database is basically a collection of 
 tables; see the example database at the end of this introduc- 
 tion. Each table has a name, as does every column of the 
 tables. Tables are called "relations" or "relational tables"; 
 columns may be "domains", "attributes" (ALPHA), "descriptors" 
 (VL) , "fields", or "columns" (SQUARE); and the entries of the 
 tables are "tuples", or sometimes "rows". Certain domains 
 which uniquely determine the rest of a tuple may be singled 
 out as "key fields". 
 
 Language syntax usually doesn't specify: the exact form 
 of relations. We have chosen to let a tuple be a set of 
 SIBYL records, so that the value of the relation STAR is 
 
 ( ( S#:0l, SNAME: 'SUN' , POSITION: 202020 , SCLASS:*G'), 
 
 • 
 
 ( S#:57, SNAME: 'SIRIUS B', POSITION: 162722, SCLASS.-'F'), ) 
 
For discussions of relational databases, see {5], [8] 
 and [12] . 
 
 The Three Languages : Comparisons . 
 
 ALPHA, VL and SQUARE have several features in common: 
 (1) they are fairly new; (2) they are intended to be data 
 manipulation sublanguages of larger computer languages (and 
 in the case of VL, of a particular language, VL1) ; (3) each 
 provides facilities for creating, destroying and modifying 
 relations as well as retrieving data from them, but retrieval 
 would almost certainly be the most widely used. 
 
 On the other hand, there are many interesting differ- 
 ences as well. Only ALPHA has non- rudimentary protection 
 and access control features . The syntax of ALPHA is 
 deliberately not made precise, while BNF grammars appeared 
 early in the history of both VL and SQUARE. The SQUARE 
 language has an unusual 2-dimensional syntax, and alternative 
 syntaxes have appeared. The importance of key fields varies 
 from great (in ALPHA) to minimal (they are not mentioned in 
 SQUARE) . ALPHA is based on the precise but unwieldly 
 predicate calculus, while VL and SQUARE attempt to make 
 querying as easy as possible for unsophisticated users . 
 
 Several queries in these languages and others are com- 
 pared in [13] , while [ll] has a very long list of queries in 
 ALPHA, SQUARE and several other languages, but not VL. 
 
 We have omitted some language features, both deliber- 
 ately and (no doubt) accidentally. Details have been 
 
8 
 
 omitted which would add little to the explanation but greatly 
 complicate the implementation. For example, piped mode in 
 ALPHA is not implemented, and there is error checking only 
 in ALPHA. 
 
 Example Database . 
 
 We will use the following relational database to 
 illustrate the languages. Key fields are underlined. 
 
 STAR (S#, SNAME, POSITION, SCLASS) 
 
 PLANET ( S#, P# , PNAME PTYPE) 
 
 VEHICLE (V#, VCLASS, VNAME) 
 
 MISSION ( V#, S#, MDATE , MTYPE) 
 
 COLONY (C#, CNAME, S#, P#, CDATE) 
 
 STAR 
 
 S# 
 
 SNAME 
 
 01 SUN 
 
 10 RIGIL 
 
 11 
 
 12 PROXIMA 
 
 23 TAU CETI 
 
 56 SIRIUS A 
 
 57 SIRIUS B 
 
 POSITION SCLASS 
 
 202020 
 191816 
 191816 
 
 191816 
 262918 
 162722 
 162722 
 
 G 
 
 G 
 K 
 M 
 G 
 
 A 
 F 
 
 PLANET 
 S# P# 
 
 01 
 01 
 01 
 12 
 23 
 23 
 56 
 
 03 
 04 
 06 
 01 
 02 
 05 
 01 
 
 PNAME 
 
 EARTH 
 
 MARS 
 
 TITAN 
 
 GAGARIN 
 
 PLATEAU 
 
 KAMPEN 
 
 CYON 
 
 PTYPE 
 
 E 
 M 
 I 
 M 
 E 
 I 
 R 
 
 
 
VEHICLE 
 
 
 
 
 
 v# 
 
 VCLASS 
 
 
 VNAME 
 
 P078 
 
 PROBE 
 
 PIONEER 
 
 P323 
 
 PROBE 
 
 
 
 
 X2 
 
 PROBE 
 
 
 ZOND 
 
 
 C29 
 
 CARGO 
 
 
 
 
 S09 
 
 SCOUT 
 
 
 
 
 S12 
 
 STARSHIP 
 
 
 ENTERPRISE 
 
 S15 
 
 STARSHIP 
 
 
 SCORPIO 
 
 MISSION 
 
 
 
 
 
 v# 
 
 s# 
 
 MDATE 
 
 MTYPE 
 
 
 P078 
 
 01 
 
 1981 
 
 
 F 
 
 
 P215 
 
 10 
 
 2170 
 
 
 
 
 
 P215 
 
 11 
 
 2170 
 
 
 
 
 
 P215 
 
 12 
 
 2170 
 
 
 F 
 
 
 X2 
 
 23 
 
 2195 
 
 
 L 
 
 
 S09 
 
 56 
 
 2210 
 
 
 E 
 
 
 S09 
 
 57 
 
 2210 
 
 
 E 
 
 
 S12 
 
 23 
 
 2221 
 
 
 C 
 
 
 S15 
 
 12 
 
 2225 
 
 
 C 
 
 
 S15 
 
 56 
 
 2240 
 
 
 C 
 
 
 S15 
 
 57 
 
 2240 
 
 
 E 
 
 
 S12 
 
 56 
 
 2245 
 
 
 C 
 
 
 COLONY 
 
 
 
 
 
 c# 
 
 CNAME 
 
 s# 
 
 01 
 
 04 
 
 CDATE 
 
 02 
 
 LOWELL 
 
 
 2004 
 
 03 
 
 BELLONA 
 
 01 
 
 04 
 
 2010 
 
 04 
 
 LANG 
 
 
 01 
 
 06 
 
 2015 
 
 53 
 
 NEW EARTH 
 
 23 
 
 02 
 
 2221 
 
 55 
 
 FAROUT 
 
 
 12 
 
 01 
 
 2225 
 
 58 
 
 PRIME 
 
 
 56 
 
 01 
 
 2240 
 
 ■ 
 * 
 
 I 
 
 I 
 
 I 
 
 n 
 
DSL ALPHA 
 
 The best known and most widely cited relational data- 
 base query language is DSL ALPHA (Data Sub-Language ALPHA), 
 which was developed by E. F. Codd of IBM [6] . Among its 
 noteworthy features are: 
 
 1. Data can be both retrieved and modified. 
 
 2. Access to data is through "workspaces", which are 
 local to users, as opposed to the database proper, 
 with which workspaces provide communication. 
 
 3. Queries are based on the predicate calculus familiar 
 to mathematicians and logicians. The query language 
 is therefore very powerful and elegant, but quite 
 different from English. 
 
 4. ALPHA syntax has never been rigidly defined, but is 
 supposed to depend on the host language of which it 
 is a part. (Thus ALPHA within FORTRAN would look 
 different from ALPHA within COBOL) . The published 
 versions of ALPHA in fact differ in several ways 
 (see, for example, [6], [7] and [8]). We have based 
 the ALPHA of this chapter on that of a popular early 
 source, [6] . 
 
 There are a few differences between our ALPHA and that 
 of [6]: 
 
11 
 
 1. Codd provided for an option "MAYBE TOO" to handle 
 undefined fields in the database. We have not. 
 
 2. [6] distinguishes between "domain names," which 
 may occur twice in a relation, and "attribute 
 names," which are by definition distinct. For 
 example, a relation R might have fields (A,B,A,C). 
 The system would distinguish the two occurrences 
 of domain name A as two attributes, probably by 
 adding suffixes . We require domain names to be 
 distinct, so that for us, domain = attribute. 
 
 3. We have no piped mode, which permits a user to 
 move data in or out of a workspace tuple by 
 tuple. 
 
 4. ALPHA normally permits use of functions and con- 
 stants in target lists; see [6]. For simplicity's 
 sake we have ignored these. 
 
 Without piped mode, a workspace can be in any of six 
 states (see Fig. 1 on next page). 
 
 Data . There are four data types within ALPHA: relations, 
 workspaces, attributes /domains and tuple variables. In 
 this description, each is represented by a kind of record 
 with one value field (except for attributes) and several 
 fields for useful information. To the user, relations and 
 workspaces are both simply relational tables, but the 
 run-time information needed by the system is different . 
 
complete 
 HOLD 
 
 start UPDATE 
 
 12 
 
 T = ready to transmit P = put 
 R = ready to receive H = hold 
 G = get U = update 
 
 Figure 1 
 (after Fig. 2 in [6], p. 13) 
 
 Our implementation requires two special structures 
 a list 'tvars 1 of declared triple variables, and a list 
 'free' of free variables in qualifications. 
 
 Basic Forms 
 
 * alist: Attribute list * j(aname ( ^aname ) *)_ 
 
 ** J^aname ( ^aname ) * , ) 
 
 Examples . 1. (DATE) 
 
 2. (S#, DATE, MTYPE) 
 
 * aterm: Attribute term ■* rname^ aname 
 
 Examples : 1. PLANET. PNAME 
 2. VEHICLE. V# 
 
 rname 
 
 Relation name -► name 
 
13 
 
 ** if name . protected then Error (4) 
 
 else name fi; 
 A relation cannot be accessed if it is protected; for 
 example, while another user is updating it. 
 
 * aname: Attribute name ■*• name 
 
 * wname: Workspace name •> name 
 
 * tvar: Tuple variable ■* name 
 
 * name: Name ■* let name (let num # und) 
 
 The original ALPHA permits # and _ within names. 
 
 * let: Letter ■* A | B | C | ••*2|Z 
 
 * num: Numeral -*■ 0|lJ,2| ••• |9_ 
 
 * #: Pound sign ■*- <if available, #, else = N0> 
 
 * und: Underline ■*■ aBi 
 
 Major Productions 
 
 * o. 
 
 ALPHA 
 
 Statement ■*■ declare get hold alter 
 
 put update release destroy 
 
 * 1 
 
 declare 
 
 Declare '■* domst relst ranst 
 
 * 1.1 
 
 domst 
 
 Vc* 
 
 Domain statement ■+ DOMAIN aname hltype 
 (aname : (type : ' domain ' , 
 
 data L n ^- t yP e ^_ 
 
 relations: () )_ 
 
 )& env -*■ env; 
 
 * 1.1.1 hltype 
 
 Host language type -»- <data type available 
 
 in host language> 
 
* 1.2 
 
 relst : 
 
 ** 
 
 14 
 
 Relation statement -*■ RE> 'ON rname alist^ 
 
 Kii'f (a^ist^ aname) 
 ( (alistn = %)V =t> 
 Error (0) 
 
 / Subset ( (alist 2 1 (aname,)) .alist-i ) =» 
 Error (1) . 
 
 (rname : (type 
 
 attributes 
 
 key 
 
 tvar 
 
 'relation' , 
 
 alist 
 
 (alist^ 1 (aname , ) ) 
 (range: rname, 
 
 value : %) , 
 protected : F, 
 value j_Q )_ 
 
 )&env •» env; 
 
 rname .-»■> ( a list | .(§ relations) ; 
 
 ); 
 
 Attributes must be created by DOMAIN statements before 
 
 relations which use them can be defined. Those attributes 
 
 which make up a key must of course be included in the 
 
 attributes of the relation. The "tuple variable" rname^tvar 
 
 ranges over rname^value in quantified retrievals . 
 
 Examples . DOMAIN S# INTEGER (2) 
 
 DOMAIN SNAME CHARACTER (24) 
 DOMAIN POSITION INTEGER (6) 
 DOMAIN SCLASS CHARACTER (2) 
 RELATION STAR (S#, SNAME, POSITION, SCLASS) KEY S# 
 
 RELATION MISSION (V#,S#,DATE,MTYPE) KEY (V#,S#,DATE) 
 
)) 
 
 ' domain ' 
 
 'INTEGER(2)', 
 
 (STAR, MISSION,) )) 
 
 15 
 
 The environment now contains the following records, among 
 others : 
 
 (STAR ,:(type : 'relation 1 , 
 
 attributes : (S# , SNAME , POSITION , SCLASS , ) , 
 key :(S#,) 
 tvar : (range : STAR, 
 value : % ) , 
 protected : F 
 value 
 (S# : (type 
 data 
 
 relations 
 
 The empty relations STAR and MISSION can be filled by PUT 
 statements with records like 
 
 (V#:'P078*, S#: 01, DATE: '1981', MTYPE : 'F ! ). 
 
 *1.3 ranst: Range statement -* RANGE (rname |wname) 
 
 tvar (rsquant)? 
 ** _(tvar : (type : 'tvar', 
 
 range : (rname Iwname)^ 
 quant : ( ' SOME ' (j_ rsquant) ?)^_ 
 value : % ) 
 )& env ■» env; 
 
 Remove (t var , tvars) ; tvar -» tvars; 
 *1.3.1 rsquant: Range statement quantifier -> SOME | ALL 
 
 ** 'SOME' I 'ALL 1 
 
16 
 
 The tuple variables declared in RANGE statements are 
 used during retrievals as free and bound variables. Like 
 workspaces, they are local to each user. The user may 
 associate a quantifier with tuple variables at declaration 
 time; if none is explicitly mentioned, the existential quan- 
 tifier ('SOME') is assumed. For details of their use, see 
 rule 2 below. 
 
 Examples . 1. RANGE STAR X 
 
 2. RANGE PLANET P ALL 
 
 3. RANCE STAR X SOME 
 
 Tables are retrieved from the database into workspaces 
 by GET or HOLD statements . The former merely presents the 
 data for inspection; the latter enables the user to alter 
 the data through host language statements. 
 
 * 2. get: Get ■+ GET (rname |wname) 
 
 [ GET (_(quota)_)? wname 
 
 tlist( 2. qexp)? (ord+)? 
 ** Display ( ( rname | wname ) )j_ 
 
 ( wname = % =» 
 
 _(wname : (type : 'workspace ' , 
 
 targetlist : tlist 
 
 sources : Rectify ( tlist @ 1) , 
 
 tvar : (range: wname x 
 
 value : %) , 
 state : 'RT', 
 
 value : () ) 
 
17 
 
 ) & env -*• env 
 
 h. 
 
 _( wname . state/='RT' =» Error (5) 
 ± wname . state <- 'busy G'; 
 
 wname .value <- Retrieve (t list 
 J^Oqexp)?]^ 
 
 ( Sort 1 (w name . value , (o rd+ ) ) ; ) ? 
 (wname .value ♦- (w name .value) . @(1 . . quota) ; ) ? 
 Display (w name) ; 
 wname . state ♦- 'RT 1 ; 
 ). 
 
 Productions for quota, targlist, qexp and ord follow the 
 examples . 
 
 Examples . 1. GET STAR 
 
 In this simplest form of a GET there is no work- 
 space; the contents of STAR are simply displayed in table 
 form on the output file. 
 
 2. GET W STAR 
 
 The same, except that now the tuples in STAR are 
 copied into a workspace W. 
 
 3. GET W (STAR.SNAME, STAR. POSITION) 
 
 Only two fields of STAR are retrieved. The SIBYL 
 execution environment now contains this record: 
 (w: (type : 'workspace ' , 
 
 targetlist: ((STAR, SNAME,), (STAR, POSITION,),), 
 sources :(STAR,), 
 
18 
 
 tvar : (range: W, 
 
 value: 7o) , 
 state : 'RT', 
 value : ( (SNAME: 'SUN', POSITION: 202020), 
 
 (SNAME: 'SIRIUS B 1 .POSITION: 162722) ,) )) 
 
 4. GET W (4) STAR 
 
 With the quota, only the first four tuples are 
 retrieved into W. 
 
 5. GET W (STAR. SNAME, STAR. POSITION): STAR.SCLASS = 'G\ 
 Retrieval with qualification. "Find the names and 
 
 positions of all G class stars." The printed result is 
 
 W 
 
 NAME POSITION 
 
 SUN" 202020 
 
 RIGIL 191816 
 
 TAU CETI 262918 
 
 The qualification ("predicate" in [5] , "quantified 
 
 expression" (qexp) in these notes) admits the full power 
 
 of the predicate calculus. 
 
 6. RANGE STAR X 
 
 GET W (PLANET. PNAME) :3 X(X. S# = PLANET. S# 
 
 AX.SCLASS = 'G') 
 
 Or, 
 
 RANGE STAR X SOME 
 
 RANGE PLANET P 
 
 GET W (P.PNAME): X.S# = P.S# A X.SCLASS = 'G' 
 
 "Find the names of all planets around G class stars;" 
 
 or, "Find the names of all planets such that there is a 
 
 star X for which the planet and X have the same star 
 
 number and the class of X is 'G'." The second query 
 
19 
 
 uses a default quantifier and a tuple variable within the 
 target list . 
 7. GET W STAR UP POSITION DOWN S# 
 
 The contents of W are sorted, primarily by ascending 
 position, secondarily by descending S#. 
 
 *2. 1 quota: Quota ■* <integer expression in host language> 
 
 *2.2 tlist: Target list -*■ target |_(target (^target) *)_ 
 
 ** (target,) 1 (target (.target)*,) 
 
 *2.2.1 target: Target -* 
 
 rname^aname ** (rname . tvar , a name) 
 | wname^aname ** (wname . tvar , aname) 
 | tvar ^aname ** j( tvar^aname)_ 
 rname ** (rname . tvar,) .& ( rname . at tributes )@ 
 wname ** (wname . tvar , ) .&( (w name /. target list) 
 
 @2)@ 
 tvar ** (tvar , ) . & (t var ,. range . attributes )@ 
 A qualified target is converted into a pair, the first 
 entry of which is a tuple variable. If there is no attribute 
 qualifier, all the tuples are. wanted, so a list of all possible 
 pairs is generated. 
 
 *2.3 ord: Ordering -> (UP | DOWN ) rname ^ aname 
 
 ** (( 'UP' | 'DOWN' ), aname )^ 
 
 *2.4 qexp: Quantified expression •> 
 
20 
 
 ** 
 
 (s5 | V tvar (s5l \ 3 tvar £s5^) 
 s5 (Remove (t var , free); 
 
 tvar . range@ -*• tvarj_ s5 ) A 
 (Remove ( t var , free); 
 
 tvar . range @ + tvarj_ s5 ) V 
 
 * 
 
 s5 
 
 ->• 
 
 s4 
 
 s5 _** s4 
 
 ** 
 
 s5 = s4 
 
 * 
 
 s4 
 
 -> 
 
 s3 
 
 s4 j± s3 
 
 ** 
 
 s4 <^ s3 
 
 * 
 
 s3 
 
 -v 
 
 s2 
 
 s3 V. s2 
 
 ** 
 
 s3 V_ s2 
 
 * 
 
 s2 
 
 -*■ 
 
 si 
 
 s2 A_ si 
 
 ** 
 
 s2 _A_ si 
 
 * 
 
 si 
 
 -> 
 
 sO 
 
 ^2 sO 
 
 ** 
 
 / sO 
 
 
 sO 
 
 -*■ 
 
 terr 
 
 [i pred term 
 
 <logical expres 
 
 
 
 
 host language> 
 
 
 
 * term -*■ rname^aname 
 wname^aname 
 tvar . aname 
 
 ** (rname . tvar .value) . aname 
 ** (wname . tvar . value) . aname 
 
 ** £tvar ■»> free ; 
 <constant in host (tvar . value) . aname) 
 language > 
 Various alternatives are in use for the logical connec- 
 tives of s5 through si. Syntactically, the language of qexp 
 down to sO and the predicate calculus are identical. 
 
 The list 'free' keeps track of free tuple variables. 
 Whenever one is used in a term it is appended to 'free 1 ; when 
 it is quantified it is removed. When all the qexp has been 
 generated and the variables in the target list are taken into 
 account , the remaining free variables are quantified by the 
 procedure Quantfree. 
 
21 
 
 *3. hold: Hold ■+ HOLD wname tlist (j_qexp) 
 ** _£ wname = % =» 
 
 (wname : (type : 'workspace' , 
 
 targetlist : tlistj_ 
 
 sources : Rectify( tlist @ 1) , 
 
 tvar : (range : wname^ 
 
 value ; %) , 
 
 state : 'RT', 
 
 value : ( ) )_ 
 
 ) & env ■> env 
 
 ( len (w name . sources)/ =1 =» 
 Error (3) 
 
 j_ wname . state/ = 'RT' =» " 
 Error (5) 
 j_ wname . state «- 'busy H'; 
 
 (wname . sources (1)) .protected 4- T; 
 wname . value <- Retrieve (t 1 i s tj_ 
 
 (CT(;qexp)? l )); 
 wname . state +- 'TU'; 
 
 _2i 
 
 The HOLD statement, like the GET, retrieves tuples from 
 the database into a workspace, but the workspace is not dis- 
 played and the HOLD makes it possible to change the permanent 
 data through subsequent alteration and UPDATE. 
 
 Examples . 1 . HOLD W VEHICLE 
 2. RANGE VEHICLE V 
 
 HOLD W (V.V#, V.VNAME): V.VNAME = 'CODD' 
 
22 
 
 Only one relation or tuple variable is permitted in the 
 target list, so that the subsequent UPDATE can be defined. 
 The relation is placed in a protected state. In addition, 
 that relation's key must be included in the list of attributes 
 given or implied in order that the UPDATE be possible. 
 
 *4. alter: Alteration ■* wname . aname <right side of 
 
 assignment statement in host 
 language > 
 ** ( (w name . state = ( "HJ ' , 'busy U'))V =» 
 
 wname . st ate ♦» 'busy U'; 
 (wname . value @ -» x; 
 
 x . aname <assignment> 
 
 > 
 
 , Error (5) 
 
 The host language itself provides alterations to work- 
 spaces, most typically by assignment statements, although 
 procedures could also be used. 
 
 Example . After the second HOLD above, we want to change the 
 name of the scout ship Codd . 
 
 W.VNAME = 'DOLPHIN' 
 
 *5. update: Update •* UPDATE wname 
 
 ** JL wname . state/= 'TU' =» Error (5) 
 
 len (wname . sources)/ = 1 =» Error (3) 
 
23 
 
 , /Subset ( (w name . sources (1)) .key, 
 
 wname_. targetlist @ 2) =» 
 Error (1) 
 , wname . state ♦- 'busy U 1 ; 
 R «- w name . sources (1) ; 
 (wname . value @ •*• x; 
 1 . . len (R. value) @ -» n; 
 
 if (x . (R . key) = (R , value (n) ) . (R . key) )Athen 
 (R. value (n) . (w name . targetlist@2)) . 4- 
 x. (w name . target list @ 2) 
 fi; 
 ii. 
 
 R. protected — F; 
 wname . state «- 'RT'; 
 
 2i 
 
 If the workspace is in the proper state, if only one 
 relation is associated with it, and if the relation's primary 
 key is included among the workspace's attributes, then the 
 relation is updated in the following way: If t is a tuple in 
 the relation whose values on key fields agree with values in 
 the key fields of some tuple s in the workspace, then t is 
 changed to agree with s. 
 
 Example . UPDATE W. 
 
 Tuple in W: (V# 
 
 Old tuple in VEHICLE: (V# 
 
 New tuple in VEHICLE: (V# 
 
 '509', VNAME: 'DOLPHIN') 
 '509',VCLASS: 'SCOUT' , VNAME: 'CODD') 
 *509',VCLASS: 'SCOUT' , VNAME: 'DOLPHIN') 
 
24 
 
 *6. release: Release ■+ RELEASE wname 
 
 ** if (w name . stat e/= (' TU ', 'RT'))Vthen Error(5) 
 
 else wname. state •*- 'RT'; 
 
 • - . . . . i . . ■ i ■ ■ 
 
 wname . sources .protected «- F; 
 
 fi; 
 
 A RELEASE puts the workspace back in a ready state with- 
 out performing an update (i.e., it aborts the update). 
 
 *7. put: Put ■> PUT wname rname^alist)? (ord+)? 
 
 ** if wname . state /= 'RT' then Error (5) 
 
 else wname . state — 'busy P 1 ; 
 
 (wname .value-*- Sort 1 (w name . value , 
 
 ord+2^ ) ? 
 (wname . value @ -*■ x; 
 
 x(^_alist) ? ^> rname . value 
 
 h. 
 
 fi; 
 
 Unlike the UPDATE, which merely changes tuples already 
 in a relation, a PUT adds tuples. However, the original 
 ALPHA report doesn't specify how the workspace itself is to 
 be defined and filled. 
 Examples . 1. PUT W COLONY 
 
 Insert all the tuples in W into the relation COLONY. 
 
 2. PUT W COLONY UP S# 
 Same, except insert tuples by increasing S#. 
 
 3. PUT Wl VEHICLE. (V#, VCLASS, VNAME) 
 Insert the V#, VCLASS, VNAME fields, in that order, of 
 
 the tuples in Wl into VEHICLE. 
 
25 
 
 *8. destroy: Destroy ■* delete drop 
 
 ALPHA provides two types of statement for destroying 
 data. In a DELETE a relation or workspace is emptied of its 
 contents but stays in the database; a DROP removes the data 
 item. 
 
 *8.1 delete: Delete ■*- DELETE (rname | tvar) 
 
 | DELETE (rname | tvar) j_ qexp 
 | DELETE wname 
 ** (rname tvar . range ) .value *■ ( ) ; 
 
 ** x +- ( rname . tvar | tvar ); R •*- x. range. value; 
 
 (R @ -» x. value; 
 
 if qexp then 
 
 (R / = x. value) . @ R 
 fi; 
 
 1l 
 
 ** <same as UPDATE, except that first line is 
 
 if (wname . stat e/=('TU' , 'RT')A ) then Error(5) 
 and the 7th line from the bottom is 
 R. value (n) «- nil > 
 
 *8.2 drop: Drop + DROP name 
 
 >'cVf 
 
 if name . type = 'tvar' then Remove (n ame , tvars ) fi; 
 if name . type = 'domain' then 
 
 if name . relations = ( ) then % -> name 
 
 else Error(2) 
 
 fi 
 else % ■+ name 
 fi; 
 
26 
 
 Domains cannot be dropped if they are still being used 
 by relations. 
 Examples . 1. DELETE Wl 
 
 The relation associated with Wl is "updated"; except 
 that tuples are removed instead of changed. 
 
 2. DELETE X 
 
 The relation associated with X is emptied. 
 
 3. DROP X 
 
 The tuple variable X is dropped. 
 
 4. DELETE PLANET 
 
 The relation PLANET is emptied. 
 
 5. DELETE PLANET : PLANET. S# =57 
 
 All PLANET tuples whose S# is 57 are eliminated. 
 
 6. DROP PLANET 
 
 The relation PLANET is removed from the database. 
 
 7. DROP S# 
 
 If no relation uses it, the domain S# is dropped. 
 
 Procedures 
 
 1. Error: [n $ 'Error: ' & 
 
 (n// = =» 'attribute not declared', 
 
 = 1 =* 'key not included in attribute list', 
 = 2 => 'domain still used in a relation', 
 = 3* 'attempt to change more or less 
 
 than 1 relation', 
 
 
27 
 
 = 4 =* 'relation protected 1 , 
 = 5 =* ' WS in wrong state ' 
 ) -» out ] 
 
 Rectify : [(A:ref A) $ (x:0, temp: ( ) ) $ 
 
 (A @ + x; x /= % =» 
 
 / (x = temp V) =» x -» temp) ; 
 
 A *- temp ] 
 Routine to eliminate redundant and null members of a list . 
 
 The business part of GET and HOLD statements. There are 
 two arguments: 'tlist', the list of target pairs generated 
 in 2.2.1, and ' lproc ' , the Boolean procedure generated 
 from a quantified expression in 2.4. 
 Retrieve: T (tlist , lproc) & env $ (targvars : ( ) ) $ 
 
 targvars «- Rectify (tlist @ 1) ; 
 
 free «- ( ) ; 
 
 lproc *- Quantfree (f ree, targvars, tvars , lproc) ; 
 
 (Cprod (targvars @ range)@.+ (targvars @ value); 
 lproc => 
 ((tlist@l)@(tlist@2)).:(tlist@2) 
 
 ) ] 
 
 The local variable 'targvars' is a list of all the distinct 
 
 tuple variables used in the retrieval target . The procedure 
 Quantfree assigns default quantifiers to free variables 
 that do not occur in the target list. 
 
28 
 
 4. Quantfree: V (free.targvars ,tvars ,lproc) & env $ 
 
 free*- Setdiff (free, targvars) ; 
 (tvars @ ■+■ x; In (x,free) =*■ 
 
 r «*- r & (lproc: if x. quant = 'SOME' then 
 [(x. range @+x. value; lproc)VJ 
 else 
 
 [(x. range @->-x. value; lproc) A J 
 fi) 
 
 )] 
 
 First, the list 'free' of free tuple variables is 
 
 adjusted by subtracting the target list variables. Then 
 the remaining free variables are assigned quantifiers, in 
 reverse order of their declaration. This is done by running 
 through 'tvars', and replacing 'lproc' by successively more 
 quantified versions . 
 
 5. Cprod: [A & env $ (x:0, y:0) $ 
 
 (len A // - *.<().) , 
 
 > = 1 => (hd A @ + x; 
 
 Cprod (tl A) @ + y; 
 (x,) & y 
 ) 
 
 >] 
 Cprod is a recursive procedure to form the Cartesian 
 
 product of a list of lists . 
 
29 
 
 6. Setdiff: [(A, B) $ (a:0) $ 
 
 (A @ - a; (a = B) V * nil, a) ] 
 The set difference A\B of two lists. 
 
 7. Display: <Procedure to print a relation or workspace in 
 
 table form> 
 
 Example 
 
 STAR 
 
 S# SNAME POSITION S CLASS 
 
 01 SUN 202020 G 
 
 10 RIGIL 191816 G 
 
 57 SIRIUS B 162722 F 
 
 8. Subset: [ (A, B) $ (B @ = A)VA ] 
 
 Boolean procedure to check if A is a subset of B. 
 
 9. Sortl: [(table: ref cable, ords) & env $ (ord: %, order: 
 
 %, field: %, table : %) $ 
 (len(ords) . . 1 @ -*■ ord; 
 
 orders ( ord(l)='UP* => '<' / '>' ); 
 
 field*- ord (2); 
 
 table <- Sort (table, order, field) 
 
 ) ] 
 
 Specialized recursive routine to sort a relational table 
 'table' according to an ordering list 'ords' of pairs of 
 the form ( 'UP' | 'DOWN' , aname) . 
 
 10. Sort: [ (table : ref table, order, field) & env $ 
 
 (A:Sort(tl table, order, field) , x: hd table) $ 
 
30 
 
 (len A // < - 1 => A, 
 
 > = 2 => (A. field ! order x.field)@A 
 &(x,)& 
 (A. field ! order x. field) /@A 
 
 )] 
 Recursive routine to sort a table 'table' of records on 
 
 key 'field' with order 'order' = '<' or '>'. 
 
 11. In: [(elt, list) $ (elt = list)v] 
 Boolean procedure to see if 'elt' is in 'list'. 
 
 12. Remove: [(elt, list:ref list) & env $ list «- (elt/ = list).@list] 
 Procedure to remove all occurrences of 'elt' from 'list*. 
 
 References 
 
 [6] Codd, E.F., "A Data Case Sublanguage Founded on the 
 
 Relational Calculus," Proc. 1971 ACM-SIGFIDET Workshop 
 on Data Description, Access, and Control , ACM (1972); 
 also IBM Research RJ893 (1971). 
 
 [7] Codd, E.F., "Seven Steps to Rendezvous with the Casual 
 User," IBM Research RJ1333 (1974). 
 
 [S] Date, C.J., An Introduction to Database Systems , Addison- 
 Wesley (1975). 
 [12] Martin, James, Computer Database Organization , Prentice- 
 Hall, Inc. (1975). 
 
VL 
 
 The VL language was designed at the University of Illi- 
 nois by R.S. Michalski, Richard Schubert and others [13] . 
 Although inspired by ALPHA, it differs in three important 
 ways . 
 
 1. VL is a sublanguage of a particular language, VL1, 
 which is designed to "extend the capabilities of 
 current information systems by including deductive 
 capabilities and introducing inductive capabilities" 
 [13] . The syntax of VL is completely defined in [I3] 
 by a form of extended BNF grammar. 
 
 2 . The conditions permitted in queries are simpler than 
 those of ALPHA, and in some ways more like English. 
 
 3. In contrast to ALPHA and most other query languages, 
 
 where links between attributes must be specified, in 
 
 VL they can be implicit. For example, the ALPHA 
 
 query 
 
 RANGE PLANET P 
 
 GET (STAR.SNAME): 3P(STAR.S# = P.S# A 
 
 P.PNAME = 'MARS') 
 
 can be rendered in VL as 
 
 GET (SNAME) : PNAME = MARS. 
 
 The implicit linkage facility means that queries are 
 
 more natural and less messy, especially for users unfamiliar 
 
 with mathematical logic. On the other hand, a big system 
 
32 
 
 could spend too much time searching for links. This is not 
 a great problem because VL is designed to be used with rela- 
 tively small databases, those which can be kept in core. A 
 more serious problem is the great potential for ambiguity. 
 A "link" between two attributes ("descriptors" in VL 
 terminology) is a sequence of joins of one, two or more 
 relations such that both attributes are included, and there 
 can be more than one way of forming the joins. 
 
 For example, suppose a database contains the relations 
 R(A,B), S(B,C) and T(C,A), with the following contents: 
 
 R 
 
 
 S 
 
 
 T 
 
 
 A 
 
 B 
 
 B 
 
 C 
 
 C 
 
 A 
 
 1 
 
 2 
 
 2 
 
 3 
 
 3 
 
 2 
 
 2 
 
 2 
 
 
 
 3 
 
 3 
 
 3 
 
 1 
 
 
 
 
 
 What is the result of the query 
 
 GET R.A: S.C = 3 ? 
 If we take the obvious join R*S, then we get R.A = 1 or 2. 
 If we take R*T and then T*S, the result is R.A = 2 or 3. If 
 we take R*S*T, we get R.A = 2 only. Finally, if any linkage 
 is permitted, then R.A = 1, 2 or 3. 
 
 The problem of ambiguity vs. complexity recalls some of 
 the problems involved in using the predicate calculus in 
 philosophy and mathematics . Symbolic logic was designed pre- 
 cisely to avoid the ambiguity and lack of precision of natural 
 language, but symbolic logic is cumbersome and very difficult 
 for non-mathematicians to understand. 
 
 This description of VL makes no attempt to mimic any 
 actual implementation of the language. (For example, the VL 
 
33 
 
 system described in [13] stores a great deal of information 
 with descriptor names.) Our purpose is to describe a user's 
 view of VL in terms of SIBYL without also describing the 
 invisible nuts and bolts . 
 
 Our version of VL does not completely coincide with [13] . 
 The grammar has been altered in places, either to make the 
 SIBYL description easier or to eliminate certain constructions 
 that seem semantically inappropriate (in declaring a relation, 
 descriptor names shouldn't have relational qualifications). 
 Descriptions are not provided for INDUCE and DEDUCE state- 
 ments, since their use is peculiar to the VL1 system, or for 
 the implicit linkage convention, because of its complexity. 
 There is some difference of terminology between ALPHA 
 and VL. "Relations" and "domains" or "attributes" in ALPHA 
 correspond to "relational tables" and "descriptors" in VL. 
 
 In contrase to ALPHA, we have provided almost no 
 error- checking facilities. 
 
 Data. There are no tuple variables in VL, workspaces are not 
 an explicit part of the language, and descriptors are not 
 declared separately from relations; therefore the only data 
 type is the relational table. In structure, our SIBYL rela- 
 tion for VL is almost identical to that for ALPHA. A rela- 
 tional table can be in one of four states, which indicate its 
 degree of permanence: 
 
 'tempi' - used within larger expression 
 *temp2' - saved only until end of session 
 'event' - table contains a single "event" tuple 
 'perm' - permanent. 
 
34 
 
 There are two special data structures used in our SIBYL 
 implementation: 'rtabs 1 , a list of all relational tables, 
 and 'chtabs', a list of tables created during a CHANGE 
 statement . 
 
 Basic Forms 
 
 * Desclist -*■ desc (^ desc)* 
 
 ** ^desc (j_ desc)* _J 
 
 * desc: Descriptor ■*■ dname | rtname^dname | dname OF rt name | function 
 ** (Uniqrel ( dname t ) , dname) ] Ortname^dname^ | Ortname^dname )_ | 
 
 (%, function) 
 One type of descriptor found in [l] is not implemented 
 here: the form rtname . . rtname . dname . Each full descriptor (as 
 opposed to a simple name) translates to a relation-descriptor 
 pair. The relation of the desc-pair is retrieved by the 
 procedure Table, and the descriptor name by Dname. 
 
 Examples . 1. V#, VCLASS, VNAME 
 
 2 . PNAME 
 
 3. STAR POSITION 
 
 4. POSITION OF STAR 
 
 5. MAX (STAR. POSITION) 
 
 * Dname list ■* dname (^dname)* 
 ** _(dname (^dname ) * A ) 
 
 * dname: Descriptor name -»■ name 
 
 * rtname: Relational table name ■+ name 
 
 * Event name ■*■ name 
 
 * Newname ■> name 
 
 * Function ■* funcname (desc (.desc)*) 
 
35 
 
 ** funcname ( (Table ( desc ) .value) @ Dname( desc) 
 
 ( , (Table( desc ) .value) @ Dname( desc))*) 
 These functions are similar to those of ALPHA and are 
 properly functions of sets rather than individual tuples. 
 
 * Funcname ■*■ AVG ** [A $ (A+)/len(A)] 
 
 [ MIN ** [A $ A A] 
 
 [ MAX ** [A $ AV] 
 
 [ SUM ** [A $ A +] 
 
 [ COUNT ** [A $ len A] 
 
 1 DOMAIN ** [A $ A] 
 
 We assume that SIBYL has been extended so that A and V 
 
 give the min and max of numbers , and that functions applied 
 
 to lists give lists, not functions of unions of those lists 
 
 Examples . 1. COUNT ( COLONY. CNAME) ( = 6 ) 
 
 2. MAX (MISSION. DATE, COLONY. S#) ( = (2245, 56,) ) 
 
 * Valuelist ■* value (^ value)* 
 ** Rvalue (^value) *;_) 
 
 * Value •* aexpr iset name ? 
 
 * aexpr: Arithmetic expression -*■ aterm| aexpr + aterm| aexpr -_ aterm 
 
 * Aterm -*■ af actor |aterm*afactor | aterm /_ af actor 
 
 * Af actor ■* constant |£aexpr)J z. af actor 
 
 * Constant ■*• <number or string> 
 
 Note: in VL, strings are not surrounded by quotes. 
 The difference between strings and names is given by the con- 
 text . 
 
 * Name -*■ <same as in ALPHA> . 
 
 * ? : Undefined value + % 
 
36 
 
 The character set of VL in [13] is the same as that of 
 ALPHA, and '#' is permitted within names. 
 
 Major Productions 
 
 *0. VL Statement -> create | retrieve | change | delete | save | 
 
 comment | induce | deduce | help 
 
 *1. Create -*■ define-ins |add-ins 
 
 *l.la Define-ins -► DEFINE (RT) ? 
 
 ( rt name_( dname 1 i s 1 1 )_ ( KEY : = dname list 2 )?) "' "END 
 ** ( (rtname : (descs : dname 1 is t]_ »_ 
 
 key ^(dnamelist2)? ' 
 
 status : 'perm' , 
 value : ( ) ) 
 ) .&(env,rtabs) . + (env.rtabs); ) + 
 In a relational table the key is optional. The key 
 fields must be an initial sublist of the list of descriptors. 
 In the VL implementation described in [l3], a relational table 
 is a set of tuples, the first tuple being the descriptor 
 list . The structure above agrees with that used in the last 
 chapter, and simplifies the SIBYL description. 
 
 The second form of definition is used to create "events", 
 or single tuples which do not yet belong to relational tables 
 (actually, events are formally a special kind of relation), 
 while the first creates empty tables. 
 
37 
 
 pi. lb Define-ins ■+ DEFINE EVENT 
 
 (eventname jjl (dname : =value ( , dname : =value)*) )+ 
 END 
 
 (dname (j_dname) * ,) , 
 
 ** ( (eventname: (descs 
 
 status 
 
 value 
 
 ' event ' , 
 
 (( dname rvalue (^dname^, 
 
 value) *) ,)) 
 
 ) .&(env,rtabs) . -» (env,rtabs); )+ 
 
 Examples . 1. DEFINE STAR (S#,SNAME, POSITION, SCLASS) KEY : =S# END 
 2. DEFINE RT PLANET (S#,P#,PNAME,PTYPE) KEY: =S#,P# END 
 3. DEFINE MISSION (S#,V#,DATE,MTYPE) 
 
 VEHICLE (V#,VCLASS,VNAME) KEY:=V# 
 END 
 4. DEFINE EVENT 
 
 S:=(S#: =01, SNAME:=SUN, POSITION: =202020, SCLASS :=G) 
 END 
 
 *1.2 Add-ins ■* ADD TO rtname Orow-cond)? tuplelist END 
 
 ADD eventname TO rtname O_row-cond)? 
 
 | ADD (v alue list) TO rtname Orow-cond)? 
 
 ** (env $ (buffer, insert, v) $ 
 
 buffer <- ( _( tuplelist @-»v; rtname . descs . : v) 
 
 eventname .value 
 
 (rtname . descs . :v aluelist ,) )j_ 
 
 insert «- len( rt name . value) ; 
 
 ( insert <- r ow-cond;) ? 
 
 rtname . value «- rtname . value (1 . . insert) 
 
 & buffer & 
 
 rt name . value (insert+1 . . len( 
 
 rt name . value)) ; 
 
 ); 
 
38 
 
 The variable 'buffer' stores the list of tuples to be 
 inserted, while 'insert' points to the insertion location. 
 Only the first form requires an END. 
 
 *1.2.1 Tuplelist + (£valuelisO) + 
 
 *1 .2.2 Row-cond ■> [ROW <_ (value | LAST )] 
 
 | [ROW > (value [ LAST )] 
 
 ** (value 1 1 en ( rt name . va lue ) ) ^JL 
 
 | (value | len( rtname . value) 
 
 Examples . 1. ADD TO PLANET 
 
 (01, 02, EARTH, E) 
 (01, 04, MARS, M) 
 (01, 06, TITAN, I) 
 (12, 01, GAGARIN, M) 
 (23, 02, PLATEAU, E) 
 (23, 05, KAMPEN, I) 
 (56, 01, CYON, R) 
 END 
 
 2. ADD S TO STAR 
 
 3. ADD S TO STAR : [ROW < l] 
 
 4. ADD (S 15, STARSHIP, SCORPIO) TO VEHICLE: [ROW > LAST] 
 
 *2. Retrieve + get | let 
 
 "2.1 Get ■+ GET (RT) ? (rtname y^) ? rtcond (order) ? 
 ** (env $ tempt ab $ 
 
 tempt ab «- r t condj_ 
 ( Sort 1 (temptab, order) ; ) ? 
 ( (rtname: temptab ) .&(env,rtabs) ,-»(env,rtabs) ; 
 
 rtname . status »• 'temp 2'; ) ? 
 Display ( t emp t ab) ; 
 ) 
 
39 
 
 *2.2 Let * LET (RT) ? rtnarae 2Z rtcond (order) ? 
 
 ** (rtname:rtcond ) . &(env,rtabs) . ->-(env,rtabs) ; 
 
 r t name . status *- t temp2 f ; 
 
 ( Sort 1 ( rtname, order) ; ) ? 
 
 A GET statement always prints the retrieved table but 
 
 need not save it. A LET always saves the retrieval, but 
 
 doesn't print it. As in ALPHA, an ordering on descriptors 
 
 can be specified. 
 
 E xamples . 1. GET RT TEMPI := STAR 
 
 ~ 2. GET TEMPI := STAR 
 
 3. GET TEMPI := STAR ORDER UP ON SNAME 
 
 4. GET STAR 
 
 The first two statements are identical; the permanent 
 table STAR is printed, and stored in the temporary table 
 TEMPI. In #3, the temporary copy of STAR (but not the 
 permanent version) is reordered by increasing star name. In 
 #4, the table is printed but not stored. 
 
 5. LET TEMPI := STAR 
 
 STAR is stored in TEMPI, but nothing is printed. 
 More complicated retrievals will be illustrated as the 
 relational table condition is described. 
 
 *2.3 rtcond: Relational table condition -*■ rt((: | WHERE ) condition)? 
 
 |iset 
 ** rt ( ;(env $ t $ 
 
 L?*- @ "*" t; condition => t) 
 
 i ) ? 
 
 |iset 
 
 The relational table condition actually carries out the 
 retrieval . It can be in one of two forms : one similar to the 
 
40 
 
 ALPHA retrieval; and the image set, which is described below 
 under *2.3.3. 
 
 An rtcond retrieves a temporary relational table, which 
 may itself be used in a nested retrieval. No tuple variables 
 or quantifiers are used or permitted; implicit linkages and 
 image sets can usually be employed instead. In fact, VL is 
 fully as powerful as ALPHA. 
 
 Semantically, the rtcond returns either a relational 
 table consisting of all tuples in the table rt which satisfy 
 'condition 1 , or an image set . 
 
 -''2.3.1 Rt + texpr ( _(dnamelist)_ )? |£desclist) 
 
 ** (descs: ( (texpr . descs ( ; dnamelist ) ? 
 
 Dname desclist ) ) , 
 key : % , 
 status: 'tempi ' , 
 value: (t expr (@ dnamelist) ? )_ 
 
 | (s & env $ (Join (Table desclist ) Q ■» s ; 
 s . Dname desclist )_i 
 i 
 1 
 The table expression 'texpr' is a union, difference or 
 intersection of compatible tables. It is projected onto the 
 fields in 'dnamelist'. If a descriptor list is used with no 
 explicit table in front of it, the join of the implied tables 
 is taken. 
 
41 
 
 Exa mples . 1. MISSION 
 
 2. MISSION (S#,DATE) 
 
 3. (STAR. SNAME, PLANET. PNAME) 
 
 4. (SNAME, PNAME) 
 
 *2.3.1.1 texpr: Table expression ■+ tterm| texpr+tterm| texpr^tterm 
 ** (t expr . value «- ( Union | Setdiff ) (t expr . value , 
 
 tter m. value)) 
 
 * Tterm ■+ tf actor | tterm * tf actor 
 
 ** (tter m. value *- Intersection( tter m. value ,t f actor. 
 
 value)) 
 
 * Tf actor -»■ rtname | O:tcond)_ 
 
 ** ( (R : rt name ) $ R. status «- 'tempi' ;R) | rt cond 
 Examples . Suppose CITY and COUNTRY are two relations with the 
 same attributes as COLONY. Then the following are valid 
 relational table expressions for use in an rtcond. 
 
 1. CITY + COLONY 
 
 2. CITY- COLONY (CNAME,DATE) 
 
 3. COLONY - CITY*NATION (S#) 
 
 4. COLONY + (CITY: [CITY. DATE > 2100]) 
 
 *2.3.2 Condition ■* vl-terrn (OR vl-term)* 
 
 ** vl-term (V vl-term)* 
 
 * Vl-term ■* selector (selector)* 
 ** selector (A selector)* 
 
 * Selector -*■ [desc relop valuelist] 
 
 [rtcond -^ relop rtcond 2] 
 I [valuelist (NOT) ? IN rtcond] 
 
 ** (L ink (t , desc ) @ (! relop valuelist ! (if relop ='N0T=' 
 
 then ' A ' 
 else 'V fi))V 
 
42 
 
 [ RtopC rtcond-, , relop^ rtcond2)_ 
 | (/_) ? In(valuelist , rtcond )A ) 
 Relop + <_ | < = | ^ | NOT = | _^_= | 
 
 A condition is a logical expression in disjunctive normal 
 form. The simplest conditions, or "selectors", can be in one 
 of three forms: (1) A test to see if a descriptor bears a 
 certain relationship to each of several values. (2) A test 
 for subset or equality relationship between two relational 
 tables. Here '<' means 'subset'. The tables can themselves 
 be complicated expressions. In practice, they are often image 
 sets. (3) A test to see if each member of a list is or is 
 not in a given table. 
 
 Examples . 1. [SCLASS = G] 
 
 2. MISSION. DATE > 2100] 
 
 3. PTYPE = E, M] 
 
 4. "(STAR. S#: SCLASS) > = PLANET (S#)] 
 
 5. 02 IN PLANET (S#)] 
 
 6. "A,B NOT IN (SCLASS)] 
 
 7. [SCLASS = G] OR [SCLASS = K] [PTYPE = E,M] . 
 
 : 2.3.3 iset 
 
 Image set ■* (desc. ( : | WHERE ) (desc2 1 desc9 = value2) 
 
 (^_(desc | desc = value))" 2. 
 (Table ( desc- ^ (g ->• x ; 
 
 J_n( (t .Dname( desc2) [value 9) ,Link(x, desc?) ) 
 (A In( (t .Dname(desc ) [value ) ,Link(x, desc )) )i 
 
 =» x . Dname ( de scj) 
 
 ); 
 
 The image set retrieves all values of desc-^ which satisfy 
 a list of two kinds of condition: (1) a join linkage can be 
 
43 
 
 found to a value of desc2 which equals the given value, or 
 (2) a join linkage can be found to a descriptor being retrieved. 
 Thus in the second case image sets can have free variables . 
 Such an image set cannot be a relational table condition; it 
 makes no sense to say GET (MISSION. V#:M1SSI0N. DATE) . 
 Examples . 1. GET (STAR. SNAME: STAR. SCLASS = M) 
 Get all star names whose class is M. 
 
 2. GET (V#): [(MISSION. S#:V#) > = 
 
 (MISSION. S#:VNAME = CODD)] 
 Find the numbers of all vehicles which have visited the 
 same stars as the Codd . 
 
 3. LET T:= (PLANET . PNAME : PTYPE = E, SCLASS = K) 
 Find the names of all E type planets around K-class stars, 
 
 and store in T. This query could also be rendered 
 
 LET T:= PLANET(PNAME) : [PTYPE = E] [SCLASS = K] . 
 
 *2.4 Order -> ORDER (UP [ DOWN ) ON desclist 
 
 ** ( ('UP' | ' DOWN ') ,) & Dname desclist 
 The ordering statement and its SIBYL implementation are 
 the same as in ALPHA, except that only one direction of 
 ordering is permitted. 
 
 *3. Change ■+ CHANGE (RT) ? (rtname -) ? rtcond-,^ 
 
 (_^(rtname := ) ? rtcond)* 
 (change- stmt )* end- change 
 ** chtabs ♦- ( rtcond-. &( source : <unique table in rtcond^>)_ 
 (jjrtcond & (source: <unique table in rtcond>)* J 
 ); 
 
44 
 
 Ortname, j_ rtcond-.)_ (& (rtname : rtcond) )* 
 .& (env,rtabs). ■» (env, rtabs) ; 
 
 ( chang e - s tmt ) * 
 
 end -change 
 The rtconds used must extract tuples from only one 
 existing table. The key fields are not used to implement the 
 change, nor is there any special protection for them. The 
 list 'chtabs' stores the retrieved temporary tables. Altera- 
 tions are made by change- statements to the temporary tables; 
 the permanent tables are updated with an END. Each member 
 of 'chtabs' has an extra field 'source' to record the rela- 
 tion that is to be updated. 
 
 "3.1 Change-stmt ■* dname y^ value (j_ condition)? 
 
 | dname : = newname : [ROW = 0] 
 [ GET rtcond (order) ? 
 | DISPLAY rtname 
 (chtabs @ ->• R; 
 
 if In ( dname , R.descs) then 
 
 (R. value @ •» u; (condition =» ) ? 
 u . dname «- (u . dname , value )_ 
 ) -> R. value 
 
 fi; 
 
 ii 
 
 (chtabs @ -> R; 
 
 if In ( dname , R.descs) then 
 
 Substitute ((dname , newname) , dname , R.descs) ; 
 
45 
 
 (R. value @ -> u; 
 
 u — Subs ti tut e ( newname , dname , dom u) . :ran u 
 ) -*• R. value 
 fi; 
 
 h. 
 
 <retrieve as in GET> 
 Display ( rtname) ; 
 Changes to be made are stored as pairs of old-new values, 
 which replace the old value . There are two basic types of 
 change: change of value of a descriptor, and change of a 
 descriptor name. In each case, all tables in 'chtabs' are 
 affected. The CHANGE statement cannot be used to add tuples 
 to a relation. 
 
 -3.2 End-change -> ABORT | END 
 
 ** chtabs <-( ); | Update ; chtabs *- ( ) ; 
 
 Examples. 1. CHANGE Tl := PLANET, COLONY 
 
 DISPLAY Tl 
 
 P# := 03 : [PNAME = PLATEAU] 
 
 GET (PLANET. P#, COLONY. P#): [PNAME = PLATEAU] 
 END 
 
 The number of Plateau is changed to 03. The DISPLAY 
 
 prints the retrieved table Tl, while the GET prints the P# 02 
 
 retrieved from old, unchanged tables. The END updates those 
 
 tables . 
 
 2. CHANGE STAR 
 
 SCLASS := STYPE [ROW = 0] 
 ABORT 
 
 The user changes the SCLASS field name to STYPE, but then 
 
 changes his mind. The required condition [ROW = 0] reflects 
 
46 
 
 the fact that descriptor names are stored in the Oth row of a 
 table in the standard VL implementation. This condition 
 syntactically distinguishes the first two kinds of change 
 statement . 
 
 *4. Delete + DELETE (RT) dcond (^dcond)* 
 ** dcondj_ (dcond j_)* 
 
 -4.1 Dcond ■♦ rtname (£dnamelist)_ ? ((: | WHERE ) condition)? 
 | rtname 
 ** (rtname . value @ -> t ; 
 
 if T (A condition) ? then 
 t (. dnamelist^)? ■*- nil 
 else t 
 fi 
 ) ->• rtname .value; 
 | rtname +- %; 
 The DELETE statement erases fields of tuples or whole 
 tuples from a table, or, in its simplest form, destroys the 
 table. 
 Examples . 1. DELETE MISSION 
 
 The relation MISSION is destroyed. 
 
 2. DELETE MISSION (MTYPE) 
 
 Remove the descriptor MTYPE from MISSION. 
 
 3. DELETE STAR: [S# =11], COLONY 
 
 Delete star #11 from STAR, and destroy the COLONY table. 
 
 4. DELETE STAR (S CLASS ) : [S# = 11] 
 Erase the class of star #11. 
 
47 
 
 In [13], arbitrary rtconds are permitted in the delete 
 list . 
 
 *5. Save + SAVE rtname 
 
 ** rtname .status *- 'perm'; 
 A temporary table is made permanent . 
 Example . SAVE TEMPI. 
 
 *6. Comment -> COMMENT string END 
 
 *7 . Deduce 
 
 *8 . Induce 
 
 These last two statements are an essential part of the 
 VL1 system, but are highly specialized features of the VL 
 language . Since they have no counterpart in other query 
 languages and are not fully described in [l3],we have chosen 
 to leave them out . 
 
 *9 . Help -> HELP nonterminal or instruction . 
 
 The nonterminal is one which occurs in the original VL 
 
 grammar of [13] . The result is a printout of a production 
 
 rule or description. 
 
 Examples . 1. HELP GET 
 
 2. HELP <RTCOND> 
 
 Procedures 
 
 1. Sortl: <same as ALPHA Procedure 9>. 
 
 2. Display : <same as ALPHA Procedure 7>. 
 
48 
 
 3. Table: [desc $ desc(l)] 
 
 4. Dname: [desc $ desc (2)] 
 
 The last two procedures extract the associated table and 
 descriptor names from a descriptor pair. 
 
 5. Uniqrel: [dname & env $ (rel:0) $ 
 
 (rtabs ■* R; 
 
 ( R.descs == dname then 
 
 ( rel = => rel «- R , 
 
 rel/= (0,7oM => rel «- % ) 
 
 ) 
 ) 
 
 rel] 
 The single argument dname is a unary list. Uniqrel 
 returns % if dname is associated with more than one relation. 
 
 6. Join: [tlist & env $ J $ 
 
 (len tlist // < = 1 =*tlist(l), 
 > = 2 => 
 ((A:hd tlist ,B:Join(tl tlist))& env $(x > y,d,J)$ 
 d *- Intersection (A.descs, B.descs) 
 J-*- (descs: Union (A.descs , B.descs) , key:7o, 
 
 status: 'tempi', value: ( ) ); 
 (A @ ^x; B (§ -y y; 
 
 if (x.d = y.d) then 
 J. value <*- x & y 
 fi 
 
 );J 
 
 )) 
 
 ] 
 
49 
 
 The procedure Join recursively computes the join of a 
 list of relational tables. 
 
 [(A,B)& env $ Rectify (A & B)] 
 [(A,B)& env $ x $(A @ + x; In(x,B) => x)] 
 [(A,B)& env $ x $(A @ - x;/In(x,B) => x)] 
 [(x,A) $ y $ (A @ - y; x == y)v] 
 [(x,y,A:ref A) $ a $(A @ ■+ a; a==y «* x,a)] 
 [(A:ref a) $ (x:%, B: ( ) ) $ 
 (A @ - x; x /= % => 
 
 /In (x,B) =* (B«-x) ); 
 A ■«- B 
 
 7. 
 
 Union 
 
 8. 
 
 Intersection 
 
 9. 
 
 Setdiff 
 
 10. 
 
 In 
 
 11. 
 
 Substitute 
 
 12. 
 
 Rectify 
 
 13. Subset 
 
 ] 
 
 [(A,B) & env $ x $ 
 
 (A @ + x; In(x, B))A 
 
 ] 
 
 We assume that '==' is defined to mean 'identically 
 
 equal'. By using this operator, the above set operations 
 will work properly for any kind of list . In ALPHA we needed 
 set operations only on lists of primitives. 
 
 14. Rtop: 
 
 [(A, op, B) & env $ 
 
 (op // = '<' =» Subset (A,B) A /Subset (B,A), 
 
 = '<=' =* Subset (A,B) 
 
 = ' = '=» Subset (A,B) A Subset (B,A) , 
 
 ='/=' =» /Subset (A,B) V /Subset (B,A), 
 
50 
 
 = '>=' =* Subset (B,A) 
 
 = '>' => Subset (B,A)A/ Subset (A,B) 
 
 )] 
 The procedure 'Rtop' converts set expressions with rela- 
 tion operators into set relationships. 
 
 15. Update: [(chtabs @ ■* R; 
 
 C 'first check if any descriptors have been changed' 
 (1. .len R.descs @ ■> n; R.descs (n) ■+ d; 
 if d = list then 
 
 Substitute (d(2),d(l), R. source. descs) ; 
 (R. source, value @ ■*■ x; 
 
 x *- Substitute (d(2) ,d(l) , dom x) . : ran x 
 ) ■*■ R. source .value; 
 C 'replace d by its 2nd entry' 
 
 R. descs (n) <- d(2) 
 fi 
 
 ); 
 
 C 'update values in R. source' 
 (R. value @ ■+ x; 
 R. source, value @ ■*■ y; 
 R.descs @ ■* d; 
 
 if x.d = list A x.d(l) = y.d then 
 
 y.d *- x .d(2) 
 fi 
 ) 
 ] 
 
51 
 
 16. Link: [(x,d) & env $ 
 
 if In (Dname(d), dom x) then 
 
 (x.Dname(d) , ) 
 else 
 
 <form links > 
 fi 
 
 ] 
 If x is a tuple and d a descriptor, then Link(x,d) is 
 
 the set of all values of d for which join-links can be con- 
 structed from the entire tuple x. 
 
 Reference 
 
 [13] Schubert, R. N., "The VL Relational Data Sublanguage for 
 an Inferrential Computer Consultant," Univ. of Illi- 
 nois Dept. of Computer Science UIUC DCS-R-77-846 
 (1977). 
 
SQUARE 
 
 The SQUARE sublanguage (Specifying Queries As Relational 
 Expressions) reflects a different philosophy on query 
 language design: that queries should resemble as far as 
 possible the way people use tables . It was developed after 
 ALPHA but before VL [2 , 3] - Like VL, it is less cumbersome 
 than the predicate calculus of ALPHA, but unlike VL it makes 
 some use of free variables and none of implicit linkages. 
 Among its noteworthy characteristics may be listed the 
 following: 
 
 1. SQUARE heavily uses "relational expressions", which 
 may be the objects of queries or which may be 
 embedded in more complicated expressions . 
 
 2. The most common building block in relational expres- 
 sions is the "mapping", which mimics human usage of 
 a table. 
 
 3. As with VL, there are no provisions for file 
 protection. 
 
 4. The original SQUARE uses an unusual two-dimensional 
 syntax for easy exposition, which is reproduced here. 
 This feature is discussed in [3] . A more conven- 
 tional syntax has been developed into the closely 
 related language SEQUEL [4] . 
 
 Let us examine a typical mapping. Suppose we want to 
 
53 
 
 retrieve the names of all planets around star number 01. 
 This is rendered 
 
 1. PLANET (01) 
 PNAME S# 
 
 If the relation PLANET is imagined in table form, each domain 
 
 is a column of the table; in fact, "column" is the SQUARE 
 
 terminology [3] for "attribute" or "domain" in ALPHA or 
 
 "descriptor" in VL. To perform (1) manually, one looks up 
 
 all 01 entries in the S# column ("source" column) , then 
 
 retrieves the corresponding entries under PNAME (the "target" 
 
 column) . 
 
 For a slightly more complex retrieval, suppose we want 
 
 the numbers and names of all colonies around star 56 which 
 
 were founded after 2300: 
 
 2. COLONY (56, > 2300) 
 C#, CNAME S#, CDATE 
 
 Each source must match its corresponding argument. The 
 
 comparison operator in the second argument illustrates the 
 
 tendency in SQUARE to put a lot of information in argument 
 
 lists. 
 
 Our syntax for SQUARE is based on that given in [3] . 
 There is no error checking. 
 
 Data . Relational tables, or sets, and variables are the 
 only two data types directly involved in handling relations . 
 Neither require system information fields, so a permanent 
 relation is just a list of records. 
 
 Formally, variables fall into two classes: "free vari- 
 ables", analogous to ALPHA tuple variables, and "computed 
 
54 
 
 variables", which have values assigned to t) ; however, 
 there is no nee' to make a sharp distinction .^tween them. 
 
 Although i ianent relations consist of records 
 ("tuples"), a "set" or "relational expression", which is what 
 a mapping or other query retrieves, is a list of lists with 
 column names removed. For example, if R(A,B) presently con- 
 tains the tuples (A:0, B:15) and (A: 3, B:76), then the 
 expression A, B has the value ((0, 15,), (3, 76,),). To 
 keep track of column names we use the special variable 
 'columns'. When the expression tuples are assigned to a 
 computed variable, the fields of 'columns' are added to make 
 records. Of course, it would be possible to define the 
 SIBYL translation of SQUARE so that tuples are always 
 records . 
 
 Note that "sets" are lists of tuples, not lists of 
 primitive values. 
 
 The symbols + and 4- in the grammar indicate whether 
 following terminals are to be written on, respectively, 
 the upper or lower line . 
 
 Basic Forms 
 
 *aexpr: Arithmetic expression ■+ aterm aexpr + aterm aexpr aterm 
 
 * Aterm ■+ afactor aterm *_ afactor aterm /_ afactor 
 
 ** aterm * afactor 
 
 * Afactor -*■ const fexpr _(aexpr)_ afactor 
 
 
55 
 
 *fexpr: Function expression ■*■ t fnarae_(sexpr)_ **fname ^sexpr @ 1) 
 
 * Fname ■+ AVG ! MAX | MIN | SUM | COUNT 
 
 ** <same as VL functions > 
 
 Example. COUNT ( PLANET (56) ) 
 P# S# 
 
 The number of planets around star 56. 
 *acop: Arithmetic comparison operator "** = I *M £ I £ I i I 5. 
 
 ** 
 
 /= 
 
 <= 
 
 >= 
 
 Column list + cname (^ cname)* **j(cname^, cname V*^) 
 Column name ■*■ name 
 Computed variable -* name 
 Variable name -*■ name 
 Relation name -»■ name 
 
 * Name ■*■ <standard name, but see below> 
 Neither [2] nor [3] have '#' in names, but we permit it in 
 order to make use of the example database. Traditionally, 
 SQUARE variable names are lower case [3,11] . 
 
 * collist 
 
 * cname: 
 
 * compvar 
 
 * vname: 
 
 * rname: 
 
 tlist 
 
 clist 
 
 Tuple list ■+ tuple Ctuple)* 
 
 Tuple •*■ <clist> 
 Constant list -* const (, const)* 
 
 **(tuple (j_tuple) * x ) 
 
 ** clist 
 
 ** ( const ( , cons t ) * ) 
 
 /( const: Constant ■+ num|str 
 
 * num: Number -> <standard number> 
 
 * str: 
 
 * ul 
 
 * + 
 
 String ■+ <standard string, in quotes> 
 
 Underline -*■ <underline next name> 
 
 Upper line -*■ <write following characters on upper line> 
 
56 
 
 * + : Lower line ■> <write on lower line> 
 
 The SQUARE character set of [2] and [3] has many more 
 special symbols than ALPHA or VL. Examples: lower case 
 letters, n, u, 0, t, + , <-, +. 
 
 Set Expressions 
 
 "0. sexpr: Set expression ■* sterm sexpr U sterm sexpr 2. sterm 
 ** sterm Union( sexpr, sterm)_ 
 
 Setdiff ( sexpr, sterm)^ 
 
 * Sterm -*■ sfactor sterm J\_ sfactor 
 
 ** sfactor Intersection ( sterm, sfactor^ 
 
 * Sfactor -* rexpr | (sexpr )_ 
 
 * rexpr: Relational expression ■*■ [tlist] + compvar (i 
 
 collist) ? proj mapexpr 
 *' V columns •<-(); tlist ( ) 
 
 columns + collist \_ compvar ( @ 
 
 collist) ? 
 proj mapexpr 
 
 As mentioned before, the set expression is the backbone of 
 SQUARE . 
 Exampl es . 1. [< 'P078*, 'PROBE*, 'PI0NEER'>, <'P323\ 'PROBE'," >] 
 
 A "constant" relational expression, consisting of two 
 tuples, perhaps to be used in an insertion. 
 
 2. 
 
 3. x 
 
 SNAME, SCLASS 
 
57 
 
 4 . STAR 
 SNAME, SCLASS 
 
 Ex. (3) and (4) give the same result if the table STAR 
 
 has been stored in x. (4) is the projection of STAR onto two 
 
 of its columns . 
 
 5. VEHICLE ('STARSHIP') 
 V# VCLASS 
 
 6.x e VEHICLE: x = 'STARSHIP' 
 V# VCLASS 
 
 (5) is similar to mapping (1) discussed in the introduc- 
 tion to this chapter. As a query, (6) is the same as (5), but is 
 expressed with a free variable. In the syntax of [3] (and in 
 ours) , expressions with free variables may be used only as 
 queries, and not as part of more complicated expressions. 
 Thus (6) is not a "relational expression", although the 
 syntax could be extended to make it one. 
 
 7. STAR u PLANET- 
 SNAME PNAME 
 
 A set expression, the union of two relational expres- 
 sions . 
 
 *0.1 pro j : Projection ■> 4- trglist f rname 
 
 ** domains -«- trglist j_ rname @ trglist 
 
 This is the simplest way of retrieving from a table. 
 
 To retrieve all of STAR we must list all its columns: 
 
 STAR 
 S#, SNAME, POSITION, SCLASS 
 
 *0.2 mapexpr: Mapping expression ■+ mapping (o mapping)'' + _(arglist)^ 
 
 ** mapping (mapping)* £arglist2. 
 
58 
 
 "0.2.1 mapping: Mapping ■* + trglist + rname ( J.) + srclist 
 
 ** [ args & env $ 
 
 domains *■ trglist j_ 
 ( Rectify | ) _(rtname @ -*■ x ; 
 
 Agree ( trglist, rname , x, Cnames ( srclist) 
 Type ( srclist ) , arg s ) 
 =» x. trglist 
 I 
 
 "0.2.1.1 trglist: Target list ■+ collist 
 
 "0.2.1.2 srclist: Source list ■* srccol (^srccol)* 
 
 ** ^srccol (, srccol)*, ) 
 *0. 2. 1.2.1 srccol: Source column ■* cname ul cname 
 
 ** _( ( 'dis ' 'con' ) _^ cname )_ 
 "0.2.2 arglist: Argument list ■»■ arg ( x arg)* 
 
 -" Urg (j_arg)* mL 
 "0.2.2.1 arg: Argument •+ ( acop) aexpr sexpr 
 
 ** _^ ( = acop) aexpr _^ sexpr 
 
 The mapping is the commonest SQUARE retrieval. 
 
 Examples . 1. VEHICLE ('STARSHIP') 
 
 V# VCLASS 
 
 The set of numbers of all vehicles of STARSHIP class. 
 
 2. COLONY (56, <2300) 
 
 C#, CNAME S#, CDATE 
 
 The numbers and names of all colonies around star 56 
 
 founded before 2300; the result is (( 58, 'PRIME',),). Here 
 
 C#, CNAME is the "target list" of the mapping, the list of 
 
59 
 
 columns to be retrieved. Next is the table COLONY. Then 
 come the "source list" S#, CDATE and the "argument list" 
 56, <2300, which are matched. 
 
 This last part of the mapping can become quite compli- 
 cated, as is shown by the intricacy of the procedure Agree, 
 which in our implementation performs the actual comparisons. 
 Arguments may be simple values or arithmetic expressions, 
 values with built in comparison operators, or complete set 
 expressions. In the latter two cases, there is defined to 
 be a match if the source columns of a tuple match any value 
 of the set; this is known as a "disjunctive mapping". 
 
 3. COLONY (^2300, 
 CNAME DATE, S# 
 
 STAR ('PROXIMA')) 
 
 S# SNAME 
 
 The names of all colonies around Proxima which were 
 founded in or after 2300. If the date requirement is 
 dropped, the new expression can be written as the composite 
 of two mappings : 
 
 4. COLONY o STAR ('PROXIMA') 
 
 CNAME S# S# SNAME 
 
 Larger tuples may be used in sets in the argument list, 
 in which case source columns are compared simultaneously: 
 
 5. COLONY (>2005, 
 CNAME CDATE, S#, P# 
 
 PLANET ('MARS')) 
 S#, P# PNAME 
 
 The names of all colonies on Mars founded after 2005 
 
 (result = ( (BELLONA , ) , ) ) . Both star and planet numbers must 
 
 be specified. Note that LANG matches in the S# column but not 
 
 the P# column, and is therefore not retrieved. 
 
60 
 
 In a composition the target and source lists need not 
 have the same names . This is convenient , since MDATE and 
 CDATE are different. 
 
 6. COLONY o MISSION ('S12') 
 CNAME S#, CDATE S#, MDATE V# 
 
 The names of colonies founded by vehicle S12. 
 
 One can also require that source column entries match 
 all tuples of a set. This is called a "conjunctive" mapping, 
 and is indicated by underlining the column name. 
 
 7. MISSION ( STAR ('A')) 
 V# S# S# CLASS 
 
 The numbers of vehicles which have visited all A-class 
 stars. More precisely, if m is a tuple in MISSION, then 
 (7) retrieves m.V# if and only if the argument set above is 
 included in the set [x.S# : x.V# = m.V#] of S# which corres- 
 pond to m.V#. 
 
 In other words, if S is a set of tuples, the query 
 
 R (S) 
 
 B A 
 
 is equivalent to 
 
 x <i R : S 5 R (x ) . 
 B ABB 
 
 Conjunctive and disjunctive columns can be combined. 
 
 Let D be a set of dates and S a set of star numbers. 
 
 8. MISSION (D, S) 
 V# MDATE, S# 
 
 The numbers of vehicles for which there is a single year 
 
 in D in which the vehicle visits all stars in S. 
 
 9. MISSION (S, D) 
 V# S#, MDATE 
 
61 
 
 The numbers of vehicles that have visited all stars in S 
 at some time in D, not necessarily all at the same time. 
 Note the difference between (8), which is more restrictive, 
 and (9) . In [2] and [3] only mappings of type (8) are pos- 
 sible; those written like (9) are rearranged semantically to 
 mean the same as (8). This seems unnecessarily restrictive. 
 
 Finally, we can request that redundant elements not be 
 
 eliminated. 
 
 10. GET COUNT ( MISSION' (2210)) 
 
 V# MDATE 
 
 The value is 2, not 1. We want the number of missions 
 
 run in 2210; but both were by S09. The expression (10) 
 
 retrieves ((S09,) , (S09,) ,) instead of ((S09,),) because of 
 
 the ' after the relation name. 
 
 Statements 
 
 * SQUARE transaction -*■ (statement 0* 
 
 * Statement -»■ query asgn create insert delete 
 
 destroy update 
 
 *1. query: Query ■* t ( GET ) ? (sexpr aexpr varlist ( + j_ 
 
 test) ? ) 
 ** temp ■*• ( sexpr aexpr | 
 
 (domains *■ Cols (v arlist) ; 
 Cprod (Rels (v arlist )) @ 
 
 . ->• Freevars (v arlist) ± ( test =» ) ? 
 
 Vars (varlist) @ Cols (varlist) 
 
62 
 
 1 
 
 )i 
 
 Display (t emp, domains ) ; 
 
 The GET retrieves an expression and displays it without 
 
 assigning it to anyone. 
 
 Examples . 1. GET VEHICLE ("STARSHIP') 
 
 V# VCLASS 
 
 2. VEHICLE ('STARSHIP') 
 V# VCLASS 
 
 3. v +■ VEHICLE ('STARSHIP'); 
 
 V# VCLASS 
 GET v 
 
 4. GET x 6 VEHICLE: x = 'STARSHIP'. 
 
 V# VCLASS 
 
 5. GET s : s + COUNT ( VEHICLE ('STARSHIP')) 
 
 V# VCLASS 
 
 The first four are all equivalent . Example (4) is 
 reminiscent of ALPHA. In the syntax of [3] , such expressions 
 can only be used as the object of a query. Example (10) 
 under *0.2 shows an arithmetic expression in a GET. 
 
 Free variables are those used in ALPHA-like targetlists. 
 Computed variables have values assigned to them, just as in 
 common procedural languages. In the examples above, v and s 
 are computed, while x is free. 
 
 *1.1 varlist: Variable list + var (j_var)* 
 
 ** Cvar (, var)*, ) 
 * var: Variable ■+ freevar compvar 
 
 ** freevar (dom compvar @ ->- d; 
 
 jtcompvar^ d , ( ) ) 
 ) 
 
63 
 
 * 
 
 freevar: Free variable ■*■ + vname _£_ rname i vname + 
 
 collist 4 (_£_) ? rname 
 ** (. ( dom vname (1) | collist) @ ■» d; 
 
 _Cvname^_ d^ rname^ 
 1 
 compvar: Computed variable ■*• t vname 
 
 *1.2 
 
 test 
 
 Test ■*• lterm test V lterm 
 
 * Lterm ■*- lfactor | lterm A lfactor 
 
 * Lfactor -*■ arithcomp modcomp setcomp asgn 
 
 j(test)_ | 2l lfactor 
 ** | / lfactor 
 
 *1.2.1 arithcomp: Arithmetic comparison ■+ t aexpr + acop t aexpr 
 *1.2.2 modcomp: Modified comparison -»■ + aexpr t acop + mod t sexpr 
 
 t sexpr t mod + acop f aexpr 
 + sexpr -^ t modi + acop + mod^ + 
 
 sexpr« 
 ** _(aexpr acop J^sexpr @ 1) ) mod 
 1 ((sexpr @ 1) acop aexpr), mod 
 1 ( (sexpr ^ @ 1) @ acop ^sexpr^ @ 1)) mod, mod 2 
 Modifier ■* ALL SOME 
 ** 
 
 *mod 
 
 A 
 
 V 
 
 *1.2.3 setcomp: Set comparison -*- t sexpr-, + scop + scop^ 
 
 ,Wc Setcomp ( sexpr, , 'scop' , scop«)_ 
 
 *scop: Set comparison operator -*■ = £ Z* z £ Z 
 ** = | /= | > | >= | < | 
 
 A test is a boolean expression containing a free variable; 
 it has the same role as the ALPHA "quantified expression" or 
 
 <= 
 
64 
 
 "predicate". Numbers or sets can be compared. Assignments 
 may be embedded in tests, where they return a value of "true". 
 
 One novelty which saves effort is the use of quantifiers 
 ("modifiers") around arithmetic operators. Since all sets 
 are finite, no ambiguity is possible when two quantifiers 
 are used. 
 
 Example . [<1>, <3>, <4>] ALL < SOME [<3>, <6>] 
 
 The value is T. 
 
 In [2] and [3] modified operators are defined in terms 
 of the MAX and MIN functions. 
 
 *2. asgn: Assignment ■*• + ( LET ) ? (rname (+ collist +) ? ♦ 
 
 query compvar ♦ aexpr) 
 ** ( (rname : %) % env -*• env ; 
 
 rname •«- ( query @ ■» q ; 
 
 domains (j_ collist) ? . : q 
 
 1 
 
 I compvar + aexpr 
 ) ; T 
 Assignment can be used either to create temporary rela- 
 tional tables, or to assign arithmetic values to computed 
 variables. The results are not printed. 
 Examples . 
 
 1. 
 
 LET TEMPTA3 * STAR 
 SNAME 
 
 2. 
 
 TEMPTAB * STAR 
 SNAME 
 
 3. 
 
 GET S STAR, x: S 
 
 = 'G' A 
 
 SNAME SCLASS 
 
 x +■ COUNT ( PLANET (s )) 
 P# S# S# 
 
65 
 
 Ex. (1) and (2) are equivalent. (? N retrieves the names 
 
 and number of recorded planets of G-clas:? stars. The result 
 
 is 
 
 SUN 3 
 RIGIL 1 
 TAU CETI 1 
 
 *3. create: Creation -»-++_ rname + cname ( M cnarne)* 
 
 ** (rname : ( (dn ame : % (j_ dname : %> ) * ) ) ) 
 
 & env -> env; 
 
 Example . + PLANET 
 
 S#, P#, PNAME, PTYPE 
 
 A permanent relational table is created. A dummy tuple 
 
 is inserted to save the column names . 
 
 *4. insert: Insertion ■*• + +_ rname + collist t Jjclist)^ 
 
 ** rname <+. collist ^j_ clist j_ 
 
 Example . 4- PLANET (01, 06, 'TITAN', 'I') 
 
 S#, P#, PNAME, PTYPE 
 
 *5. delete: Deletion •> f +_ rname I collist + _(clist)_ 
 
 ** Orname @ •» x; 
 
 if rname ^ collist ■*• clist then 
 Remove (x , rname)_ 
 
 fi; 
 
 I h. 
 
 Example . 1. t PLANET (01, 06, 'TITAN', 'I') 
 
 S#, P#, PNAME, PTYPE 
 
 2. t PLANET (01) 
 S# 
 
 Example (1) deletes the tuple that was inserted in the 
 
 last example, while (2) deletes the records of all planets 
 
66 
 around star 01. 
 
 *6. destroy: Destruction ■*■++, rname ** rname *■ %; 
 Example . + PLANET 
 
 The entire table PLANET is destroyed. 
 
 *7. update: Update -*■ t + rname 4- collist j_ ulist t J[arglist: x clist)_ 
 ** (1. .len rname @ -> n ; x *■ rname (n) ; 
 
 (x. collist @ ! arglist ) =» 
 (x & env $ (1. . len ulist @ -*• k; 
 
 rname (n) ! (u list k) clist (k) 
 ) 
 
 1 
 
 In an UPDATE the source list is divided into two parts. 
 Those tuples in the relation whose 'collist' entries match 
 the corresponding arguments are updated according to the 
 update list 'ulist' and its arguments. Arithmetic operators 
 may be put in the update list . 
 
 Examples . 1 . ■* VEHICLE ( ' CODD ' , ' DOLPHIN * ) . 
 
 VNAME; VNAME 
 
 Change the name of the Codd to Dolphin . 
 
 2. + PLANET (23, 1) 
 
 S#; P# + 
 
 Add 1 to the numbers of planets around star 23. 
 
 The syntax of [3] permits tuples in the update argument. 
 
67 
 
 Procedures 
 
 1. 
 
 Cprod 
 
 2. 
 
 Display 
 
 3. 
 
 In 
 
 4. 
 
 Intersection 
 
 5. 
 
 Rectify 
 
 6. 
 
 Remove 
 
 7. 
 
 Setcomp 
 
 8. 
 
 Setdiff 
 
 9. 
 
 Union 
 
 10. 
 
 Agree: [(T, 
 
 <same as ALPHA Procedure 5> 
 
 <print tuples with given header > 
 
 <VL Proc. 10> 
 
 <VL Proc. 8> 
 
 <VL Proc. 12> 
 
 [ x & (A: ref A) & env$A +■ Setdiff (A, (x,))] 
 
 <or ALPHA Proc 12 with '=«' instead of '='> 
 
 <VL Proc. 14, Rtop> 
 
 <VL Proc. 9> 
 
 <VL Proc. 10 > 
 
 [(T, R, x, S, type, args) & env $ 
 
 (a, m, n, s, y) .: (%, 0, 0, %, %) $ 
 (args @ ■* a; m.«- (mVn) + 1; 
 if a /= list then 
 
 x . A ! a 
 else 
 
 n +■ m + len a(l) ; s •*■ x . S (m . . n) ; 
 if type (m) = 'dis' then 
 
 In (d, a) 
 else 
 
 In (a, (R @ -> y; 
 
 y . T == x . T =* 
 y . S (m . . n) 
 
 ) ) 
 
 fi fi 
 
68 
 
 )A 
 
 ] 
 
 T = target list of mapping 
 
 R = relation 
 
 x = tuple being tested 
 
 S = column names in source list 
 type = types in source list ('dis' or 'con') t 
 args = argument list 
 
 \ 
 
 [srclist $ srclist @ 1] 
 [srclist $ srclist @ 2] 
 [varlist $ varlist @ 2] 
 [varlist $ varlist @ 3] 
 [varlist $ varlist @ 1] 
 [varlist $ (varlist @ ■> v; v(3)/= ( ) => v(l))] 
 
 11. 
 
 Type: 
 
 12. 
 
 Cnames : 
 
 13. 
 
 Cols: 
 
 14. 
 
 Rels: 
 
 15. 
 
 Vars : 
 
 16. 
 
 Freevars 
 
 V 
 
 References 
 
 [2] Boyce, R. F., D. D. Chamberlin, W. F. King, and M. M. 
 Hammer, "Specifying Queries as Relational Expres- 
 sions," Proc ACM S1GPLAN/S1GIR Interface Meeting 
 (1973), pp. 31-40. 
 
 [3] , "Specifying Queries as Relational 
 
 Expressions: the SQUARE data sublanguage," Proc 
 1974 ACM SIGF1DET Workshop , Ann Arbor, Mich. (1974), 
 pp. 249-264. 
 
69 
 
 [4] Chamberlin, D. D., and R. F. Boyce, "SEQUEL: A struc- 
 tured English query language," Comm ACM , vol. 18 
 (1975), pp. 621-8. 
 [ll] Lacroix, Michel, and Alain Pirotte, "Example queries in 
 relational languages," Technical Note N107, MBLE 
 Research Laboratory, Brussels, Belgium (1978). 
 
CONCLUSIONS 
 
 This report has used the semantic definition language 
 SIBYL to study three relational database query languages by 
 providing a syntax-directed translation into SIBYL for each 
 one. Writing it has made clear some of the problems, 
 advantages, and disadvantages of the languages, of the 
 translation method of semantic definition, and of SIBYL 
 itself. 
 
 The Languages . 
 
 1 . ALPHA . The entries ("targets") used in target lists 
 are of different types, whereas logically only tuple variables 
 are permissible. For example, we can have 
 (X.S#, Y, STAR, PLANET. P#) as a target list, where X and Y 
 are tuple variables, STAR and PLANET are relations, the outer 
 two targets are qualified by attribute names, and the inner 
 two are not . The advantage of this inelegance is that target 
 lists become more like natural language. 
 
 To use relation names as tuple variables is not consis- 
 tent, and requires extra run-time structure (each relation 
 has a built-in tuple variable). Again, this makes ALPHA more 
 English-like . 
 
 A more serious problem is the independent declaration 
 of tuple variables and the use of implicit quantification, 
 
71 
 
 as in 
 
 RANGE STAR X 
 
 GET W (PLANET. PNAME): X.SNAME = 'SUN' 
 
 X.S# = PLANET. S# 
 
 Not only must the implementation keep track of declared tuple 
 
 variables, but in translating a quantified expression, it 
 
 must find out which variables are implicitly quantified by 
 
 examining the predicate and target list for free variables. 
 
 To the extent that ALPHA resembles the predicate 
 
 calculus, it is simple, elegant and easy to translate. At 
 
 the same time it is clumsy and hard for untrained users to 
 
 understand. 
 
 2. VL. As mentioned in the VL chapter, implicit 
 linkage, which is VL's most distinctive feature, is both a 
 great advantage and a potentially serious problem: it makes 
 certain queries very simple, but adds the possibility of 
 semantic ambiguity. 
 
 A syntactic ambiguity occurs with image sets: (A:B) 
 is the set of all A corresponding to B, while (A:B=5) is 
 the set of all A such that B = 5. 
 
 Some users may be confused by the absence of quotes 
 around strings . 
 
 3 . SQUARE . This language's two-dimensional syntax is 
 not suited for computers, but alternative syntaxes are easy 
 to construct and have already been developed. This pecu- 
 liarity does make SQUARE very easy to read, which was in 
 fact the original intent . 
 
72 
 
 The source and arguments lists of mappings can become 
 quite complicated, and thereby account for most of the diffi- 
 cult features of our implementation. Perhaps some way of 
 breaking these structures into smaller and more manageable 
 pieces can be devised. On the other hand, these lists are 
 rather easy to read, and are fairly close to the mathematical 
 meaning if arguments are interpreted as defining sets. 
 
 The simple mapping B^A^S) is simple both to understand 
 and to implement. It mimics the way people use tables. 
 Mathematically, it can be written Tr^TT* _i (S) , where tt„ denotes 
 the projection of R onto the column (domain) list K. Thus 
 SQUARE seems based on functions rather than the logical 
 predicate calculus. 
 
 Translation. 
 
 Our method of syntax-directed translation into SIBYL ran 
 into four kinds of problems. 
 
 1. Peculiarities of SIBYL. These are not serious . For 
 example, a unary list in SIBYL must have a trailing comma, 
 so a typical list like 
 
 * Namelist -» _£ name ( _^ name) * )_ 
 must be translated 
 
 ** j[ name (j_ name) * , ) 
 
 2. Context sensitivity in the object language. Since 
 our grammar is context free, context sensitivity appears in 
 the semantics , where it must be handled by the SIBYL transla- 
 tion (for example, declaration and use of tuple variables in 
 
 
73 
 
 ALPHA, handled by run-time structures). This problem seems 
 fundament ally insoluble. 
 
 3 . Complexity of language structures (e.g., comparisons 
 involving sets in VL and SQUARE) or of the SIBYL implementa- 
 tion (the VL implicit linkage) . This is to a certain extent 
 unavoidable, since we often want languages to be complicated. 
 In many cases complexity is due to one of the other three 
 problems. When writing the SIBYL code, one must try to make 
 it as readable as possible. 
 
 4. Unsuitable, inconsistent, or inelegant syntax. 
 Some syntax problems can be easily solved by simply rewriting 
 the grammar (for example, using dnamelists instead of 
 desclists in VL relation declarations). The problems are 
 more serious when the syntax doesn't accurately reflect the 
 semantics at an important point (an example: the ALPHA 
 relation name, which can act either as a bona fide relation, 
 or as tuple variable taking values in that relation) . This 
 leads to complicated translations of the offending form, in 
 order that the form can be used in all of its ways . There 
 are several possible solutions: 
 
 (1) Break the form up; e.g., 'form' becomes ' forml' 
 and 'form2', with separate translations. 
 
 (2) Translate the form into enough structure to take 
 care of all uses , then perhaps apply procedures in each 
 separate case (e.g., the SQUARE ' srclist 1 ). This was our 
 usual strategy. 
 
74 
 
 (3) Permit more than one translation of a form. For 
 example , 
 
 * Name ■> 
 
 ** 1 
 
 ** 2 ; ; ; ; ; 
 
 Later we might have 
 
 * Widget ■* ... name . . . 
 
 ** . . . name**l . . . 
 
 and also 
 
 * Nubbin -* ... name . . 
 ... name /wx z 
 
 We did not use this method, but it seems promising 
 
 SIBYL 
 
 In general, SIBYL worked very well in semantic defini- 
 tion. It is powerful and flexible enough to translate most 
 object language features, and has good facilities for using 
 a brute force approach when all else fails (procedures and ! 
 before strings). Nevertheless, improvements can be made, to 
 which end we suggest the following extensions, some of which 
 are used in this report: 
 
 1. Define ~ to be a "super-equality" operator which 
 never distributes over lists; A==B is true if and only if A 
 and B have identical values . 
 
 2. Permit names to be a special data type, or otherwise 
 allow coercion of names to their values to be suppressed. 
 This is possible now by putting names in lists or procedure 
 brackets, but perhaps something less messy can be thought up. 
 
 3. Allow do . . . @ . . . od for loops of the form 
 
75 
 
 4. Add more record processing facilities. For example, 
 if R = (A: 10, B:20), then R(l) might be defined as (A: 10) . 
 
 5. Extend _V_ and _A to give the maximum and minimum of 
 two numeric values . 
 
 6. If SIBYL is extensively used in language definition, 
 someone should design a package of routines for this purpose. 
 It might also be useful to find a way of building semantic 
 level numbers into strings; e.g. 1' DELETE 1 or 1DELETE for a 
 nonterminal, 0' = ' or 0= for a terminal, and 2 1 -*- 1 or 2+ for a 
 symbol used in the grammar. 
 
 7. It would be useful to find a way to simulate 
 alternate modes of indexing in SIBYL. For example, in SIBYL, 
 single elements, rows, and columns of a 2-dimensional array 
 are referenced by expressions of the form A(i)(j), A(i) , and 
 (A@j) respectively. In many languages expressions of the 
 form A(i,j), A(i,%), and A(%,j) are standard. These could 
 
 be used in SIBYL by making A a procedure that references an 
 array-type data area. 
 
 Comments . 
 
 These language descriptions provide constructive defini- 
 tions: if a working SIBYL interpreter is available, these 
 definitions can be used to implement the languages (under the 
 assumption, of course, that errors have been removed). Of 
 great help to any such project would be the availability of a 
 package of routines designed to help in language definitions , 
 as was discussed above. For example, such a package should 
 include standard set operations. 
 
76 
 
 SIBYL did a good job of defining languages, and the 
 grammar proved to be both flexible and clear. In both cases 
 some improvements can be made, as detailed in the sections 
 above. The actual translation process took about two weeks 
 for each language. The first step was to obtain a syntax; 
 then the right sides of productions were translated into 
 SIBYL. By this time it became clear that certain changes 
 should be made to the syntax, and the process was repeated. 
 Finally the procedures were designed, and again some modifica- 
 tions of earlier work became desirable . In this way some 
 of the most complicated language features were redesigned as 
 many as three times. Under these conditions, the grammar 
 helped by naturally structuring the problems, and the 
 versatility of SIBYL proved sufficient for all translation 
 tasks . 
 
 The great bulk of the real problems in translating these 
 three languages were caused by a small percentage of the 
 language forms. It should be possible to decrease the 
 amount of SIBYL code by over a third by eliminating only a 
 few of the language features (for example, by simplifying 
 SQUARE argument lists). This is important, because it has 
 been estimated that the vast majority of queries actually used 
 in a real -world environment are simple [3] . A useful query 
 language can therefore be implemented (perhaps for demonstra- 
 tion or experimental purposes) much more cheaply than, say, 
 all of ALPHA. 
 
BIBLIOGRAPHY 
 
 [lj Aho, Alfred V., and Jeffrey D. Ullman, The Theory of 
 Parsing, Translation, and Compiling, Vol. I: 
 Parsing , Prentice-Hall, Inc. (1973). 
 
 [2] Boyce, R. F., D. D. Chamberlin, W. F. King, and M. M. 
 Hammer, "Specifying Queries as Relational Expres- 
 sions," Proc ACM SIGPLAN/SIGIR Interface Meeting 
 (1973), pp. 31-40. 
 
 [3] , "Specifying Queries as Relational Expres- 
 sions: the SQUARE data sublanguage," Proc 1974 
 ACM-SIGFIDET Workshop , Ann Arbor, Mich. (1974), 
 pp. 249-64. 
 
 [4] Chamberlin, D. D., and R. F. Boyce, "SEQUEL: A struc- 
 tured English query language," Comm ACM 18 (1975), 
 pp. 621-8. 
 
 [5] Codd, E.F., "A Relational Model of Data for Large 
 Shared Data Banks," Comm ACM 13 (1970). 
 
 [6] , "A Data Base Sublanguage Founded on the 
 
 Relational Calculus," Proc 1971 ACM-SIGFIDET 
 Workshop on Data Description, Access and Control , 
 ACM (1972); also IBM Research RJ893 (1971). 
 
 [7J , "Seven Steps to Rendezvous with the Casual 
 
 User," IBM Research RJ 1333 (1974). 
 
78 
 
 [8] Date, C. J., An Introduction to Database Systems , 
 
 Addison-Wesley (1975). 
 [9] Kampen, G. R., "A Formal Definition of the SIBYL Pro- 
 gramming Languages," Univ. of Illinois Dept . of 
 Computer Science, UIUCDCS-R-77-852 (1977). 
 
 [10] , "A SIBYL Tutorial" (available from the 
 
 author, Dept. of Computer Science, Univ. of Illinois) 
 
 [ll] Lacroix, Michel, and Alain Pirotte, "Example queries 
 in relational languages," Technical Note N107, 
 MBLE Research Laboratory, Brussels, Belgium (1978). 
 
 [12] Martin, James, Computer Database Organization , Prentice- 
 Hall, Inc. (1975). 
 
 [13] Schubert, R. N. , "The VL Relational Data Sublanguage 
 for an Inferrential Computer Consultant," Univ. 
 of Illinois Dept. of Computer Science, UIUCDCS-R- 
 77-846 (1977). 
 
BIBLIOGRAPHIC DATA 
 SHEET 
 
 1. Report No. 
 
 UIUCDCS-R-78-936 
 
 3. Recipient's Accession No. 
 
 4. Title and Subtitle 
 
 A TRANSLATION OF RELATIONAL DATABASE 
 QUERY LANGUAGES INTO SIBYL 
 
 5. Report Date 
 
 July 1978 
 
 6. 
 
 7. Author(s) 
 
 8- Performing Organization Rept. 
 
 David W. Dawson 
 
 N 
 
 9. Performing Organization Name and Address 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 61810 
 
 10. Project/Task/Work Unit No. 
 
 11. Contract/Grant No. 
 
 12. Sponsoring Organization Name and Address 
 
 13. Type of Report & Period 
 Covered 
 
 14. 
 
 15. Supplementary Notes 
 
 16. Abstracts 
 
 This report contains a formal description and comparison of three very 
 different query languages designed for use with relational databases: Codd's 
 ALPHA language, Michalski's VL, and SQUARE. Each language is defined by a 
 syntax-directed translation schema that maps queries into semantically 
 equivalent statements in SIBYL, an experimental programming language. 
 
 The report includes an informal summary and evaluation of each of the 
 query languages, of the translation approach to language definition, and 
 of SIBYL as a tool for the formal definition of semantics. 
 
 17. Key Words and Document Analysis. 17a. Descriptors 
 
 syntax, formal semantics, query language, relational data base, programming 
 language, language definition 
 
 17b. Identifiers/Open-Ended Terms 
 17c. COSATI Field/Group 
 
 
 
 18. Availability Statement 
 
 19. Security Class (This 
 Report) 
 
 UNCLASSIFIED 
 
 21. No. of Pages 
 
 
 20. Security Class (This 
 Page 
 
 UNCLASSIFIED 
 
 22. Price 
 
 FORM NTIS-35 ( 10-70) 
 
 USCOMM-OC 40329-P7 1 
 
wnv 1 J 1978 
 
AUG itft.