Digitized by the Internet Archive in 2013
http://archive.org/details/tacostabledriven325gaff

Report No. 325

TACOS: A TABLE DRIVEN COMPILER-COMPILER SYSTEM*

by John Lawrence Gaffney, Jr.

June, 1969

Department of Computer Science
University of Illinois
Urbana, Illinois 61801

* This work was supported in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center under Contract No. US AF 30(602)-4144 and submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, June, 1969.

ACKNOWLEDGEMENT

I wish to express sincere gratitude to Professor C. W. Gear for his guidance and many helpful suggestions during the preparation of this thesis, and also to Professor J. R. Phillips, who first suggested the system as a thesis topic. Additional thanks go to Mr. P. A. Alsberg for his aid in the design of the metalanguage. Furthermore, I am indebted to Mrs. Coni Allen, who did such a fine job of typing this report, and to my patient wife, Angie, who spent many lonely hours as a "thesis widow."

ABSTRACT

This thesis is a description of and specification for TACOS, an interpretive compiler-compiler system employing a recursive-descent parsing algorithm.
In its current implementation in PL/I on the IBM SYSTEM/360, a modified BNF grammar and PL/I semantic routines provide the specifications for compiler generation.

The author has intended that TACOS be a general purpose compiler-generation system independent of implementation. To this end, the metalanguage and parsing algorithm are presented from a specification rather than an implementation point of view. In contrast, the semantics are regarded as too strongly tied to the implementation language to adhere to a general specification, and are, therefore, discussed in relation to the current PL/I implementation.

TABLE OF CONTENTS

1. INTRODUCTION
   1.1 On Compiler-Compilers
   1.2 Non-Predictive Analysis
   1.3 Predictive Analysis
   1.4 Earlier Work
   1.5 Objectives
2. SYSTEM OVERVIEW
   2.1 TACOS Origin
   2.2 TACOS Structure
3. THE METALANGUAGE
   3.1 Phrase Structure Grammars and BNF
   3.2 TACOS IBNF
       3.2.1 General Form
       3.2.2 Repetition Characters +, *, ?
       3.2.3 Identifier <*I>, Integer <*N>, and String <*S> Symbols
       3.2.4 Semantic Test <#n>
       3.2.5 Parenthesized Definitions
   3.3 IBNF Restrictions
   3.4 SYNXTAB: The Syntax Table
4. THE SCANNER/PARSER
   4.1 Introduction
   4.2 TACOS Parsing Algorithm
   4.3 The Scanner
5. SEMANTIC SPECIFICATION
   5.1 Introduction
   5.2 Linkages to the Parser
   5.3 The Semantic Procedure
6. DISCUSSION
   6.1 Implementation
   6.2 On Object Code Generation
   6.3 On Error Recovery
   6.4 Extensions
APPENDIX
   A. PL/I IMPLEMENTATION OF 'PARSCAN' MODULE
   B. TACOS IBNF SYNTAX SPECIFICATIONS TO GENERATE 'BUILD' MODULE
   C. SEMANTIC ACTION PROCEDURE FOR 'BUILD' MODULE
   D. SAMPLE OUTPUT FROM GENERATED 'BUILD' COMPILER
   E. TACOS JOB CONTROL LANGUAGE AND FORM OF SEMANTIC SPECIFICATION
LIST OF REFERENCES

LIST OF TABLES

1. Values of δ according to IBNF item type
2. TACOS scanning routines and values returned

LIST OF FIGURES

1. TACOS block structure as implemented
2. Syntax of IBNF written in IBNF
3. Structure of the TACOS syntax table SYNXTAB
4. Generated compiler operation
5. TACOS parsing algorithm
6. Flowcharts of the TACOS scanning routines
7. General form of the action procedure ACT

1. INTRODUCTION

1.1 On Compiler-Compilers

A recent innovation in compiler design for programming languages has been the use of compiler-compiler systems (also termed compiler-generators or translator writing systems). The syntactic and semantic definitions of the language are input to such a system, generating a compiler which will parse the language source code according to its syntax and produce object code via the semantic specification.

In general, compiler design flexibility is the primary attribute of compiler-generator systems. Whereas changes in the syntax recognition process of hand-coded compilers may be tedious, modifications in a generated compiler may be effected more easily through a change in the language specifications followed by a compiler regeneration. In fact, the objective in the design of TACOS (acronym for "table driven compiler-compiler system") was to provide an easy-to-use system with relatively fast compiler-generation time. This goal having been realized, TACOS is ideally suited to aid in compiler design. It may be used first to help debug the syntax specifications of the language as well as the associated semantic routines. Later, a hand-coded version of the compiler may be written to both increase compilation speed and improve source language error recovery.

Two approaches are now common in the generation of parsing algorithms for the object compiler. The first is to produce a machine code parser directly from the syntax specification, commonly via assembly language, ALGOL, PL/I, etc.
The output is, therefore, itself a syntax-directed parser with appropriate linkages to the semantic routines. The alternate approach, as taken in TACOS, is the generation of a table which contains the information specified in the syntax. A fixed interpretive parser is provided which performs the scanning and semantic action routine calls according to the contents of the table.

Each approach has its disadvantages, however. The interpretive system tends to be slower in source language compilation, whereas its table generation is rather fast. In contrast, the machine code parser usually necessitates an intermediate assembly or compilation, thus slowing down compiler-generation time. Naturally, however, the machine language parser is faster than the interpretive algorithm.

Another differentiation between the types of generated parsing algorithms is the method by which the input source code is analyzed. In terms of recognizing the terminal symbols of the language, they are either "predictive" or "non-predictive" as to what is to be scanned at any particular time during compilation.

1.2 Non-Predictive Analysis

Non-predictive analysis implies that, at each point in the parse, the type or value of the symbol(s) previously scanned uniquely determines which symbol(s) may follow according to the syntax specification of the language. In this sense there is no "guessing" involved. Exhaustion of the alternative symbols indicates a syntax error in the source language, whereas recognition yields a new "state" of the parse, and thus, a new set of next possible symbols. "Bounded context" or "bottom-up" translation are other terms applying to non-predictive methods.

The primary disadvantage with this type of parsing algorithm is that the generation of the encoded syntax is usually tedious, if not difficult.
Metalinguistically "nice" notations for syntax specification such as Backus Naur Form (BNF) [2] in general do not lend themselves to transformation into the "operator precedence matrices" or "Floyd Production Language" (FPL) [3] commonly used with non-predictive algorithms. Once the syntax is encoded, however, non-predictive parsers are inherently faster and have source language error-checking built into the algorithm. As TACOS employs a "predictive" parser, no more will be said about non-predictive methods.

1.3 Predictive Analysis

Predictive parsing algorithms, such as the one generated by TACOS, use a "trial and error" approach (a "top-down" or "recursive-descent" method) in scanning the source language. At any point in the parse, the algorithm attempts to recognize the alternate constructs in the input stream. However, failure to do so does not necessarily indicate a source language error. Rather, symbols previously recognized may be re-examined in terms of satisfying a different construct, and eventual recognition yields a successful parse.

It is evident that top-down parsing involves a "back up and try again" procedure for each construct not recognized. This scanning inefficiency has two manifestations: slower compilation time and the lack of suitable error recovery for illegal source language constructs. The compiler never knows, in general, when an error has occurred unless it is specified in the language syntax as such.

Compiler-generators such as TACOS which produce top-down parsing algorithms normally incorporate a rather simple encoding of the language syntax specification in contrast to bounded context schemes. Thus, top-down compiler generation is usually faster than bottom-up, a definite advantage where syntax debugging is the objective.

1.4 Earlier Work

Since their inception and first implementation in the early 1960's, compiler-generators and the related syntax analysis have gained in both sophistication and usage.
Brooker and Morris were among the first to design and implement a compiler-compiler system (1960, 1961). Utilizing a top-down parsing algorithm, they provided a very general system in that both the syntax definition statements and semantic routines were written in an essentially machine-independent notation.

In contrast to the top-down approach found in the Brooker and Morris system, a 1961 paper by Irons [7] provided an analysis of a bottom-up compilation scheme for ALGOL 60. Incorporation of the semantic routines was not discussed, whereas the transformation of the syntax specification into its encoded form was covered in some detail. Again, the encoding problems of non-predictive analysis were evident.

Warshall and Shapiro provided a very concise description of a top-down compiler-generator system in their 1964 paper [8]. It provided a very "clean" notation for syntax specification with a modified BNF as well as understandable semantics using an ALGOL-like language. An improvement over the Brooker and Morris system was thus realized.

Of perhaps the most relevance to TACOS was a 1964 paper by Cheatham and Sattley [9], wherein a general survey of top-down syntax-directed compiling techniques is given and a specific predictive analyzer proposed. A number of TACOS characteristics and features find their origins in this analyzer, particularly in the specification of the metalanguage.

1.5 Objectives

This thesis will be concerned with both the algorithm and implementation of TACOS. Section 2 will give an overview of the system in relation to its beginnings as a syntax-directed recognizer and its evolution into a usable system. The TACOS metalanguage and its transformation into encoded form will be dealt with in Section 3, whereas Section 4 will be concerned with the generated compiler's parsing algorithm and scanning routines.
Provision for incorporating the semantic routines into the compiler and the necessary linkages involved will be introduced in Section 5. Section 6 will discuss TACOS as implemented on the IBM SYSTEM/360 with conclusions directed toward possible extensions of the system.

2. SYSTEM OVERVIEW

2.1 TACOS Origin

A term project for CS306, a course on operating systems and compiler theory taken by the author at the University, was the implementation in PL/I of a recursive top-down syntax recognition program proposed by Gear. The program involved the transformation of a modified BNF into a linked-list table by which the recognizer would parse an input character string into its components and invoke action routines (semantics) at appropriate points in the parse. This procedure was not suitable, however, for any practical compilation applications in that it was, in the terminology of IBM OS/360, a "compile-linkedit-go" program. Changes in either the syntax structure or semantics necessitated a complete compiler regeneration.

A seminar course taken soon thereafter by the writer provided the opportunity to expand the recognizer into a usable system which was to be available to class members for the design and implementation of a compiler for a generalized mathematical language. Since it was anticipated that each student would be responsible for the syntax and semantics of a certain portion of the "math language" translator, the primary goal in the expansion of the recognizer was to facilitate changes in both syntax and semantic specifications. The obvious approach to doing so was to modularize the system, as illustrated in Figure 1.

[Figure 1. TACOS block structure as implemented. Blocks shown: syntax specification (IBNF); syntax table builder (BUILD); linked-list syntax table (SYNXTAB); semantic action routines (ACT), written in PL/I; compiler; intermediate action module; parser/scanner module (PARSCAN); linkage editor; language source code; generated compiler; language object code. Arrows denote "is input to" and "generates".]

2.2 TACOS Structure

BUILD is a compiler (itself generated by TACOS) which accepts the syntax specification data (IBNF, as described later) and creates from it a linked-list table (SYNXTAB) recognizable by the parser. This data set is placed on backup storage until needed.

Semantic action routines (ACT), written in a high-level language (PL/I), are input to the compiler for generation into an intermediate object module. The linking program inputs this, as well as the previously compiled "recognizer" (named PARSCAN), and produces a load module which is the generated compiler.

At compiler execution time, SYNXTAB is read into the generated compiler, parsing proceeds on the language source input, and the semantic routines create the appropriate object code.

This modular structure allows the compiler-designer to effect changes in the syntax by a simple regeneration of SYNXTAB via BUILD and in the semantics by a recompilation and linking of the action routines.

3. THE METALANGUAGE

3.1 Phrase Structure Grammars and BNF

Chomsky is generally credited with the formalization of the phrase structure grammars by which most languages are formally defined. In general, such a grammar (G) is defined as a quadruple:

$G = \langle V_T, V_N, S, P \rangle$, where

$V_T = \{ t_i \mid 1 \le i \le n \}$ = the set of all terminal symbols of the language,
$V_N = \{ q_i \mid 1 \le i \le m \}$ = the set of all non-terminal symbols of the language,
$S \in V_N$ = the "root" non-terminal symbol, and
$P$ = a set of productions by which the structure of the language is defined.

Restrictions on the form of these productions determine the formal class of the grammar. Most programming languages can be defined in terms of "context free" (CF) grammars, which adhere to the following restrictions on the form of the production $\xi \rightarrow \eta$:

1) $\xi$ can be only a single $V_N$, and
2) $\eta$ is a (non-empty) sequence of $V_T$ and/or $V_N$.
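The quadruple and the two context-free restrictions above can be made concrete with a short sketch. This is an illustration only, written in Python rather than the thesis's PL/I; the `Grammar` class, the `is_context_free` function, and the toy grammar are hypothetical names introduced here and do not come from TACOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """Hypothetical container for the quadruple <V_T, V_N, S, P>."""
    terminals: frozenset      # V_T
    nonterminals: frozenset   # V_N
    root: str                 # S
    productions: tuple        # P, as (left_side, right_side) pairs


def is_context_free(g: Grammar) -> bool:
    """Check the root and the two context-free restrictions on each production."""
    if g.root not in g.nonterminals:
        return False
    symbols = g.terminals | g.nonterminals
    for left, right in g.productions:
        if left not in g.nonterminals:            # 1) left side is a single non-terminal
            return False
        if len(right) == 0:                       # 2) right side is non-empty ...
            return False
        if any(s not in symbols for s in right):  # ... and uses only known symbols
            return False
    return True


# A toy grammar (an assumption for illustration, not taken from the thesis).
toy = Grammar(
    terminals=frozenset({"a", "+"}),
    nonterminals=frozenset({"sum", "term"}),
    root="sum",
    productions=(
        ("sum", ("term", "+", "sum")),
        ("sum", ("term",)),
        ("term", ("a",)),
    ),
)
```

Representing each right side as a tuple keeps the order of symbols, which is what a parser needs; a production whose right side is empty, or whose left side is not a single non-terminal, is rejected.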
BNF provided a standardization in CF notation and has since been universally accepted as one of the best means of syntax specification for programming languages. It was this which prompted the author to choose it as the basis for the TACOS metalanguage.

3.2 TACOS IBNF

3.2.1 General Form

BNF in its pure form is unacceptable as input to most top-down compiler-generation systems, TACOS included. As BNF requires the explicit naming of all phrase classes, a "large" syntax specification can become quite lengthy. The most serious problem with BNF, though, is that left-recursive phrase class definitions can cause infinite looping of predictive parsers. Certain modifications in BNF were thus justified for the design of the TACOS syntax specification.

In that the TACOS syntax language is input to an interpretive parsing algorithm, the author has chosen to denote this an "interpretive" BNF (IBNF). It retains the same form as the pure BNF but allows parenthetical expressions to shorten the specification as well as additional notations to circumvent the left-recursion problem.

In reference to Figure 2, production (1), the IBNF syntax is defined (in its own syntax) as one or more productions, i.e. phrase class definitions, followed by the character string "END". It is immediately obvious that IBNF terminal symbols are written as literals enclosed in single quotes. This notation eliminates ambiguities with the IBNF metasymbols (::= | ; ( ) # * ? + <*I> <*N> and <*S>). A single quote within a literal is represented by two adjacent single quotes.

[Figure 2. Syntax of IBNF written in IBNF.]

Again in reference to Figure 2, an IBNF production (2) is defined as the same as in BNF except for the semicolon delimiter, which allows for a free-format syntax specification. The phrase class definition (5) consists of one or more alternatives, separated by "or" ("|") characters, as in BNF. Each of these alternatives (6) is defined as one or more items followed by the optional designation of a semantic routine to be invoked by the parser after all items of that alternative have been successfully recognized. Production (7) defines the possible IBNF items within an alternative, which are themselves defined in (3), (10), and (11).

3.2.2 Repetition Characters +, *, ?

The "+" encountered in production (1) is a "repetition character" (9) denoting that the phrase class must be recognized at least once and may be recognized repeatedly. In other words, an IBNF specification must contain at least one production. This is a convention adopted to specify repetitive constructs while avoiding left-recursive phrase class definitions. Thus:

(BNF)   <a> ::= <b> | <a> <b>

is written

(IBNF)  <a> ::= <b>+ ;
In the same vein, a "*" repetition character (as implemented by Cheatham and Sattley) indicates that a phrase class may appear zero or many times, consistent with the Kleene star (*) notation for regular languages. Therefore:

(BNF)   <a> ::= <empty> | <a> <b>

becomes

(IBNF)  <a> ::= <b>* ;

This repetition character applied to parenthesized expressions is particularly useful in defining sequences of the same construct with intervening delimiters. As an example, Figure 2, production (5),

<definition> ::= <alternative> ( '|' <alternative> )* ;

defines <definition> to be one or more <alternative>s separated by "or" characters by utilizing the "*" notation.

Finally, the (Brooker and Morris) repetition character "?" indicates a possibly empty phrase class. Therefore:

(BNF)   <a> ::= <b> | <empty>

is written

(IBNF)  <a> ::= <b>? ;

3.2.3 Identifier <*I>, Integer <*N>, and String <*S> Symbols

A metalinguistic shortcut was included in TACOS in an attempt to reduce both the volume of the syntax specification and the parsing time in the generated compiler. As identifiers, unsigned integers, and literal strings are uniformly defined for nearly all programming languages, these items were included as intrinsic terminal symbols (11) recognizable in an IBNF specification. The Cheatham and Sattley paper [9] provided these symbols in their proposed system, although the current IBNF notation is credited to Trout [12].

Use of these symbols causes the TACOS parsing algorithm to invoke recognition routines that interface with the input source language.

i) <*I> causes the identifier recognition routine to be invoked; syntactically, <*I> = <letter> ( <letter> | <digit> | '_' )*, where <letter> is any of the alphabetic characters and <digit> is 0|1|2|...|9. The underscore "_" is due to the present PL/I implementation.

ii) <*N> signals the parser to attempt recognition of an unsigned integer. Thus <*N> = <digit>+.

iii) <*S> represents a literal string consisting of any sequence of characters (including none) enclosed in single quotes. Two adjacent single quotes within <*S> are used to represent a single imbedded quote.
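The three intrinsic symbols can be illustrated with a minimal Python sketch of one recognizer each. It is not the PL/I code of the thesis: the function names, the `(value, new_position)` return convention, and the restriction to ASCII letters are assumptions made here. Leading blanks are skipped because Section 4.2 states that blanks outside literal strings are ignored.

```python
import string

LETTERS = set(string.ascii_letters)   # assumption: ASCII letters only
DIGITS = set(string.digits)


def skip_blanks(text, pos):
    """Advance past blanks, which carry no meaning outside literal strings."""
    while pos < len(text) and text[pos] == " ":
        pos += 1
    return pos


def scan_identifier(text, pos):
    """<*I>: a letter followed by letters, digits, or underscores."""
    pos = skip_blanks(text, pos)
    start = pos
    if pos >= len(text) or text[pos] not in LETTERS:
        return None
    pos += 1
    while pos < len(text) and (text[pos] in LETTERS or text[pos] in DIGITS or text[pos] == "_"):
        pos += 1
    return text[start:pos], pos


def scan_integer(text, pos):
    """<*N>: one or more digits."""
    pos = skip_blanks(text, pos)
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        return None
    return text[start:pos], pos


def scan_string(text, pos):
    """<*S>: characters in single quotes; '' inside stands for one quote."""
    pos = skip_blanks(text, pos)
    if pos >= len(text) or text[pos] != "'":
        return None
    pos += 1
    chars = []
    while pos < len(text):
        if text[pos] == "'":
            if pos + 1 < len(text) and text[pos + 1] == "'":
                chars.append("'")       # doubled quote: one imbedded quote
                pos += 2
                continue
            return "".join(chars), pos + 1   # closing quote
        chars.append(text[pos])
        pos += 1
    return None   # no closing quote found
```

Each function returns `None` on failure and leaves the caller's position untouched, which mirrors the "success" or "failure" indication the recognition routines give the parser.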
The length of the character strings that can be recognized is, of course, implementation-defined.

3.2.4 Semantic Test <#n>

Production (10) is of particular interest in that the use of the semantic test is closely allied with the generation of a more efficient compiler. Its function is to call semantic routine n before all of the items within an alternative have been parsed and to allow this routine to terminate the parsing of that alternative. This is a very desirable feature after recognition of <*I>'s has taken place. The semantic test routine would, for instance, reference a name table to check the predefined attributes of the <*I> just recognized. If they conflict with the intended goal of that parse, the semantic routine flags the parser to terminate that parse and to go attempt another. This greatly reduces the number of unsuccessful parses and in doing so increases compilation speed and efficiency.

If a normal semantic action is to be invoked after successful recognition of an alternative, it is specified at the end of the alternative (Figure 2, productions (6), (8)).

It is to be noted that no conditional or normal semantic routines have been explicitly specified in the IBNF syntax of Figure 2. The reader is referred to Appendix B for sample inclusions of these routines.

3.2.5 Parenthesized Definitions

The BNF restriction of having to explicitly name all phrase classes places a certain burden on the compiler-writer when specifying any "large" syntax. As a desirable alternative, IBNF allows parenthetical expressions within phrase class definitions which may be nested to any level. Therefore, the (BNF) construct

<x> ::= <b> | <c> | <d>

may be expressed in IBNF as

<x> ::= ( <b> | <c> | <d> ) ;

without having to generate an additional phrase class name (say a) and write it as

<x> ::= <a>
<a> ::= <b> | <c> | <d>
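Section 3.4 notes that BUILD handles a parenthesized definition by assigning it a "dummy" phrase class name, so that the table looks as if every phrase class had been named explicitly. The following Python sketch shows that idea on a simplified in-memory grammar. The representation (a dict from phrase class name to a list of alternatives, with a nested list standing for a parenthesized group), the function name `hoist_groups`, and the generated names `<dummy1>`, `<dummy2>`, ... are all assumptions made for illustration; repetition characters on groups are not modelled.

```python
def hoist_groups(productions):
    """Replace each parenthesized group by a generated dummy phrase class.

    `productions` maps a phrase class name to a list of alternatives.
    Each alternative is a list of items; an item is either a string
    (a symbol) or a nested list of alternatives (a parenthesized group).
    """
    flat = {}
    counter = 0

    def flatten(alternatives):
        nonlocal counter
        result = []
        for alternative in alternatives:
            items = []
            for item in alternative:
                if isinstance(item, list):        # a parenthesized group
                    counter += 1
                    dummy = f"<dummy{counter}>"   # hypothetical naming scheme
                    flat[dummy] = flatten(item)   # the group becomes its definition
                    items.append(dummy)
                else:
                    items.append(item)
            result.append(items)
        return result

    for name, alternatives in productions.items():
        flat[name] = flatten(alternatives)
    return flat


# <x> ::= ( <b> | <c> | <d> ) ;
grouped = {"<x>": [[[["<b>"], ["<c>"], ["<d>"]]]]}
named = hoist_groups(grouped)
```

After the call, `named` holds two entries: `<x>` defined as the single item `<dummy1>`, and `<dummy1>` defined as the three alternatives from inside the parentheses. Because `flatten` is recursive, groups nested to any level are hoisted the same way.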
In IBNF, a parenthesized expression is syntactically equivalent to an item within an alternative (Figure 2, production (7)). It may, therefore, be followed by a repetition character (see 3.2.2) which now applies to the entire construct within the parentheses. Notice that this construct is itself equivalent to an entire right-hand side of a production. In this way, the designer is saved the trouble of creating a new phrase class name, as in the above example.

3.3 IBNF Restrictions

Due to the nature of both IBNF and the TACOS parsing algorithm, several restrictions are placed on the form of the syntax specification. Although left-recursive phrase classes are syntactically allowable, they can cause infinite looping of the parser. More subtle, but just as pathological, is implicit left-recursion such as:

<a> ::= <b> <c> ;
<b> ::= <a> <d> ;

where phrase class <a> is left-recursive through <b>. In the current TACOS implementation, all instances of left-recursion are detected in the BUILD module and each phrase class involved is flagged.

If the input source language is to be parsed correctly, it is essential that the longest alternatives having identical beginning items be placed first in the production. Although proper ordering is not required for valid syntax definition, the parser demands that the longest alternatives be placed first so that it will not "quit too soon."

Example: An IBNF production defining the possible PL/I relational operators should be written

<relational operator> ::= '<=' | '>=' | '=' | '<' | '>' | '¬=' ;

rather than

<relational operator> ::= '=' | '<' | '>' | '<=' | '>=' | '¬=' ;

although either is syntactically correct. As will be seen in the following section, the parser attempts recognition of the input stream (source language) character-by-character beginning with the first alternative in the production. Therefore, if '<' had been specified "before" '<=' (as in the second production above) and the input character stream is something like

<= COS (THETA)...

, an incorrect parse will result since the "<" will cause successful recognition of the '<' alternative of <relational operator>. Unfortunately, the "=" will not be parsed in the correct manner, giving anomalous results. In general, an IBNF production should be written such that the "longest" parses are attempted first.

3.4 SYNXTAB: The Syntax Table

As previously discussed, BUILD is a TACOS module which transforms the IBNF syntax specification into a table, SYNXTAB, which the generated compiler uses to control its parsing mechanism. SYNXTAB (see Figure 3) is actually a set of three linked tables:

1) a PHRASE CLASS NAME TABLE,
2) a LINKED-POINTER ITEM TABLE, and
3) a TERMINAL SYMBOL TABLE.

[Figure 3. Structure of the TACOS syntax table SYNXTAB. The figure shows the PHRASE CLASS NAME TABLE (referenced from δ for non-terminal symbols), the LINKED-POINTER ITEM TABLE with its ALTERNATIVE HEADER and ITEM ENTRY records, and the TERMINAL SYMBOL TABLE (referenced from δ for terminal symbol items). Legend: α = phrase class name; A = action routine number; δ = (defined in Table 1); τ = item type code; ρ = terminal symbol; λ = null pointer.]

Table 1. Values of δ according to IBNF item type.

ITEM TYPE    δ IN ITEM ENTRY                     TYPE CODE (τ)
<α>          → α IN PHRASE CLASS NAME TABLE      0
<α>?         → α IN PHRASE CLASS NAME TABLE      1
<α>+         → α IN PHRASE CLASS NAME TABLE      2
<α>*         → α IN PHRASE CLASS NAME TABLE      3
<*I>         λ                                   4
<*N>         λ                                   5
'ρ'          → ρ IN TERMINAL SYMBOL TABLE        6
<*S>         λ                                   7
<#n>         n                                   8

→ = "A POINTER TO"; δ = LEFT POINTER OF ITEM ENTRY; α = PHRASE CLASS NAME; ρ = TERMINAL SYMBOL; n = INTEGER; λ = NULL POINTER.

Tables 1) and 3) contain an entry for each non-terminal $V_N$ and terminal symbol $V_T$ specified in the syntax, and table 2) acts to define the syntactic constructs which tie 1) and 3) together. It is, in essence, a direct mapping from P, the production set.

The primary constituent of table 2) is an entry for each item (Figure 2, production (7)) encountered in an IBNF alternative.
Its structure is:

ITEM ENTRY:  | LEFT POINTER (δ) | TYPE CODE (τ) | RIGHT POINTER |

The contents of LEFT POINTER (δ in Figure 3) and TYPE CODE (τ) for each item type are given in Table 1. As BUILD compiles the IBNF specification, it generates such an ITEM ENTRY for each item in the production.

Since several alternatives may constitute a phrase class definition, a second type of entry termed an "ALTERNATIVE HEADER" is necessary in table 2) to point to each alternative's item list and to specify the action routine to be invoked if successful recognition of the alternative has taken place. Therefore, BUILD creates the following entry for each alternative in an IBNF production:

ALTERNATIVE HEADER:  | FIRST ITEM | ACTION ROUTINE (A) | NEXT ALTERNATIVE |

FIRST ITEM is a pointer to the first item of that alternative (see Figure 3), ACTION ROUTINE (A) equals the unsigned integer n specified in Figure 2, production (8) as the semantic routine to be invoked (zero if none is desired), and NEXT ALTERNATIVE points to the next ALTERNATIVE HEADER (again being zero if it is the last alternative).

The first ALTERNATIVE HEADER of a phrase class definition is referenced by a pointer in the corresponding entry in the PHRASE CLASS NAME TABLE. Each such entry has the following format:

PHRASE CLASS ENTRY:  | NAME | DEFINITION POINTER |

NAME contains the phrase class name as specified in IBNF, and DEFINITION POINTER provides the link to the first ALTERNATIVE HEADER of the definition.

It is pertinent to mention at this time the action taken by BUILD when parenthesized definitions (Figure 2, production (7)) are encountered in IBNF, as the direct mapping from IBNF to SYNXTAB is altered slightly. As specified in the IBNF syntax, a parenthesis pair syntactically encloses a full phrase class definition:

'(' <definition> ')' <repetition character>?
Therefore, upon recognition of a left parenthesis in IBNF, BUILD assigns a "dummy" phrase class name to reference the enclosed definition and performs two actions:

1) It creates a normal phrase class entry (TYPE CODE 0-3, as determined by the modifier) in the LINKED-POINTER ITEM TABLE, and
2) It enters the "dummy" name in the PHRASE CLASS NAME TABLE with appropriate linkages to its "definition" enclosed in parentheses.

In this manner, SYNXTAB appears exactly as it would had all phrase classes been explicitly named.

4. THE SCANNER/PARSER

4.1 Introduction

General compilation schemes employ two mechanisms in their processing of the input source language. The first is a scanner, which breaks up the input character stream into "atoms" (terminal symbols) and whose design is dependent upon the language. These atoms are then analyzed (predictively or non-predictively) by the second mechanism, the parser, according to the syntax of the language.

A compiler generated by TACOS utilizes an altered version of this scheme in that no scanner as defined above exists. Instead, an interface between the input character stream and the parser is established through a set of four recognition procedures which are invoked when specified in the IBNF syntax. They attempt recognition of

1) identifiers
2) unsigned integers
3) literal strings
4) specified terminal symbols,

and return an appropriate "success" or "failure" indication to the parser.

In terms of TACOS compiler operation, the parsing algorithm (called PARSE) is invoked by a compiler "driver" which is also responsible for the input of SYNXTAB and the source language. The combination of the driver, PARSE, and the four interface routines (collectively called SCAN) constitutes PARSCAN, the fixed interpretive parsing and scanning module of TACOS. In conjunction with the semantic action procedure (ACT), PARSCAN performs the compilation of the source language into object code under the direction of SYNXTAB, as illustrated in Figure 4.

[Figure 4. Generated compiler operation. The language source code and the syntax table (SYNXTAB) are read by the compiler module, which consists of PARSCAN (the driver, the scanner SCAN, and the parser PARSE) programmatically linked through an external linkage header to the action routines (ACT). The compiler module generates the language object code.]

4.2 TACOS Parsing Algorithm

As previously discussed, the TACOS parsing algorithm (PARSE) is an interpretive procedure whose actions are governed directly by SYNXTAB. Given a phrase class definition via an IBNF production, it attempts recognition of that phrase class in the input symbol stream and executes the semantic routines if the parse is successful. It is, therefore, evident that compilation is initiated by the PARSCAN driver when it calls PARSE to recognize the source language given the "root" phrase class of its syntax.

Since other phrase classes are allowable in a phrase class definition, PARSE employs recursive descent in their recognition; i.e., it calls itself to recognize other non-terminals. The other item types introduced in 3.2 also control the parser accordingly. Specification of <*I>, <*N>, <*S>, or a terminal symbol causes the interface routines to be called, whereas a conditional semantic routine <#n> initiates a semantic call.

It is important to note that the design of the TACOS parsing algorithm is such that all blank characters in the source language, with the exception of those within literal strings, are ignored. The parser maintains a pointer (INP) to the input character stream location to be recognized next. Each interface routine, when invoked, increments INP to the first non-blank character before scanning, as does the parsing algorithm at each non-terminal call.

A flowchart for PARSE, the TACOS parsing algorithm, is given in Figure 5. The variables referenced are those given in Table 1 and Figure 3 as defined for the table SYNXTAB. The identifier OK indicated in the flowchart is a boolean variable taking on the value "1" for an item "recognized" and "0" for one "not recognized." OK is set to 1 before parsing begins.

[Figure 5. TACOS parsing algorithm (flowchart, parts a and b).]

In general, each call on PARSE corresponds to an attempt to recognize one phrase class according to its definition in the IBNF syntax. Each alternative is taken in turn, from left to right, in the production. Likewise, the type of each item (T) within the alternative controls the actions of the recognizer. Should another phrase class be specified as an item (0 ≤ T ≤ 3), the recognizer recurses to attempt another parse according to its definition. However, if a terminal symbol (4 ≤ T ≤ 7) is indicated as an item, the parser executes the appropriate interface routine. Similarly, if a conditional semantic routine is encountered (T = 8), it is executed and the result noted.

If the recognition of any item fails, the recognizer discontinues the parse within that alternative, resets the input pointer to its location at the beginning of the call, and tries the next alternative. If all alternatives fail, the phrase class itself is considered "not found" and the parse returns with this indication (OK = 0) to the point at which the phrase class was called.

If all the items within an alternative have been recognized, the (optional) action routine is executed (with call ACT(A)) and the phrase class is considered recognized. Note that no additional alternatives are attempted, thus necessitating the previously discussed IBNF restriction that the alternatives giving the longest parse be placed first.
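The control flow described in 4.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PL/I implementation of Appendix A: the grammar layout (a dictionary of ordered alternatives), the class and method names, the regular-expression scanners, and the 0-based input pointer are all hypothetical stand-ins for SYNXTAB, PARSE, SCAN, and INP.

```python
import re


class Parser:
    """Hypothetical table-driven recognizer in the spirit of PARSE."""

    def __init__(self, grammar, text):
        self.grammar = grammar   # phrase class -> ordered list of (items, action)
        self.text = text         # stands in for the CHAR buffer
        self.inp = 0             # stands in for INP (0-based here)
        self.farthest = 0        # deepest position reached, like FARTHEST
        self.last = ""           # most recent identifier or integer scanned
        self.out = []            # results left behind by action routines

    def scan(self, pattern):
        """Interface routine: delete blanks, then try to match at INP."""
        while self.inp < len(self.text) and self.text[self.inp] == " ":
            self.inp += 1
        m = re.compile(pattern).match(self.text, self.inp)
        if m is None:
            return False
        self.last = m.group(0)
        self.inp = m.end()
        self.farthest = max(self.farthest, self.inp)
        return True

    def item(self, it):
        """Dispatch on the item type, as the parser does on T."""
        kind = it[0]
        if kind == "term":                 # specified terminal symbol
            return self.scan(re.escape(it[1]))
        if kind == "ident":                # <*I>
            return self.scan(r"[A-Z][A-Z0-9_]*")
        if kind == "int":                  # <*N>
            return self.scan(r"[0-9]+")
        if kind == "test":                 # conditional semantic routine <#n>
            return bool(it[1](self))
        _, name, mod = it                  # ("nt", phrase class, modifier)
        if mod == "":
            return self.parse(name)
        if mod == "?":                     # optional: never fails
            self.parse(name)
            return True
        if mod == "+" and not self.parse(name):
            return False
        while True:                        # further repetitions for '+' and '*'
            before = self.inp
            if not self.parse(name) or self.inp == before:
                return True

    def parse(self, name):
        """Try each alternative in order; back up INP when one fails."""
        start = self.inp
        for items, action in self.grammar[name]:
            if all(self.item(it) for it in items):
                if action is not None:
                    action(self)           # unconditional action routine
                return True                # no further alternatives are tried
            self.inp = start
        return False


GRAMMAR = {
    "PROG": [([("nt", "STMT", "+")], None)],
    "STMT": [
        ([("term", "GOTO"), ("int",)], lambda p: p.out.append(("goto", int(p.last)))),
        ([("term", "CALL"), ("ident",)], lambda p: p.out.append(("call", p.last))),
    ],
}

p = Parser(GRAMMAR, "GOTO 10 CALL SUB1 GOTO 20")
print(p.parse("PROG"), p.out)
```

As in the algorithm described above, the first alternative that succeeds wins, so alternatives giving the longest parse must come first. Note also that in this sketch the side effects of action routines are not undone when the parser later backs up.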
4.3 The Scanner

The actual scanning of the source language character stream is performed by the four interface routines (SCAN) provided within PARSCAN. These are given in Table 2. Each sets the success parameter, OK, for control of the parser; in addition, the first three return the character strings recognized as being identifiers, integers, or literals for use in the semantic routines. Therein, the identifiers may be cross-referenced in a name table, the integers converted to their binary values, or the strings processed in whatever way the compiler-designer wishes.

It should be noted that additional scanning routines may easily be written by the compiler-designer in the form of conditional semantic routines. Since the input pointer, INP, can be manipulated in the semantics, special scanning procedures such as those for comments or control cards may easily be included in the system.

Figure 6 shows the flowcharts for the four routines given in Table 2. CHAR is an identifier denoting the source input buffer, and INP is the pointer previously discussed giving the location of the character being scanned within CHAR. Note that all four routines delete blanks before scanning commences. The array TCHR referenced in STRING contains the terminal characters of the language as defined in its IBNF specification.

[Figure 6. Flowcharts of the TACOS scanning routines: (a) identifier scanner, (b) integer scanner, (c) literal scanner, (d) terminal symbol scanner.]

[Table 2 and the opening of Section 5 are illegible in the source scan. The text resumes within the list of parser linkages in Section 5.2.]

3. ... (TEMPIDENT)
4. the character string returned by <*N> (TEMPCONST)
5. the literal defined in <*S> (TEMPSTRING)
6. the "success" or "failure" parsing parameter (OK).

It will be seen that control over the input buffer and pointer is vital for source language error recovery as well as for any special scanning. Similarly, the character strings returned by the interface (SCAN) routines are needed for object code generation in the semantic routines.

The ability of the semantic routines to alter the success parameter has already been discussed in terms of controlling the parser via conditional routines. However, there is nothing to prevent an unconditional semantic routine from doing likewise. The effect on the parser is different, though, from that of a conditional routine. Since the unconditional routine is invoked after all items of an alternative have indicated "success," setting the flag to "failure" (OK ← 0) at this time will cause the parser to terminate the parse at the next higher level of recursion.

Example:

(1) <...> ::= <...> <...> | <...> ;
(2) <...> ::= 'GOTO' <*N> #21 | 'CALL' <*I> ;

If action routine #21 in production (2) above had set OK ← 0, the parser would consider the phrase class defined in production (2) not recognized even though 'GOTO' and <*N> had been parsed. Therefore, recognition of the item that follows it in production (1) would not be attempted, and the parser would go on to try the next alternative of production (1). Examination of the flowchart for the parsing algorithm in Figure 5 will clarify this. If #21 had been written as <#21> (i.e., as a conditional routine), the parser would have attempted the next alternative in that production: 'CALL' <*I>. Although the utility of this "trick" may seem obscure, it will be shown to be useful in source language error recovery.

5.3 The Semantic Procedure

All semantic routines are collectively incorporated into one semantic procedure (ACT) in TACOS. Since it is compiled separately from the parser (PARSCAN), the necessary external linkages to it are established according to the machine implementation.
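A single action procedure of this kind can be pictured as one entry point that branches on the routine number. The following Python sketch is a hypothetical illustration only: the `Bridge` class, the `action` decorator, and the two sample routines are invented here to mirror the shared values (OK, INP, TEMPIDENT, TEMPCONST, TEMPSTRING) described in 5.2, and do not reproduce the PL/I procedure of Appendix C.

```python
from dataclasses import dataclass, field


@dataclass
class Bridge:
    """Hypothetical stand-in for the values shared by the parser and ACT."""
    ok: int = 1
    inp: int = 0
    tempident: str = ""
    tempconst: str = ""
    tempstring: str = ""
    log: list = field(default_factory=list)


ACTIONS = {}   # routine number -> function, playing the role of the label array


def action(number):
    """Register a function as semantic routine `number`."""
    def register(fn):
        ACTIONS[number] = fn
        return fn
    return register


@action(2)
def note_identifier(b):
    # Save the identifier before a later scan overwrites it.
    b.log.append(b.tempident)


@action(21)
def reject(b):
    # A routine may clear the success flag to steer the parser.
    b.ok = 0


def act(number, bridge):
    """Single entry point: branch to routine `number`, then return to the caller."""
    ACTIONS[number](bridge)


b = Bridge(tempident="ALPHA")
act(2, b)
act(21, b)
print(b.log, b.ok)
```

Keeping every routine behind one entry point means the parser needs to know only a routine number, which is all that the syntax table records for an action.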
The general form of ACT is shown in Figure 7. Each routine within ACT is addressable by its number and, therefore, requires a means of receiving control when the semantic procedure is called. Likewise, each routine relinquishes control back to PARSCAN by way of a procedure exit (i.e., return).

[Figure 7. General form of the action procedure ACT: procedure linkages and declarations, followed by a branch to routine N and the action routines themselves.]

In the current implementation, the procedure heading and external linkages to the parser are already provided. The compiler designer needs only to write each action routine called for in the IBNF syntax. Appendix C shows such a semantic procedure (ACT), which is incorporated into the BUILD compiler generated by TACOS. Note that this implementation allows the semantic procedure to be recursive so that one action routine can call another.

6. DISCUSSION

6.1 Implementation

Thus far, it has been the intention of the author to present TACOS as a system independent of its implementation. This is valid primarily when considering the structure of the IBNF specification, its tabular form as SYNXTAB, and the parsing algorithm. It is evident, however, that the author has taken advantage of the modularity allowable under the IBM OS/360 in the overall design of TACOS. The ability to easily link separately compiled procedures (namely PARSCAN and ACT) dictated the structure of the generated compiler. Likewise, the "job step" approach provided the incentive for the different compiler-building phases of TACOS as shown in Figure 1.

Appendix A shows the PL/I implementation of the PARSCAN module. One will notice that, in the current version of PARSCAN, the entire source character string is input into the buffer CHAR before parsing is initiated. A maximum of 32,767 characters is imposed by PL/I on the length of the array CHAR, representing a limit of 409 source language cards when all 80 columns are to be scanned.
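The card limit follows from integer division of the buffer length by the number of columns scanned per card. A small check, written here in Python with a hypothetical helper name, also shows the limit for a scan range of columns 1 through 72, which the text later gives as the default:

```python
BUFFER_CHARS = 32767   # maximum length of CHAR stated in the text


def max_cards(first_col, last_col, buffer_chars=BUFFER_CHARS):
    """Whole cards that fit in the buffer when columns first_col..last_col are scanned."""
    width = last_col - first_col + 1
    return buffer_chars // width


print(max_cards(1, 80))
print(max_cards(1, 72))
```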
Therefore, the present system is not well suited, buffer-wise, for compilation of extensive source code. Neither is it well suited time-wise. Current estimates for the BUILD module, the only operational TACOS-generated compiler now in use, indicate compilation rates of 300-350 lines (card images) per minute, prohibitively slow for over 400 cards.

One restriction in the present system should be well noted. It is assumed by the driver in PARSCAN that the root phrase class of the language is located in the first entry of the PHRASE CLASS NAME TABLE. This requires that the first production in the IBNF specification be that root phrase class, which, fortunately, is consistent with the "usual" structure of BNF-like grammars.

Several features have been added to the current system which enhance the operation of the generated compiler. Two are effected with IBNF control cards placed before the syntax itself (see Appendix D):

1) $CMARG=(n,m) specifies that the compiler is to scan columns n through m of the source code (default is 1 through 72), and
2) $CTITLE= literal string causes the title denoted in the literal string to be printed at the head of the compiler's output listing.

In addition, PARSCAN now prints the elapsed compilation time, which can be particularly useful in designing more efficient syntax and semantic specifications.

As a further aid in printing error messages, two function routines were incorporated into PARSCAN to compute the card number (CARDNUM) and card column (CARDCOL) according to the location of the input pointer INP. These values are made available to the semantic routines for indication of source code syntax errors.

Two areas have been more or less glossed over in earlier discussion because of their dependence on implementation: the procedure (BUILD) which transforms IBNF to SYNXTAB and the form of the routines which constitute the semantic procedure ACT.
Appendices B, C, and D provide clarification in those two areas as well as an instructive example of a compiler (i.e., BUILD) generated by TACOS. Appendix B gives the IBNF specification of itself as input to an earlier version of the BUILD module, the object being to generate another version of BUILD. Appendix C shows the action procedure (ACT) with the required semantic routines as specified in the Appendix B IBNF. They build the syntax table SYNXTAB and perform the error checking on the input syntax. The operation of the generated compiler is given in Appendix D, with the new BUILD module generating a syntax table, again, of IBNF.

The necessary Job Control Language (JCL) for executing the different phases of TACOS is given in Appendix E. Incorporated into this is the "syntax" of the action routines in ACT, which naturally applies only to the current PL/I implementation of TACOS.

6.2 On Object Code Generation

The compiler-designer is responsible in the action procedure ACT for the generation of the "one-pass" object code. In practice, a first pass of the compiler usually generates an intermediate code such as Polish postfix or tree structures for later input to another compiler pass. In the case of the BUILD compiler module, the tree structure SYNXTAB was generated. See Appendix C for the corresponding semantics.

The necessary linkages to the parsing algorithm and symbols scanned in the source code have been discussed in 5.2. A number of other linkages have been provided in the current implementation to help not so much in object code generation as in error location:

1) CARDNUM returns the (fixed binary) card number being scanned,
2) CARDCOL gives the card column according to the current location of INP, and
3) GOCONDITION is an error indication flag.

The setting of GOCONDITION=0 in the semantics is nothing more than a signal to PARSCAN that a source code error has been detected.
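The two location functions amount to simple arithmetic on INP and the scan margins. The sketch below transcribes the formulas that appear in the CARDNUM and CARDCOL routines of Appendix A into Python; the function and parameter names are lowercase stand-ins chosen here, and INP is taken to be 1-based as in the listing.

```python
def cardnum(inp, lmarg=1, rmarg=72):
    """Card number for pointer `inp`, following X = INP/(RMARG-LMARG+1) + 1."""
    return inp // (rmarg - lmarg + 1) + 1


def cardcol(inp, lmarg=1, rmarg=72):
    """Card column for pointer `inp`, following X = MOD(INP, Y) + (LMARG-1)."""
    width = rmarg - lmarg + 1
    return inp % width + (lmarg - 1)


# Position 100 of the buffer with the default margins of 1 through 72.
print(cardnum(100), cardcol(100))
```

In this transcription, a pointer that is an exact multiple of the scan width is reported as column LMARG-1 of the following card, so a message built from these values is best read as an approximate location.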
After parsing terminates, PARSCAN prints an error message with an indication of how far into the source code parsing has been successful. As is typical of recursive-descent parsing, it is very difficult to know exactly where source code errors have occurred, since the parsing algorithm automatically keeps "backing up" the input pointer to check other alternatives. The indication by PARSCAN of the maximum parse into the code is, therefore, very useful in detecting both source code and syntax specification errors.

It should be remembered that, as shown in the flowcharts for the scanning routines (Figure 6), TEMPIDENT, TEMPCONST, and TEMPSTRING are nulled before their respective routines attempt any recognition. It is necessary, therefore, that the semantics-writer save those values returned (if needed) before another <*I>, <*N>, or <*S> scan is attempted.

One will notice in Appendix C that the declaration of the compiler variables includes only (PL/I) STATIC and CONTROLLED identifiers. This is necessary in the current implementation so that they do not get re-allocated each time the semantic procedure ACT is invoked. The CONTROLLED variables are particularly useful for semantics written in PL/I for the implementation of stack operations via ALLOCATE and FREE statements.

6.3 On Error Recovery

Perhaps the weakest point in any top-down compilation scheme is the provision for recovery from source code errors. If no error detection is built into the syntax specification or semantics, the best the compiler can do is to indicate "yes" or "no" as to whether or not the source code was successfully recognized. It is, therefore, left entirely up to the compiler-designer to create his own error detection and recovery in the compiler specifications. The most the author can do in relation to TACOS is to provide some instructive examples of error detection in IBNF source code as implemented in the BUILD compiler (Appendices B and C).
An obvious approach in error detection is to syntactically specify what an error looks like. Witness lines 8 and 9 of Appendix B (the IBNF of itself):

(8) <LHS> ::= ( '<' <*I> '>' | <*I> '>' #24 | '<' <*I> #24 ) #2 ;
(9) <EQU> ::= '::=' | ( '::' | ':=' | '=' ) #24 ;

It was judged through experience that a common error in an IBNF specification was the omission of one of the angle-brackets in the left-hand side of a production. Likewise, the '::=' was often mispunched. The above syntax provides for recognition of these errors and executes action routine #24 to print an appropriate error message (see Appendix C).

An alternate and generally more difficult approach is to "trap" the parser upon detection of illegal source code through clever design of the syntax and semantics. From the same syntax specification in Appendix B, lines 6 and 7, we have

(6) <PRODUCTION> ::= <LHS> <EQU> <DEF> ( ';' | <#17> ) #21 |
(7) ( 'END' | 'REG' ) #25 | <#17> ;

At this point in the parse, an IBNF production, the primary component of the syntax specification, is being scanned. Should an error cause termination of this parse, it is naturally desirable to flag the illegal construct and continue with the next production.

First suppose the error had occurred in the parsing of <DEF> (where errors usually do occur) and that at least one <ITEM> has been recognized in the phrase class definition. Since an <ITEM> has been recognized, the syntactic definition of <DEF> has been satisfied, and the error causes the parser to return to <PRODUCTION> with OK=1 but with INP pointing to the place in the production at which it "got lost." Therefore, the construct

( ';' | <#17> )

performs the error recovery as follows. As INP does not point to the end of the production, the ';' alternative fails, and error recovery procedures as specified in routine #17 are initiated to print an error message and to move INP to the end of the production. <PRODUCTION> is now considered recognized, routine #21 is executed, and additional <PRODUCTION>'s are attempted as desired.
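The behaviour of the construct ( ';' | <#17> ) can be sketched as a function that either accepts the terminating semicolon or reports an error and skips ahead. The Python below is a hypothetical illustration: the function name and the recovery rule (resume after the next ';') are assumptions made for the example, whereas routine #17 in Appendix C locates the end of the production by its own means.

```python
def end_of_production(text, inp, errors):
    """Accept ';' at `inp`, or recover by reporting and skipping ahead.

    Returns the new input position. Like the construct it imitates, it always
    "succeeds", so the enclosing production is still considered recognized.
    """
    while inp < len(text) and text[inp] == " ":
        inp += 1                               # delete blanks first
    if inp < len(text) and text[inp] == ";":
        return inp + 1                         # normal case: ';' recognized
    errors.append(f"syntax error near position {inp}")
    nxt = text.find(";", inp)                  # assumed recovery rule
    return len(text) if nxt < 0 else nxt + 1


errors = []
source = "?? junk ; <B> ::= 'X' ;"
pos = end_of_production(source, 0, errors)
print(pos, errors)
```

A rule this simple would be fooled by a ';' inside a quoted literal, which is one reason the text stresses that recovery requires close knowledge of the region of the syntax in which it is performed.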
However, suppose the unrecognized construct had occurred earlier in the production so that the first alternative (in line 6) had failed. The next alternative (in line 7),

( 'END' | 'REG' ) #25

first determines whether or not the end of the IBNF production set has been reached. If so, routine #25 backs up INP to reference the beginning of 'END' or 'REG' and sets OK=0. In an earlier discussion in 5.2, it was noted that this terminates the parsing at the next higher level of recursion. Therefore, by setting OK=0 in #25, the parsing of productions is discontinued and the 'END' or 'REGEN' is scanned next.

If the second alternative (line 7) had failed, <#17> would perform the desired error recovery procedures as before.

It is obvious that incorporation of suitable error recovery requires an intimate knowledge of the parsing algorithm as well as of the "region" of the syntax at which it is being performed. In TACOS, judicious control of the OK parameter can alleviate the problem somewhat, although no effective error-recovery parsing procedures are offered in the present system.

6.4 Extensions

During the implementation of TACOS, the author had occasion to consider several possible extensions to the system that would not only increase its efficiency, but also make the task of compiler design somewhat easier. These ideas are presented here with the implication that the author would like to incorporate them into the system at some future date.

In the present IBNF specification, the compiler-writer is saved a certain amount of syntactic specification through the use of the special scanner symbols <*I>, <*N>, and <*S>. An additional construct, that of the arithmetic expression, is also common to most programming languages. Therefore, the inclusion in IBNF of an additional scanning symbol to represent an arithmetic expression, say <*A>, would perhaps be justified if the semantic linkages could easily be specified.
Syntactically,

<*A> = <...> ;
<...> ::= <...> ( ( '+' | '-' ) <...> )* ;
<...> ::= <...> ( ( '*' | '/' ) <...> )* ;
<...> ::= <...> ( '**' <...> )* ;
<...> ::= <...> | '(' <...> ')' | ( '+' | '-' ) <...> ;

where the definition of <...> would be specified separately. In the implementation of <*A>, perhaps it would be possible to include a bottom-up parser via an operator-precedence matrix for its inherent advantage in speed and source code error detection.

As an additional IBNF specification, it could be useful to allow the syntax-designer to specify the definition of a "blank" to the parser and scanner. For instance, as PL/I assumes a comment of the form

'/*' "any sequence of characters but */" '*/'

to be syntactically equivalent to a blank character, the corresponding IBNF for PL/I would be very impractical to specify. As deletion of blanks is a built-in scanning function in PARSCAN, it does not seem unreasonable to incorporate such "comment" deletion within it and to allow its definition in IBNF. For example, written as a $ control card, it could be

$BLANK= ' ' | '/*' ... '*/'

for the PL/I syntax.

One of the most promising extensions of the system is the incorporation of a separate module that will translate the syntax table SYNXTAB directly into a hard-coded (assembly language) compiler. This would provide an ideal option for that phase of compiler design where the syntax specification is nearly "frozen." The increase in parsing speed in changing to the hard-coded system would well justify this intermediate step, especially since SYNXTAB would have already been created for prior interpretive use.

Certain problems seem to be emerging, though, in implementing the assembly language parser and scanner. The first is the problem of providing efficient "recursiveness" to the machine-coded procedures corresponding to each phrase class. The other is the virtual nightmare of trying to link an assembly language "PARSCAN" to a PL/I semantic routine.
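The layered shape of the arithmetic-expression grammar proposed for <*A> (sums of products of powers of primaries) can be exercised with a short recognizer that emits Polish postfix. The sketch below is written in Python under clearly stated assumptions: the level names `expr`, `term`, `factor`, and `primary`, the `NEG` marker for unary minus, and the tokenizer are all hypothetical, and the technique is plain recursive descent rather than the operator-precedence matrix the text suggests for an eventual implementation.

```python
import re

TOKEN = re.compile(r"\s*(\*\*|[-+*/()]|[A-Z][A-Z0-9_]*|[0-9]+)")
OPERAND = re.compile(r"[A-Z][A-Z0-9_]*|[0-9]+")


def tokenize(src):
    """Split the source into operators, parentheses, identifiers, and integers."""
    src, pos, toks = src.rstrip(), 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"bad character at position {pos}")
        toks.append(m.group(1))
        pos = m.end()
    return toks


def to_postfix(src):
    """Recognize an arithmetic expression and return it in postfix order."""
    toks, out = tokenize(src), []
    i = 0

    def peek():
        return toks[i] if i < len(toks) else None

    def take():
        nonlocal i
        i += 1
        return toks[i - 1]

    def binary(operand, ops):
        # One grammar level: operand (op operand)*
        operand()
        while peek() in ops:
            op = take()
            operand()
            out.append(op)

    def expr():
        binary(term, ("+", "-"))

    def term():
        binary(factor, ("*", "/"))

    def factor():
        binary(primary, ("**",))

    def primary():
        t = peek()
        if t == "(":
            take()
            expr()
            if peek() != ")":
                raise SyntaxError("missing ')'")
            take()
        elif t in ("+", "-"):
            take()
            primary()
            if t == "-":
                out.append("NEG")
        elif t is not None and OPERAND.fullmatch(t):
            out.append(take())
        else:
            raise SyntaxError(f"unexpected {t!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return out


print(to_postfix("A + B * C"))
print(to_postfix("-(A-1)/X**2"))
```

Emitting postfix is only one possible semantic linkage; the open question raised in the text is how such linkages could be specified conveniently once the expression syntax is built into the scanner.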
To date, TACOS has been utilized in a number of departmental projects, mostly involving experimental semantics for special language constructs. Of particular note is the project involving the design of a "mathematical language" mentioned in 2.1 and its implementation as a superset of PL/I. TACOS is being used to translate this language into corresponding PL/I "object" code. Moreover, it is expected that incorporation of the previously discussed ideas into later versions of TACOS will significantly extend its utility as a general-purpose compiler-generation system.

APPENDIX A

PL/I IMPLEMENTATION OF 'PARSCAN' MODULE

PARSCAN: /* TACOS PARSER/SCANNER MODULE */
  PROCEDURE OPTIONS(MAIN);
  DCL CONTROL CHAR(4),
      INTAB INPUT STREAM FILE;
  OPEN FILE(INTAB);
  GET FILE(INTAB) LIST(N1,N2,N3,CONTROL);
  /*                                */
  /* THIS IS THE DRIVER PROCEDURE   */
  /*                                */
VARBLK: BEGIN;
  DCL CHAR(32767) CHAR(1) EXTERNAL CONTROLLED,
      ACT EXTERNAL ENTRY,
      1 TABLES(N1),
        2 NAME CHARACTER(32),
        2 LIST FIXED BINARY,
      1 LISTS(N2),
        2 (LEFT,CENTER,RIGHT) FIXED BINARY,
      INP FIXED BINARY EXTERNAL,
      1 BRIDGE EXTERNAL,
        2 GOCONDITION FIXED BINARY INITIAL(1),
        2 TEMPCONST VARYING CHARACTER(16),
        2 TEMPIDENT VARYING CHARACTER(32),
        2 TEMPSTRING VARYING CHARACTER(100),
      CTITLE CHAR(70) INIT((70)' '),
      FARTHEST FIXED BINARY INITIAL(1),
      CHR(N3) CHARACTER(1),
      DELIMITER CHAR(1) INITIAL(' '),
      (TIMENEW,TIMEOLD) FIXED(31,0) BINARY,
      LMARG FIXED BINARY INIT(1) EXTERNAL,
      RMARG FIXED BINARY INIT(72) EXTERNAL,
      POST ENTRY RETURNS(FIXED(31,0) BINARY),
      (CARDNUM,CARDCOL) ENTRY RETURNS(FIXED BINARY) EXTERNAL,
      CARDIM CHAR(80),
      FIRST FIXED BINARY,
      OK FIXED BINARY EXTERNAL;
  IF CONTROL='NOGO' THEN DO;
    PUT SKIP LIST('SYNTAX TABLE IN ERROR.
 NO COMPILATION POSSIBLE.');
    RETURN;
  END;
  ALLOCATE CHAR;
GET_TABLES: DO I=1 TO N1;
    GET FILE(INTAB) LIST(TABLES(I));
  END GET_TABLES;
GET_LISTS: DO I=1 TO N2;
    GET FILE(INTAB) LIST(LISTS(I));
  END GET_LISTS;
  GET FILE(INTAB) LIST((CHR(I) DO I=1 TO N3));
  ON ENDFILE(INTAB) GO TO GET_INPUT;
  GET FILE(INTAB) LIST(LMARG,RMARG);
  GET FILE(INTAB) LIST(CTITLE);
GET_INPUT: ON ENDFILE(SYSIN) GO TO PARSE;
  KK=1; K=1;
  NCOLS1=RMARG-LMARG; NCOLS=NCOLS1+1; NBBLANK=LMARG-1;
  PUT SKIP EDIT(CTITLE,'SEQUENCE NUMBER') (A,X(20),A);
  PUT SKIP(2);
MORE: GET EDIT(CARDIM) (A(80));
  PUT SKIP EDIT(CARDIM) (A(80));
  PUT EDIT(KK) (X(16),F(3));
  KK=KK+1;
  GET STRING(CARDIM) EDIT((CHAR(I) DO I=K TO K+NCOLS1)) (X(NBBLANK),(NCOLS)A(1));
  K=K+NCOLS;
  GO TO MORE;
PARSE: GOCONDITION=1; INP=1;
  FIRST=1; /* RECOGNIZE FIRST TABLE ENTRY */
  TIMEOLD=POST;
  CALL RECOG(FIRST);
  TIMENEW=POST;
  IF TIMENEW<
[...]
  >FARTHEST THEN FARTHEST=INP;
  IF H=0 THEN RETURN;
  ELSE DO; HEAD=LISTS(H); GO TO NEXT1; END;
[...]
TYPE(1): CALL RECOG(STACK); OK=1; GO TO NEXT2;
TYPE(2): CALL RECOG(STACK); IF OK=0 THEN GO TO NEXT3;
  ELSE DO;
TYPE(3): REPEAT: CALL RECOG(STACK);
    IF OK=1 THEN GO TO REPEAT;
    OK=1; GO TO NEXT2;
  END;
TYPE(4): CALL IDENT; GO TO TEST_IT;
TYPE(5): CALL CONST; GO TO TEST_IT;
TYPE(6): CALL STRING(STACK); GO TO TEST_IT;
TYPE(7): CALL NEWSTR; GO TO TEST_IT;
TYPE(8): OK=1; CALL ACT(STACK); GO TO TEST_IT;
END RECOG;
  /*                                          */
  /* THESE ARE THE SCANNING ROUTINES (SCAN)   */
  /*                                          */
STRING: PROCEDURE(I); /* THIS ROUTINE SCANS TERMINAL SYMBOLS */
  DCL I FIXED BINARY;
DELETE: IF CHAR(INP)=' ' THEN DO; INP=INP+1; GO TO DELETE; END;
AGAIN: IF CHAR(INP)=CHR(I) THEN DO;
    INP=INP+1; I=I+1; GO TO AGAIN;
  END;
  IF CHR(I)=DELIMITER THEN DO; OK=1; RETURN; END;
  IF CHAR(INP)=DELIMITER THEN GO TO EMERGENCY_END;
  OK=0;
END STRING;
IDENT: PROCEDURE; /* THIS ROUTINE SCANS IDENTIFIERS */
  TEMPIDENT='';
RIDBLANKS: IF CHAR(INP)=' ' THEN DO; INP=INP+1; GO TO RIDBLANKS; END;
  IF CHAR(INP)>='A' & CHAR(INP)<='Z' THEN GO TO FIRST_FOUND;
  ELSE DO; OK=0; RETURN; END;
FIRST_FOUND: TEMPIDENT=TEMPIDENT||CHAR(INP);
  INP=INP+1;
  IF CHAR(INP)>='A' & CHAR(INP)<='9' THEN GO TO FIRST_FOUND;
  IF CHAR(INP)='_' THEN GO TO FIRST_FOUND;
  ELSE DO; OK=1; RETURN;
END IDENT;
CONST: PROCEDURE; /* THIS ROUTINE SCANS UNSIGNED INTEGERS */
  TEMPCONST='';
RIDMORE: IF CHAR(INP)=' ' THEN DO; INP=INP+1; GO TO RIDMORE; END;
  OK=0;
DIGITFOUND: IF CHAR(INP)>='0' & CHAR(INP)<='9' THEN DO;
    TEMPCONST=TEMPCONST||CHAR(INP);
    INP=INP+1; OK=1; GO TO DIGITFOUND;
END CONST;
NEWSTR: PROCEDURE; /* THIS ROUTINE SCANS LITERAL STRINGS */
  TEMPSTRING='';
RD_BLK: IF CHAR(INP)=' ' THEN DO; INP=INP+1; GO TO RD_BLK; END;
  IF CHAR(INP)¬='''' THEN DO; OK=0; RETURN; END;
  ELSE INP=INP+1;
TEST_APOST: IF CHAR(INP)='''' THEN DO;
    IF CHAR(INP+1)='''' THEN DO;
      TEMPSTRING=TEMPSTRING||CHAR(INP);
      INP=INP+2; GO TO TEST_APOST;
    END;
    ELSE DO; INP=INP+1; OK=1; RETURN; END;
  END;
  ELSE DO;
    IF CHAR(INP)=DELIMITER THEN GO TO EMERGENCY_END;
    TEMPSTRING=TEMPSTRING||CHAR(INP);
    INP=INP+1; GO TO TEST_APOST;
END NEWSTR;
  /*                               */
  /* MISC. ROUTINES FOR COMPILER   */
  /*                               */
POST: PROC FIXED(31,0) BINARY; /* COMPUTES CURRENT TIME IN MS. */
  DCL X FIXED(31,0) BINARY,
      DUMMY CHAR(9),
      1 TIMEX,
        2 (H,M,S) CHAR(2),
        2 T CHAR(3);
  DUMMY=TIME;
  GET STRING(DUMMY) EDIT(TIMEX) (3 A(2),A(3));
  X=((H*60+M)*60+S)*1000+T;
  RETURN(X);
END POST;
EMERGENCY_END: PUT PAGE LIST('END OF INPUT STRING REACHED');
END PARSCAN;
CARDCOL: PROC FIXED BINARY; /* COMPUTES CARD COL ACCDG.
                               TO INP */
  DCL (X,Y) FIXED BINARY,
      (RMARG,LMARG,INP) FIXED BINARY EXTERNAL;
  Y=RMARG-LMARG+1;
  X=MOD(INP,Y)+(LMARG-1);
  RETURN(X);
END CARDCOL;
CARDNUM: PROC FIXED BINARY; /* COMPUTES CARD NUMBER ACCDG TO INP */
  DCL X FIXED BINARY,
      (RMARG,LMARG,INP) FIXED BINARY EXTERNAL;
  X=INP/(RMARG-LMARG+1)+1;
  RETURN(X);
END CARDNUM;

APPENDIX B

TACOS IBNF SYNTAX SPECIFICATIONS TO GENERATE 'BUILD' MODULE

[The IBNF listing of this appendix is illegible in the source scan.]

APPENDIX C

SEMANTIC ACTION PROCEDURE FOR 'BUILD' MODULE

ACT: /* TACOS SEMANTIC ACTION ROUTINE PROCEDURE */
  PROCEDURE(WHICH_ACTION_NUM) RECURSIVE;
  DCL WHICH_ACTION_NUM FIXED BINARY,
      1 BRIDGE EXTERNAL,
        2 GOCONDITION FIXED BINARY,
        2 TEMPCONST VARYING CHARACTER(16),     /* =<*N> */
        2 TEMPIDENT VARYING CHARACTER(32),     /* =<*I> */
        2 TEMPSTRING VARYING CHARACTER(100),   /* =<*S> */
      CHAR(32767) CHAR(1) EXTERNAL CONTROLLED, /* =SOURCE PROG BUFFER, INDEXED BY INP */
      INP FIXED BINARY EXTERNAL,               /* =SCANNER POINTER */
      OK FIXED BINARY EXTERNAL,                /* CAN BE SET=0 IN <#XX> */
      CARDNUM EXTERNAL ENTRY
        RETURNS(FIXED BINARY),                 /* =CARD NUMBER OF INP */
      CARDCOL EXTERNAL ENTRY
        RETURNS(FIXED BINARY),                 /* =CARD COL OF INP */
      ACTION(0:200) LABEL;
  GO TO ACTION(WHICH_ACTION_NUM);
  /* USER-WRITTEN DCLNS AND PROCS FOLLOW HERE */
[...]
  PUT SKIP(1);
  PUT SKIP(1) EDIT((CHR(J) DO J=1 TO PT_CHR)) ((PT_CHR)A(1));
  IF GOCONDITION=0 THEN DO;
    CONTROL='NOGO';
    PUT SKIP(3) LIST('*** INPUT SYNTAX IN ERROR ***');
  END;
  ELSE DO;
    CONTROL='OKGO';
    PUT SKIP(3) LIST('*** INPUT SYNTAX OK ***');
  END;
  OPEN FILE(SYNXTAB);
  PUT FILE(SYNXTAB) LIST(PT_NAME,FREE,PT_CHR,CONTROL,
      (TABLES(I) DO I=1 TO PT_NAME),
      (LISTS(I) DO I=1 TO FREE),
[...]
PUNCH_DECK: OPEN FILE(SYSPNCH);
  PUT FILE(SYSPNCH) LIST(PT_NAME,FREE,PT_CHR,CONTROL,
      (TABLES(I) DO I=1 TO PT_NAME),
      (LISTS(I) DO I=1 TO FREE),
[...]
  NAME(PT_NAME)=TEMPIDENT;
  LIST(PT_NAME)=0;
  LOC_NAME_TAB=PT_NAME;
  PT_NAME=PT_NAME+1;
ALREADY_IN: LAST_ALT=0;
  TEMPCONST=''; /* ZERO SEMANTIC ACTION NUM */
  RETURN;
ACTION(3): IF LAST_ALT¬=0 THEN RIGHT(LAST_ALT)=FREE;
  IF FIRST_ALT=0 THEN FIRST_ALT=FREE;
  RIGHT(FREE)=0;
  LAST_ALT=FREE;
  FREE=FREE+1;
  RETURN;
ACTION(4): IF TEMPCONST='' THEN CENTER(FREE)=0;
  ELSE CENTER(FREE)=TEMPCONST;
  IF LOC_NAME_TAB=0 THEN RIGHT(LAST_POSS)=FREE;
  ELSE DO; LIST(LOC_NAME_TAB)=FREE;
    LOC_NAME_TAB=0; END;
  LAST_POSS=FREE;
  LAST_ALT=0;
  LEFT
[...]
  RETURN;
ACTION(12): LEFT(FREE)=TEMPCONST;
  TEMPCONST='';
  CENTER(FREE)=8;
  RETURN;
ACTION(13): CENTER(FREE)=4; GO TO FINIS;
ACTION(14): CENTER(FREE)=5; GO TO FINIS;
ACTION(15): CENTER(FREE)=7;
FINIS: LEFT(FREE)=0;
  RETURN;
ACTION(16): GOCONDITION=0;
  PUT SKIP(1) EDIT('*** SYNTAX ERROR IN SEQ NUMBER ',CARDNUM,
      ' NEAR COLUMN ',CARDCOL,'.') (A,F(4),A,F(2),A);
  RETURN;
ACTION(17): FIRST=16;
  CALL ACT(FIRST);
[...]
  DCL LINE FIXED BINARY;
  LINE=INP/(RMARG-LMARG+1)+1;
[...]
  INP=(RMARG-LMARG+1)*LINE;
DEBLANK:
[...]
  INP=INP-1; GO TO DEBLANK; END;
[...]
  RETURN;
ACTION(20): LEFT(FREE)=PT_CHR;
  DO I=1 TO LENGTH(TEMPSTRING);
    CHR(PT_CHR)=SUBSTR(TEMPSTRING,I,1);
    PT_CHR=PT_CHR+1;
  END;
  CHR(PT_CHR)=DELIMITER;
  PT_CHR=PT_CHR+1;
  CENTER(FREE)=6;
  RETURN;
ACTION(21): RIGHT(LAST_POSS)=0; FIRST_ALT=0; RETURN;
ACTION(22): FREE POINTERS;
  RETURN;
ACTION(24): PUT SKIP(1) EDIT('*** ILLEGAL LEFT-HAND-SIDE OR ''::='' IN SEQ ',
      CARDNUM,': RECOVERY SUCCESSFUL.') (A,F(3),A);
  RETURN;
ACTION(25): INP=INP-3;
  OK=0;
  RETURN;
ACTION(31):
[...]
  =LMARX & LMARX>=1 THEN RETURN;
  ELSE DO;
    LMARX=1; RMARX=72;
    PUT SKIP EDIT('*** ERROR IN $CMARG CONTROL CARD.',
        ' DEFAULT COLS (1,72) ASSUMED.') (A,A);
  END;
  RETURN;
ACTION(33): IF CHAR(INP)='<' THEN DO;
    OK=0; /* FORCE EXIT FROM ... */
    RETURN; END;
  PUT SKIP EDIT('*** CONTROL CARD ERROR IN SEQ ',CARDNUM,
      '. DEFAULT PARAMETER(S) USED.') (A,F(3),A);
  INP=INP+1;
ENCORE: IF CHAR(INP)¬='<' & CHAR(INP)¬='$' THEN DO;
    INP=INP+1; GO TO ENCORE; END;
  RETURN;
ACTION(35): CTITLE=TEMPSTRING; RETURN;
ACTION(98): ALLOCATE POINTERS;
  RETURN;
ACTION(99): N1=600; N2=4000; N3=4000;
  ALLOCATE TABLES;
  ALLOCATE LISTS;
  ALLOCATE CHR;
  FIRST=98;
  CALL ACT(FIRST);
  RETURN;
CKLEFT: PROC; /* THIS PROCEDURE CHECKS FOR LEFT-RECURSION IN SYNXTAB */
  DCL DONEYET(PT_NAME) BIT(1),
      TOP FIXED BINARY INIT(0),
      STACK(100) FIXED BINARY INIT((100)0),
SOFAR ENTRY RE TURNSIF IXED BINARYI, PCN FIXED BINARY: 223 DONEYET-'O'B; 224 DO PCN«1 TO PT.NAHE; 225 CALL LFT.RECIPCN) ; 226 fcNO; 227 EXIT: RETURN; 228 LFT.REC: PROC(I) RECURSIVE; 229 OCL 1 SEARCH, 2 (L,C.R) FIXED BINARY, I FIXED BINARY; 230 IF UONEYETm^l'B THEN DO; 232 RETURN; 233 ENOS 23* IJONEYETI D-'l'B; 235 SEARCH»LISTS(LISTII»); 236 CALL PUSH! I ); 237 RECYCLE: IF LISTS! L) .CENTER<-3 THEN 238 IF SOFAR(LEFT(L))"0 THEN CALL LFT.RECUEFT ( L ) ) ; 240 IF R»0 THEN DO; 242 CALL POP(I); 243 RETURN; 244 END; 245 SEARCH-LISTSIRI ; 246 GU TO RECYCLE; 247 END LFT.REC; 248 PUSH: PRiJCdll 249 DCL I FIXED BINARY; 250 TOP»TOP*l; 251 IF T0P>100 THEN DO; 253 PUT SKIP LIST! 'STACK OVERFLOW); 254 PUT SKIP LIST(STACK); 255 GC TO EXIT; 256 END; 257 STACK! TOP»"I; 258 return; 259 END PUSH; 260 POP: PROCd); 261 DCL I FIXED BINARY; 262 IF TOP-0 THEN DO; 264 PUT SKIP LIST! 'STACK UNDERFLOW*); 265 GO TO EXIT) 266 END; 267 STACK! TOP»«0; 268 TOP-TOP-lI 269 RETURN; 270 END POP; 271 SOFAR: PROCU) FIXED BINARY; 272 DCL (I.J.K.L) FIXED BINARY; 273 FINUt 00 J«TOP TO 1 BY -I ; 60 UJ •^ ^ • • • r- < z X » ** ^ , r-« z • ^4 o 00 •* MM 00 X CO < ^» >s Of -J «» * o „* UJ LU UJ s* CL a: CO *•• 3 < *: O »- a: o UJ LL X < o UJ •• CL t- O -J ►- co an rj x — » a * o o UJ • • * 3 z UJ o * o c < z Q m *- ae z tmt »# X «•» K z 9*- H o t- ■• 3 UJ o co o m a H- a X il >— • »— o M Of K z _J Z H t- Q o UJ *-* UJ z >— • «— • X Q a #• c II f- H- UJ *- 0. >- «— 1 «M »— 1 M> 3 >- -5 Q 0- a. a. >-* * a u U. => G II o •— « z o z a >—* a O -J o u. a •• z < u. • * 3 •• 1- LU z ►=1 <• Q O H- a: U- UJ H- Z ii UJ < UJ (-0 a UJ _i aC u. O -1 00 ♦ • CO a a ►- u D O o < o z z V- O LU UJ ♦ ^^Oh»ooO'-"fvjf r tN*'u>goh-ooa s o«-"fNJ <\Jl u z UJ a ai oo • ft _ m J _ L - o o .. UJ —1 5 ♦ K Z eg A H> • ft o •—4 M * g * Z A UJ * l« *— • mm z >- u. 
OUTPUT SYNTAX TABLE

  1: IBNF_SPECS            4
  2: CONTROL_CARD         15
  3: SYNTAX               30
  4: PRODUCTION           39
  5: (GENERATED) 5        26
  6: LHS                  59
  7: EQU                  61
  8: DEF                  75
  9: (GENERATED) 9        35
 10: (GENERATED) 10       41
 11: (GENERATED) 11       51
 12: (GENERATED) 12       63
 13: POSSIBILITY          80
 14: (GENERATED) 14       73
 15: (GENERATED) 15       77
 16: ITEM                 85
 17: ACTION               96
 18: PHRASE_CLASS_NAME    98
 19: MODIFIER            100
 20: IMBEDDED_DEF        115
 21: SEMANTIC_TEST       113
 22: TERMINAL_SYMBOL     120
 23: (GENERATED) 23      108
 24: (GENERATED) 24      112

  1: 99 8 2
  2: 2 3 3
  3: 3
  4: 1
  5: 1 6 6
  6: 3 6 7
  7: 9 6 8
  8: 11 6 9
  9: 5 10
 10: 31 8 11
 11: 13 6 12
 12: 5 13
 13: 32 8 14
 14: 15 6
 15: 5 21
 16: 17 6 17
 17: 19 6 18
 18: 26 6 19
 19: 7 20
 20: 35 8
 21: 16 23
 22: 33 8
 23: 11
 24: 4 2 29
 25: 28 6
 26: 25 1 28
 27: 32 6
 28: 27 23
 29: 5
 30: 24
 31: 6 32
 32: 7 33
 33: 8 38
 34: 38 6
 35: 34 37
 36: 17 8
 37: 36
 38: 9
 39: 31 21 45
 40: 40 6
 41: 40 43
 42: 44 6
 43: 42
 44: 10
 45: 44 25 47
 46: 17 8
 47: 46
 48: 48 6 49
 49: 4 50
 50: 50 6
 51: 48 54
 52: 4 53
 53: 52 6
 54: 52 24 57
 55: 54 6 56
 56: 4
 57: 55 24
 58: 11
 59: 58 2
 60: 56 6
 61: 60 69
 62: 60 6
 63: 62 65
 64: 63 6
 65: 64 67
 66: 66 6
 67: 66
 68: 12
 69: 68 24
 70: 13 74
 71: 68 6 72
 72: 13
 73: 71
 74: 14 3
 75: 70
 76: 16
 77: 76 3
 78: 15 2 79
 79: 17 1
 80: 78 4
 81: 70 6 82
 82: 18 83
 83: 72 6 84
 84: 19
 85: 81 87
 86: 20
 87: 86 91
 88: 74 6 89
 89: 21 90
 90: 76 6
 91: 88 93
 92: 22

[remainder of the sample output illegible in the scan]

APPENDIX E

TACOS JOB CONTROL LANGUAGE AND FORM OF SEMANTIC SPECIFICATION

JCL FOR COMPILER GENERATION

//JOBLIB   DD DSN=USER.P2138.TACOS.MODULES,DISP=(OLD,PASS),UNIT=DISK,     X
//            VOL=SER=UIDCS6
//         EXEC PGM=BUILD
//SYSPNCH  DD DUMMY
//INTAB    DD DSN=USER.P2138.TACOS.TRLTAB,DISP=(OLD,KEEP),UNIT=DISK,      X
//            VOL=SER=UIDCS6
//SYNXTAB  DD DSN=USER.P2138.TACOS.SYNTAB,VOL=SER=UIDCS6,UNIT=DISK,       X
//            DISP=(NEW,KEEP),SPACE=(TRK,(1,1)),                           X
//            DCB=(BLKSIZE=3600,LRECL=80,RECFM=FB)
//SYSPRINT DD SYSOUT=A
//SYSIN    DD *
   < IBNF SYNTAX SPECIFICATION OF SOURCE LANGUAGE >
/*
//         EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=A
//SYSUT2   DD UNIT=DRUM,DSN=&ACTIONS,DISP=(NEW,PASS),                      X
//            DCB=(BLKSIZE=80),SPACE=(TRK,(5,5))
//SYSIN    DD DUMMY
//SYSUT1   DD *
   < ACTION ROUTINES SPECIFIED IN IBNF >
/*
//         EXEC PL1
//PL1.SYSIN DD UNIT=DISK,VOL=SER=UIDCS6,DSN=USER.P2138.TACOS.ACTPROC,      X
//            DISP=(OLD,KEEP)
//         DD DSN=&ACTIONS,DISP=(OLD,DELETE),UNIT=DRUM
//         EXEC LKEDPL1
//LKED.SYSLMOD DD DSN=USER.P2138.GENERATD.COMPILER,VOL=SER=UIDCS6,         X
//            DISP=(NEW,KEEP),SPACE=(TRK,(20,10,[illegible]))
//LKED.SYSIN DD *
   INCLUDE JOBLIB(PARSCAN)
   NAME MYCOMPLR(R)
/*

[form of semantic specification: page illegible in the scan]
LIST OF REFERENCES

[1] R. M. Graham, "Bounded Context Translation," Proc. Spring Joint Computer Conference, Spartan Books, Baltimore, Md., vol. 25, (1964), pp. 17-29.

[2] P. Naur, et al., "Revised Report on the Algorithmic Language ALGOL 60," Comm. ACM, 6, (Jan., 1963), pp. 1-17.

[3] R. W. Floyd, "A Descriptive Language for Symbol Manipulation," J. ACM, 8, (Oct., 1961), pp. 579-584.

[4] R. A. Brooker and D. Morris, "An Assembly Program for a Phrase Structure Language," Computing Journal, 3, (1960), p. 168.

[5] R. A. Brooker and D. Morris, "Some Proposals for the Realization of a Certain Assembly Program," Computing Journal, 2, (1961), p. 220.

[6] R. A. Brooker and D. Morris, "A General Translation Program for Phrase Structure Languages," J. ACM, 9, (Jan., 1962), pp. 1-10.

[7] E. T. Irons, "A Syntax Directed Compiler for ALGOL 60," Comm. ACM, 4, (Jan., 1961), pp. 51-55.

[8] S. Warshall and R. M. Shapiro, "A General-Purpose Table-Driven Compiler," Proc. Spring Joint Computer Conference, (1964), pp. 59-65.

[9] T. E. Cheatham, Jr. and K. Sattley, "Syntax-Directed Compiling," Proc. Spring Joint Computer Conference, (1964), pp. 31-57.

[10] C. W. Gear, (unpublished CS306 lecture notes, Spring, 1968), Department of Computer Science, University of Illinois, Urbana.

[11] N. Chomsky, "Formal Properties of Grammars," Handbook of Mathematical Psychology, vol. 2, John Wiley and Sons, Inc., (1963), pp. 323-418.

[12] H. R. G. Trout, "A BNF-Like Language for the Description of Syntax Directed Compilers," M. S. Thesis, Report No. 300, Department of Computer Science, University of Illinois, Urbana, (1969).
UNCLASSIFIED
Security Classification

DOCUMENT CONTROL DATA - R & D
(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

ORIGINATING ACTIVITY (Corporate author): Department of Computer Science, University of Illinois, Urbana, Illinois 61801
REPORT SECURITY CLASSIFICATION: UNCLASSIFIED
GROUP:
REPORT TITLE: TACOS: A TABLE DRIVEN COMPILER-COMPILER SYSTEM
DESCRIPTIVE NOTES (Type of report and inclusive dates): Research Report
AUTHOR(S) (First name, middle initial, last name): John Lawrence Gaffney, Jr.
REPORT DATE: June, 1969
TOTAL NO. OF PAGES: 67
NO. OF REFS: 12
CONTRACT OR GRANT NO.: 46-26-15-305
PROJECT NO.: USAF 30(602)4144
ORIGINATOR'S REPORT NUMBER(S): DCS Report No. 325
OTHER REPORT NO(S) (Any other numbers that may be assigned this report):
DISTRIBUTION STATEMENT: Qualified requesters may obtain copies of this report from DCS.
SUPPLEMENTARY NOTES: NONE
SPONSORING MILITARY ACTIVITY: Rome Air Development Center, Griffiss Air Force Base, Rome, New York 13440

ABSTRACT

This thesis is a description of and specification for TACOS, an interpretive compiler-compiler system employing a recursive-descent parsing algorithm. In its current implementation in PL/I on the IBM SYSTEM/360, a modified BNF grammar and PL/I semantic routines provide the specifications for compiler generation. The author has intended that TACOS be a general purpose compiler-generation system independent of implementation. To this end, the metalanguage and parsing algorithm are presented from a specification rather than an implementation point of view. In contrast, the semantics are regarded as too strongly tied to the implementation language to adhere to a general specification, and are, therefore, discussed in relation to the current PL/I implementation.

DD FORM 1473

UNCLASSIFIED
Security Classification

KEY WORDS

BNF (Backus Naur Form)
syntax-directed compiler
table driven parser
compiler-compiler