Digitized by the Internet Archive in 2013: http://archive.org/details/generationofdete304beal

Report No. 304

THE GENERATION OF A DETERMINISTIC PARSING ALGORITHM*

by

Alan James Beals

January 6, 1969

Department of Computer Science
University of Illinois
Urbana, Illinois 61801

* This work was supported in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center under Contract No. US AF 30(602)-4144 and submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, February, 1969.

ACKNOWLEDGEMENT

The author gratefully acknowledges the aid and encouragement of Professor Robert S. Northcote in both the development of the algorithm herein described and the content of the document. Thanks are also due to Mr. Franklin DeRemer and Mr. Jacques LaFrance for their help in developing the algorithm.

The author would like to express his appreciation of the Department of Computer Science of the University of Illinois and of the ILLIAC IV project for their financial support. Special thanks also to Mrs. Mildred Pape for her time and effort in typing the manuscript.
ABSTRACT

This paper describes an algorithm for the conversion of a grammar in the form of a set of BNF productions into a deterministic parsing algorithm as described by a set of modified Floyd productions. The algorithm is extended in such a way that it may easily become a part of a complete translator writing system and make use of the information available in the semantic part of such a system. The paper also includes a discussion of the implementation of the extended algorithm and describes potential related research.

TABLE OF CONTENTS

1. INTRODUCTION
2. THE ALGORITHM
    2.1 Notation
    2.2 Label Determination
    2.3 Descriptor Set Generation
    2.4 Descriptor to FPL Production Mapping
    2.5 Preclusion and Error Production
    2.6 Summary
3. CONTEXTUAL ANALYSIS
    3.1 Bar Symbols
    3.2 Bar Symbol Removal
    3.3 Nta and Ntb Groups
    3.4 Ntb (π, n) Groups
    3.5 Summary
4. COMBINED GROUPS
    4.1 Ct (m) Groups
    4.2 ch (m) Groups
    4.3 CNtb (m) Groups
    4.4 N̄*(m) as an Element of (k)
    4.5 Initial Revision of BNF Grammar
    4.6 Summary
5. SEMANTICS AND EMPTY BNF PRODUCTIONS
    5.1 Semantic Routines
    5.2 Semantic Conditions
    5.3 Empty BNF Productions
6. IMPLEMENTATION
    6.1 BNF Production Storage
    6.2 Grammar Revisions
    6.3 Label Determination
    6.4 Descriptor Set Generation
    6.5 Descriptor Set to FPL Production Mapping
    6.6 Preclusion Elimination
    6.7 Nta, Ntb (π, n), CNtb (m) Groups
7. OPTIMIZATION AND ERROR RECOVERY
    7.1 Stack Comparisons
    7.2 Variable Lookahead Length
    7.3 Nonterminal Symbol Expansion
    7.4 Pseudo Orders
    7.5 Error Recovery
8. CONCLUSION
    8.1 Precedence Systems
    8.2 Right Bounded Context
    8.3 Problems
    8.4 Potential Improvements
    8.5 Research Extensions
APPENDICES
    A. THE PSEUDO ORDERS
    B. OUTPUT OF THE BNF TO FPL CONVERSION PROGRAM ON THE SAMPLE LANGUAGE DEMALGOL
LIST OF REFERENCES

1.
INTRODUCTION

The algorithm which converts Backus Naur Form (BNF) productions to Floyd Production Language (FPL) productions and its implementation, which are described here, grew from the need for a procedure oriented language for the ILLIAC IV computer being designed by and built for the University of Illinois. Since the ILLIAC IV is a completely new concept in digital computers, the details of a language best suited to it and to its applications are relatively unknown. For this reason, the use of a compiler-compiler, or translator writing system (TWS), is preferable to writing a specific hard-coded program for the translation of a fixed language. It is anticipated that the use of a TWS will make the task of changing the language easier for the language designers whenever the need for such changes should appear. It is also hoped that the availability of a TWS will encourage users of the ILLIAC IV to modify this base language and create new languages more nearly suited to specific applications.

The goals of this TWS are the same as those of any TWS; namely, the building of a translator which is fast, occupies as little space as possible, and is capable of generating efficient code. A TWS must also be able to accept the grammars of a large class of languages. In addition, since we hope to encourage the use of the TWS by applications people who are often not familiar with language specifications, it is important that the language specification input to the TWS be in as simple a form as possible.

Consideration was given to several different parsing algorithms as potential bases of the system. Among these were the algorithms of McKeeman [1], Wirth and Weber [2], Ingerman [3], Brooker and Morris [4], and Trout [5]. Without precluding subsequent incorporation of some of the other methods, it was decided that some version of the production language (FPL) of Floyd [6], Evans [7], and Feldman [8] was potentially most likely to meet the first of the above criteria.
The input to such a system is, however, more complex than was desirable. Chomsky [9] type two (context free) productions were chosen as the most widely known and easily understood method of language syntax specification. The most commonly used notation for this type of production is the Backus-Naur Form (BNF). Algorithms for conversion of BNF to FPL by Earley [10] and DeRemer [11] were considered and that of DeRemer chosen as the more promising. This thesis is a description of an extension of that algorithm and the subalgorithms used in its implementation. Brief discussions of the properties of both the algorithm and its implementation are included where appropriate.

2. THE ALGORITHM

The algorithm described in this chapter is that which was proposed by F. L. DeRemer [11] at the University of Illinois. Subsequent chapters will describe the extensions necessary for its practical implementation. The notation used here will also be used in the following chapters.

2.1 Notation

We consider a language L to be defined by a phrase structure grammar

    G = (V_T, V_N, S, P)

where

    V_T = the set of terminal symbols of L, which will be represented by lower case Latin letters,
    V_N = a set of nonterminal symbols, which will be represented by upper case Latin letters,
    S ∈ V_N is the objective symbol,
    P = a numbered set of rules defining a language L specified by the user, together with Σ → ⊥ S ⊥.

Strings of symbols will be denoted by Greek letters. The rules defining L will be Chomsky type two productions and will be written in the form:

    N → α

Symbol X (X ∈ V = V_T ∪ V_N) is a head symbol of N if N = X, or N → X . . ., or there exists a subset of P of the form

    N → N_1 . . .
    N_1 → N_2 . . .
    . . .
    N_n → X . . .        n ≥ 1

The string to be parsed is assumed to be of the form ⊥ α ⊥ where α ∈ L(G), the set of strings defined by G, or α ∈ V_T*, the set of all strings over V_T.

The Chomsky type two productions will be referred to simply as productions, or for ease of notation, BNF productions.
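The head symbol relation defined above is the reflexive and transitive closure of "X is the first symbol of some right hand side of N". The following is a minimal sketch of one way to compute it, not the method of the implementation described later. The function name `head_symbols`, the representation of productions as `(lhs, rhs)` pairs, and the small expression grammar are all assumptions introduced here for illustration.

```python
def head_symbols(productions):
    """Return {N: set of head symbols of N} for every left side N.

    `productions` is a list of (lhs, rhs) pairs, rhs being a non-empty
    tuple of symbols. This representation is an assumption of the sketch.
    """
    heads = {}
    # N = X, and N -> X ... : every N is a head symbol of itself, and so
    # is the first symbol of each of its right hand sides.
    for lhs, rhs in productions:
        heads.setdefault(lhs, {lhs}).add(rhs[0])
    # Chains N -> N1 ..., N1 -> N2 ..., ..., Nn -> X ... : iterate until
    # no head symbol set grows any further.
    changed = True
    while changed:
        changed = False
        for hs in heads.values():
            for x in list(hs):
                extra = heads.get(x, set()) - hs
                if extra:
                    hs |= extra
                    changed = True
    return heads


# Hypothetical example grammar (not taken from the thesis).
GRAMMAR = [
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("i",)),
]
HEADS = head_symbols(GRAMMAR)
```

In this example grammar, T, F, "(" and "i" are all head symbols of E, because each begins some chain of productions that starts at E.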
The FPL statements will be referred to as productions (when the context makes clear which type of production is meant) or FPL productions. An FPL production will be written in the form:

    L1: β α | γ → N | * L2

The label L1 may or may not be present. The string β α is to be compared to the top of a stack. If β α is not identical to the top of the stack, processing is transferred to the beginning of the next production in sequence. If β α is identical to the top of the stack, the terminal symbol string γ is compared to the next symbols in the input stream. The lack of a match is treated in the same way as a failure to match the stack. Both β and γ may be empty, as will be further explained later. The → N indicates that α at the top of the stack is to be replaced by the single symbol N. This field may be empty. The * may or may not appear and, if present, indicates that the next input symbol is to be scanned into the top of the stack. The L2 is the label of the next production to be used.

A descriptor (π, n) associates a FPL production with a BNF production in the sense that the α in the FPL production corresponds to the first n ≥ 1 symbols on the right hand side (RHS) of BNF production π.

2.2 Label Determination

The FPL productions are grouped and a label is attached to the first production of each group. The labels are thus associated with an entire group rather than a single production. The first step of the algorithm is to determine the labels of all the FPL groups. Let X_n(π) be the n-th symbol on the RHS of BNF production π. There are three rules for determining which labels (groups) must be created.

    For each N ∈ V_N: label Nh exists if ∃ π, n : N = X_n(π), n > 1        (2.2.1)
    For each N ∈ V_N: label Nt exists if ∃ π, n : N = X_n(π), n > 0        (2.2.2)
    For each t ∈ V_T: label t(π, n) exists for each π, n : t = X_n(π), n > 1        (2.2.3)

2.3 Descriptor Set Generation

The next step is to determine the set of FPL descriptors,

    D = {(π_i, n_i) | i = 1, 2, . . ., k}

for each FPL group which must exist. The group labels are descriptive in the sense that the three types of labels generate descriptor sets in the following three different ways.

    D_Nh = {(π, 1) | X_1(π) ∈ V_T and the left side of π is a head symbol of N}        (2.3.1)
    D_Nt = {(π, n) | N = X_n(π), n > 0, all π}        (2.3.2)
    D_t(π, n) = {(π, n)}        (2.3.3)

2.4 Descriptor to FPL Production Mapping

For each group, the descriptors are then mapped into FPL productions. If α is a string of length n ≥ 1, the three mapping rules are:

    If production π is M → αN . . .        (2.4.1)
    then (π, n) maps into:  α | | * Nh

    If production π is M → αt . . .        (2.4.2)
    then (π, n) maps into:  α | | * t(π, n + 1)

    If production π is M → α        (2.4.3)
    then (π, n) maps into:  α | → M | Mt

In addition, two special groups must be generated. They are:

    START: Σh:   ⊥ | | * Sh
    ⊥(0, 3):     ⊥ S ⊥ | → Σ | Success Exit

2.5 Preclusion and Error Production

If, within any group, the stack comparison string [. . .]

    [. . .] If production π is M → αN . . .        (3.1.1)
    then (π, n) maps into:  α | ← N̄ | * Nh

where ← N̄ means push N̄ onto the stack.

The use of the bar symbol makes the lookback string β the same for all FPL productions, that is, one symbol which may be any bar symbol. As such it will not be noted in FPL productions from this point on. This device differentiates in all but one special case (associated with the removal of a bar symbol) between two productions which have stack comparison strings of different lengths.

3.2 Bar Symbol Removal

The bar symbol N̄ is inserted in the stack when it is known that nonterminal symbol N is a subgoal of the parse, and should, therefore, be removed when N has been recognized. It is implicit that this has just happened upon entry to the Nt group. If, however, N is explicitly left recursive (ELR), that is, there exist ELR BNF productions π_i of the form

    N → Nα

then the reduction is not complete until the corresponding FPL productions (i.e., those with descriptors (π_i, 1)) have been processed.
This type of FPL production will be referred to as ELR.

The Nt groups thus must be semi-ordered. The ELR productions must come first, followed by a bar symbol removal production, and then the rest of the productions of the group. The bar symbol removal production is of the form

    N̄ N | → N |

In this case α = N̄ N, β is empty (not any bar symbol), and the lack of a label for transfer indicates that the next production in sequence should be processed.

All productions of an Nt group have N as the last symbol of the stack comparison string. All of the ELR productions have stack comparison strings which are precisely the single symbol N. The bar symbol, which would otherwise prevent the preclusion of the longer strings, may be removed between the processing of the ELR productions and the processing of the non-ELR productions. Therefore, the ELR productions will preclude all other productions of the group. Lookahead strings must be generated for the ELR productions to eliminate this preclusion.

3.3 Nta and Ntb Groups

The non-ELR productions of an Nt group fall into two classes: those with descriptors (π, 1), called class A; and those with descriptors (π, n) with n > 1, called class B.

Just prior to a transfer to the Nt group the stack has N in the top position and a bar symbol in the second position. If the bar symbol is N̄, it has been put into the stack by a FPL production of the type:

    α | ← N̄ | * Nh

which by mapping rule 3.1.1 comes from the BNF production π of the form

    M → αN . . .

Therefore, the production in the Nt group which must apply (after ELR productions have been processed and the bar symbol removed) has the descriptor (π, n + 1) where n is the length of the non-empty string α. This production is in class B.

If, on the other hand, the bar symbol is not N̄, then the FPL production which must apply (after ELR processing) comes from a BNF production π of the form:

    M → Nα

and has the descriptor (π, 1). This is a class A production.
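The three kinds of descriptor in an Nt group (ELR, class A, class B) can be separated by a single pass over the BNF productions. The sketch below is illustrative only and assumes the same `(lhs, rhs)` pair representation and the same hypothetical expression grammar as an earlier example; the function name `classify_nt_descriptors` is not from the thesis. Productions are numbered from one in the order given.

```python
def classify_nt_descriptors(productions, n_sym):
    """Split the descriptors (pi, n) of the Nt group of `n_sym` into
    ELR, class A and class B lists.

    productions: list of (lhs, rhs) pairs, numbered 1, 2, ... in order.
    """
    elr, class_a, class_b = [], [], []
    for pi, (lhs, rhs) in enumerate(productions, start=1):
        for n, sym in enumerate(rhs, start=1):
            if sym != n_sym:
                continue
            if n == 1 and lhs == n_sym:
                elr.append((pi, 1))        # N -> N alpha
            elif n == 1:
                class_a.append((pi, 1))    # M -> N alpha, M different from N
            else:
                class_b.append((pi, n))    # M -> alpha N ..., alpha non-empty
    return elr, class_a, class_b


# Hypothetical example grammar (not taken from the thesis).
GRAMMAR = [
    ("E", ("E", "+", "T")),    # production 1
    ("E", ("T",)),             # production 2
    ("T", ("T", "*", "F")),    # production 3
    ("T", ("F",)),             # production 4
    ("F", ("(", "E", ")")),    # production 5
    ("F", ("i",)),             # production 6
]
T_CLASSES = classify_nt_descriptors(GRAMMAR, "T")
E_CLASSES = classify_nt_descriptors(GRAMMAR, "E")
```

For the symbol T of this grammar the sketch finds one descriptor of each kind: production 3 is ELR, production 2 is class A, and the occurrence of T as the third symbol of production 1 is class B.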
Thus, the bar symbol in the second position of the stack can be used as the parameter for a dynamic transfer to either a group labelled Nta, which consists of the ELR productions and the class A productions, or a group labelled Ntb, which consists of the ELR productions, the bar symbol removal production, and the class B productions. Since the bar symbol removal production cannot apply in the Nta group, it need not be included and ordering is unnecessary. The Ntb group is identical to the Nt group in every way, except that the class A productions have been deleted. It is possible for only class A or class B non-ELR productions to exist, in which case only Nta or Ntb groups, respectively, need be built.

3.4 Ntb (π, n) Groups

The Ntb groups can be further split. This is done by associating a descriptor with each bar symbol as it is pushed into the stack. If the FPL production which pushes N̄ into the stack has the descriptor (π, n), then the descriptor for the non-ELR Ntb group production which must eventually apply is (π, n + 1). Therefore, a separate group, labelled Ntb (π, n), is built for each class B production.

An Ntb (π, n) group contains the Nt ELR productions, followed by the bar symbol removal production, and the single production with descriptor (π, n).

The splitting of the Nt group into Nta and Ntb (π, n) groups completely eliminates the need for lookahead to prevent the preclusion of one class B production by another. Even more important is that if an ELR production needs several multisymbol lookahead strings (i.e., β, αb, αc, αd, . . .) to prevent it from precluding a particular non-ELR production but needs fewer, shorter lookahead strings (i.e., β, α) to prevent preclusion of all other ELR and non-ELR productions, then the larger set of lookaheads is applied only when necessary.

This splitting of the Nt groups and the associated dynamic transfers make it necessary to change the special START group to make it consistent.
It becomes:

    START: Σh:   ⊥ | ← S̄(0, 2) | * Sh

where the special BNF production Σ → ⊥ S ⊥ is assumed to be number zero.

3.5 Summary

With the extensions described above the various rules given in chapter two are changed to the following:

a) Label Determination:

    For each N ∈ V_N: label Nh exists if ∃ π, n : N = X_n(π), n > 1        (3.5.1)
    For each N ∈ V_N: label Nta exists if ∃ π : N = X_1(π), π not ELR        (3.5.2)
    For each N ∈ V_N: label Ntb (π, n) exists for each π, n : N = X_n(π), n > 1        (3.5.3)
    For each t ∈ V_T: label t(π, n) exists for each π, n : t = X_n(π), n > 1        (3.5.4)

b) Descriptor Set Generation:

    D_Nh = {(π, 1) | X_1(π) ∈ V_T and the LHS of production π is a head symbol of N}        (3.5.5)
    D_Nta = {(π, 1) | N = X_1(π)}        (3.5.6)
    D_Ntb(π, n) = {(π_i, 1) | N = X_1(π_i), π_i is ELR}
                  ∪ {bar symbol removal production}
                  ∪ {(π, n)}        (with ordering)        (3.5.7)
    D_t(π, n) = {(π, n)}        (3.5.8)

c) Mapping Descriptors to FPL Productions (α of length n ≥ 1):

    If production π is M → αN . . .        (3.5.9)
    then (π, n) maps into:  α | ← N̄(π, n + 1) | * Nh

    If production π is M → αt . . .        (3.5.10)
    then (π, n) maps into:  α | | * t(π, n + 1)

    If production π is M → α        (3.5.11)
    then (π, n) maps into:  α | → M | DMt

where, if K̄(π, n) is the bar symbol at stack level two, then

    K = M ⇒ DMt = Mtb (π, n)
    K ≠ M ⇒ DMt = Mta

4. COMBINED GROUPS

A further extension of the algorithm, which makes it possible to accept a larger class of grammars, is the combination of two or more FPL productions in a group into a single production. Two productions are combined (if possible) when the first precludes the second and the finite lookahead is not sufficient to differentiate between the two. Each production which is formed by such a combination creates the need for one or more additional groups.

4.1 Ct (m) Groups

The simplest combination possible is of two or more FPL productions formed by mapping rule 3.5.10.
If, within a group, the productions

    α | | * t(π_1, n)
    . . .
    α | | * t(π_k, n)

should appear, and if a differentiation by lookahead is not possible, then they can be combined into the single production

    α | | * Ct(m)        (4.1.1)

where this is the m-th such combination to have been made. Associated with the integer m is the set of labels

    ℓ_Ct(m) = {t(π_i, n) | i = 1, 2, . . ., k}        (4.1.2)

The descriptor set of Ct(m) is defined by:

    D_Ct(m) = ∪ D_q,  q ∈ ℓ_Ct(m)        (4.1.3)

Group Ct (m) is then precisely the union of the various groups which could have been transferred to had the combination not been made. The Ct (m) group generated above is:

    Ct (m):   αt | . . .
              . . .
              αt | . . .
              error

As is easily seen, the preclusion has not been eliminated, but simply moved to a different group. This group, however, has one more symbol in the stack comparison field and therefore the maximum length lookahead reaches one symbol further into the input. The net effect of a combination of this type is the implicit one symbol extension of the maximum lookahead capability of an implementation of the algorithm.

4.2 ch (m) Groups

Another possible combination is of productions formed by mapping rule 3.5.9 which cause bar symbols of different nonterminals to be pushed into the stack. This cannot be done under a certain condition, but it will be shown later in this chapter how this condition can be eliminated. If, within a group, the productions

    α | ← N̄_1(π_1, n) | * N_1h
    . . .
    α | ← N̄_k(π_k, n) | * N_kh

should appear, and if a differentiation by lookahead is not possible, then they can be combined into the single production

    α | ← (m) | * ch (m)        (4.2.1)

where this is the m-th such combination to have been made. The symbol (m) is called a metabar symbol and represents the set of bar symbols which could have been pushed into the stack by the various productions which were combined. That is:

    (m) = {N̄_1(π_1, n), . . ., N̄_k(π_k, n)}        (4.2.2)

The sets represented by the metabar symbols must be available when the FPL productions are used to parse an input string. There are two reasons for this. The bar symbol removal production

    N̄ N | → N |

must apply when stack level one contains N and stack level two contains N̄, or stack level two is (m) and N̄ ∈ (m).

The second use of these sets is in applying the dynamic transfer, DMt, defined in 3.5.11. The definition of DMt remains the same if K̄(π, n) is the bar symbol at stack level two. However, if the symbol at stack level two is (m), then DMt is defined as follows:

    M̄(π, n) ∈ (m) ⇒ DMt = Mtb (π, n)        (4.2.3)
    M̄(π, n) ∉ (m) ⇒ DMt = Mta

The metabar symbol (m) also satisfies the implicit one symbol lookback for any bar symbol.

The production formed by a combination of this type indicates a transfer to label ch (m). This type of group is necessary whenever such a combination is made. Its descriptor set is indirectly defined by (m) as follows:

    D_ch(m) = ∪ D_Nh,  N̄ ∈ (m)        (4.2.4)

Again the new group to be built is exactly the union of all the groups which could have been transferred to by the productions which were combined.

4.3 CNtb (m) Groups

A third type of combination is of productions formed by mapping rule 3.5.9 where the bar symbols to be pushed into the stack are all for the same nonterminal symbol. If, within a group, the productions

    α | ← N̄(π_1, n) | * Nh
    . . .
    α | ← N̄(π_k, n) | * Nh

should appear, and if a differentiation by lookahead is not possible, then they can be combined into the single production

    α | ← N̄*(m) | * Nh        (4.3.1)

where this is the m-th such combination to have been made. The special bar symbol N̄*(m) is the same as N̄ in terms of implicit lookback and bar symbol removal.
It does, however, call for a further definition of the dynamic transfer, DMt, of 3.5.11 and 4.2.3. This transfer remains the same under the previously defined stack level two conditions, but is extended when stack level two is K̄*(m) as follows:

    K = M ⇒ DMt = CMtb (m)        (4.3.2)
    K ≠ M ⇒ DMt = Mta

The label of a group which is built whenever a combination of this type is made is CMtb (m). Associated with the integer m is the set of labels:

    ℓ_CMtb(m) = {Mtb (π_i, n) | i = 1, 2, . . ., k}        (4.3.3)

The descriptor set for this new group is then defined by

    D_CMtb(m) = ∪ D_q,  q ∈ ℓ_CMtb(m)        (4.3.4)

In this case the productions which were combined each pushed into the stack a bar symbol which defined a dynamic transfer to an Mtb (π, n) group. This new group which must be formed is the union of those Mtb (π, n) groups. It must include the bar symbol removal production and be semi-ordered in the same way as an Mtb (π, n) group.

4.4 N̄*(m) as an Element of (k)

An extension of the devices of sections 4.2 and 4.3 allows the combination of any set of productions formed by mapping rule 3.5.9. If the bar symbols to be pushed into the stack by the productions to be combined are mixed, in the sense that some are of different nonterminal symbols and some are of the same nonterminal symbols, then the procedures of the previous two sections do not apply. In this case the procedure of section 4.3 is applied to whichever sets of productions qualify, and the procedure of section 4.2 is applied to the resulting set of productions.

Two of the rules of section 4.2 must be amended in this case because metabar symbol (k) now defines a set which contains both bar symbols, N̄(π, n), and special bar symbols, N̄*(m). First, the part of the definition of DMt of 4.2.3 must be changed.
It becomes:

    If the symbol at stack level two is (k), then        (4.4.1)
        M̄(π, n) ∈ (k) ⇒ DMt = Mtb (π, n)
        M̄*(m) ∈ (k) ⇒ DMt = CMtb (m)
        M̄(π, n) ∉ (k) and M̄*(m) ∉ (k) ⇒ DMt = Mta

Rule 4.2.4, which defines the descriptor set for ch (k), must also be changed to read as follows:

    D_ch(k) = ( ∪ D_Nh,  N̄ ∈ (k) ) ∪ ( ∪ D_Nh,  N̄* ∈ (k) )        (4.4.2)

4.5 Initial Revision of BNF Grammar

Section 4.2 made reference to a condition which prevented the combination of productions as described in that section. This occurs when the nonterminal symbols associated with the bar symbols to be pushed into the stack by two of the productions are such that one is a head symbol of the other. That is, when N̄(π_i, n) and M̄(π_j, n) would be elements of (m) and N is a head symbol of M. This will always cause a dynamic transfer to Ntb (π_i, n) upon recognition of N. This ignores completely the possibility that this particular occurrence of N is the beginning of a substring which will eventually reduce to M (in which case the transfer should be to Nta).

The problem can be overcome by a simple revision of the BNF grammar before any BNF production to FPL conversion is done. This involves a search of all the BNF productions for pairs of the form:

    A → αN . . .
    B → αX . . .

where A may or may not be the same as B, α is a string of length n ≥ 1, X ∈ V = V_T ∪ V_N, and X is a head symbol of N. This pair is then replaced by

    A → αN . . .
    B → αQ . . .        (Q is a new nonterminal symbol.)

and the production

    Q → X

is added to the set of BNF productions.

This revision of the BNF grammar does not change the language being specified but does keep the previously described condition from occurring and preventing the combination of productions when necessary. Allowing X to be a terminal symbol in the revision also makes certain that there will never be preclusion involving FPL productions generated by both mapping rule 3.5.9 and mapping rule 3.5.10.
Thus, the only case where combination of productions cannot be used to overcome preclusion is when one or more of the FPL productions involved has been generated by mapping rule 3.5.11.

4.6 Summary

The total algorithm is now extended to include an initial BNF grammar revision. In addition, preclusions which cannot be resolved by lookahead and do not involve an FPL production which is generated by mapping rule 3.5.11 may be eliminated by combining productions and creating new groups. These combination procedures use the new special symbols (m) and N̄*(m). The definition of the dynamic transfer label, DMt, is extended to:

    If stack level two is K̄(π, n) then        (4.6.1)
        K = M ⇒ DMt = Mtb (π, n)
        K ≠ M ⇒ DMt = Mta

    If stack level two is (m) then        (4.6.2)
        M̄(π, n) ∈ (m) ⇒ DMt = Mtb (π, n)
        M̄*(k) ∈ (m) ⇒ DMt = CMtb (k)
        M̄(π, n) ∉ (m) and M̄*(k) ∉ (m) ⇒ DMt = Mta

    If stack level two is K̄*(m) then        (4.6.3)
        K = M ⇒ DMt = CMtb (m)
        K ≠ M ⇒ DMt = Mta

5. SEMANTICS AND EMPTY BNF PRODUCTIONS

The algorithm described in chapters one through four has been implemented on a Burroughs B5500 computer at the University of Illinois. The input to the program is the grammar of a language in the form of a set of modified BNF productions. The output is a stream of pseudo orders corresponding to the FPL productions generated, and a set of tables, one of which contains the definitions of the metabar symbols.

5.1 Semantic Routines

The BNF productions include two different applications of semantics. The semantics are written separately as a set of numbered routines which are accessible to the parser interpreter which causes the pseudo orders to be executed. The semantic routines are associated with the BNF productions by placing the appropriate number (preceded by #) anywhere on the RHS of a production, except at the beginning. For example:

    1) <A> = a <B> #1
    2) <A> = a #2 <C>
    3) <A> = a #3 <D> #1

would all be valid input productions.
Since the total set of FPL productions has at least one production for each possible descriptor, the semantic routines can be executed precisely when the parse reaches the point indicated in the BNF grammar. This requires an extra field in the FPL productions which applies in the same way for all of the mapping rules. The general FPL production becomes

    L1: β α | γ → N | m * L2

where m is an integer referring to a semantic routine which is to be called immediately upon recognition of β α in the stack and γ in the input stream, and before any stack manipulation is done. All other fields of the production are as previously described. The mapping rules are extended to include the setting of this semantic action field when the descriptor being mapped is (π, n) and #m follows symbol n of production π.

Each stack position, in addition to containing a syntactic symbol, also contains a field which the language designer may use (through semantic routines) for information associated with that particular syntactic symbol. The semantic routines are also used for many other things, such as building identifier tables and generating object or intermediate language code.

The use of semantic routines does place one restriction on the algorithm. A set of FPL productions which should, according to the procedures described in chapter four, be combined, may not be combined if they have different semantic actions. This is because the appropriate semantic routine must be executed before continuing the parse. For example, the Ah group, from the above BNF productions, would be

    Ah:   a | ← B̄(1, 2) |   * Bh
          a | ← C̄(1, 2) | 2 * Ch
          a | ← D̄(1, 2) | 3 * Dh
          error

The preclusion involved here must be overcome by lookahead or not at all.
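The restriction can be stated as a simple test on the candidate productions. The following is a minimal sketch under assumptions: the record type `FplProduction`, its field names, and the function `combinable` are hypothetical and are not part of the conversion program. The sketch checks only the two conditions named in the text, an identical stack comparison string and an identical semantic action field.

```python
from collections import namedtuple

# Hypothetical record for one FPL production of a group: the stack
# comparison string, the bar symbol pushed, the semantic routine number
# (None when the field is empty) and the label transferred to.
FplProduction = namedtuple("FplProduction", "stack push action target")


def combinable(productions):
    """True only if every production has the same stack comparison
    string and the same semantic action field."""
    stacks = {p.stack for p in productions}
    actions = {p.action for p in productions}
    return len(stacks) == 1 and len(actions) == 1


# The group of the example above: identical stack strings, but the
# semantic action fields differ (empty, 2 and 3).
EXAMPLE_GROUP = [
    FplProduction(("a",), "B", None, "Bh"),
    FplProduction(("a",), "C", 2, "Ch"),
    FplProduction(("a",), "D", 3, "Dh"),
]

# A hypothetical group with no semantic actions at this point.
PLAIN_GROUP = [
    FplProduction(("a",), "B", None, "Bh"),
    FplProduction(("a",), "C", None, "Ch"),
]

EXAMPLE_OK = combinable(EXAMPLE_GROUP)
PLAIN_OK = combinable(PLAIN_GROUP)
```

The example group is rejected because its three semantic action fields differ, whereas the second group, which differs only in the bar symbol pushed, would remain a candidate for the combination of section 4.2.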
This restriction has little practical effect on the capabilities of the system since, in actual use of the system, the vast majority of semantic routine designations are at the end of RHS's (i.e., semantic actions to be performed upon recognition of complete constructs).

5.2 Semantic Conditions

A second use of semantic routines allows the language designer to affect the parse by differentiating, with semantics, between otherwise identical syntactic symbols. Semantic routines used for this purpose are called semantic conditions and are denoted in both BNF and FPL productions by <#m> where m is an integer and refers to semantic routine number m. These semantic conditions are associated with a particular use of a symbol (either terminal or nonterminal) by inserting them into the BNF productions immediately following that symbol on the appropriate RHS. The system considers the semantic condition as a permanent part of the symbol which precedes it and, in fact, treats that symbol as being syntactically different from the same symbol with a different semantic condition, or with no semantic condition.

The following example illustrates the use of semantic conditions. Assume the following productions are part of a BNF grammar.

    <A> → a <#1>
    <A> → a <#2>
    <A> → a

The Ah group then would be

    Ah:   a <#1> | → A | DAt
          a <#2> | → A | DAt
          a      | → A | DAt
          error

No preclusion exists since the three single symbol stack comparison strings are considered by the system as being different symbols.

The semantic routines which are used as semantic conditions may do anything that any other semantic routine may do. They must, however, also set a fixed boolean variable in the skeleton program either true or false. The symbol X <#m> in the stack comparison string of an FPL production is processed by first comparing the symbol X with the stack in the normal manner and then executing semantic routine number m.
If the value returned is true, then the symbol is considered to have matched the stack and processing of that production continues. If it is false, the stack comparison has failed and processing is transferred to the beginning of the next production in sequence.

The use of semantic conditions increases drastically the capabilities of the system in that they can be used in conjunction with normal semantic routines to refer back to every step of the parse which has preceded their execution. They must be used with great care, however, since the symbol a placed ahead of a <#n> is a preclusion which is not recognized by the system. In addition, a <#n> placed ahead of a <#m> may also be a preclusion not recognized by the system if the user has failed to effectively differentiate between the two cases.

5.3 Empty BNF Productions

An empty BNF production is

    N → λ

where λ is the null string. The use of the empty production is unnecessary in specifying a language. It is, however, useful in that it can make the specification of a grammar easier for the language designer. Since this is one of the primary goals of this system, it is allowed. The algorithm, as described, does not handle the empty production, so an initial pass is made which completely removes all empty productions by a straightforward back substitution. (It is possible to completely remove them because of the form of the production Σ → ⊥ S ⊥.)

This may affect semantic routines, as in the following example:

    A → aBc #1
    B → λ
    B → b

which becomes

    A → aBc #1
    A → ac #1
    B → b

Semantic routine number one may refer to stack level three expecting that it is the level of symbol a. Whenever such an effect is possible (i.e., whenever the number of symbols preceding a semantic routine call is reduced) a message is printed to that effect and processing is continued with the assumption that the semantic routine is not affected. It becomes the user's responsibility to rewrite the semantic routine if necessary.
The possible effect on semantic routines is not a serious problem since, in most cases, the syntactic part of the language is written and processed prior to the actual writing of the semantics.

6. IMPLEMENTATION

This chapter describes the more interesting features of the implementation. Emphasis is given to those parts of the algorithm 1) which this implementation may modify slightly, or 2) for which the method of implementation is not relatively obvious.

6.1 BNF Production Storage

The BNF productions are first converted into a numerical form and the RHS's are stored in a logical one dimensional array called PROTAB. Each RHS symbol uses one word of the array with an extra word for semantic conditions. Each word is broken into several fields which contain the following information:

    1) a two part numerical representation of the symbol
    2) which (if any) semantic routine call follows this symbol
    3) the number of words to the beginning of the next RHS
    4) whether this symbol is semantically conditioned
    5) whether this is the leftmost symbol of an ELR production
    6) the nonterminal symbol which is the left side of this production.

All of the symbols used in the algorithm are separated into the following types:

    nonterminal symbols
    terminal symbols
    bar symbols
    metabar symbols
    special bar symbols
    semantic conditions
    identifiers
    numbers
    strings

The semantic conditions are not actually symbols but the implementation, in many ways, treats them as if they are. Identifiers, numbers, and strings are specified in the BNF grammar by <*I>, <*N>, and <*S>, respectively. These are considered by the system to be three different terminal symbols. These nine different classes of symbols are each given a unique type number.

The second part of the numerical representation of the first six classes of symbols is a number corresponding to a particular symbol or semantic routine within that class. Identifiers, numbers, and strings have no such entries.
The scan procedure of the skeleton program recognizes these as entities and uses the entry field in the stack as a pointer to the address in the ECD symbol table of the particular identifier, number, or string recognized. This pointer then can be used by the semantic routines.

Special (or reserved) words of the language are specified in the syntax as alphanumeric character strings (e.g., #BEGIN) and each is given a separate entry number within the class of terminal symbols. One of the tables output by the conversion program is a list of these special words and their associated entry numbers for use by the scan procedure.

The above described method of storing the BNF productions allows the conversion routine almost immediate access to most of the information it needs, while at the same time using as little space as possible.

During the course of this conversion certain checks are made of the syntax. Among these are tests that all nonterminal symbols (except Z) appear on both right and left sides of BNF productions, and a check that there is at least one string of terminal symbols which will reduce to each nonterminal symbol.

6.2 Grammar Revisions

During the course of the BNF production storage, the nonterminal symbols which occur on the left side of empty productions are stored. This information is used to eliminate all empty productions in the manner described in section 5.3.

At this point, two dimensional boolean arrays which mark the nonterminal head symbols (NTHS), the terminal head symbols (THS), and the nonterminal tail symbols (NTTS, defined analogously to head symbols) of each nonterminal are filled. For entry into the THS array the special terminal symbols identifier, number, and string are put at the end of the other terminal symbols. These tables are initialized as follows:

    If ∃ a production M -> N ...  then NTHS [M, N] = true else false
    If ∃ a production M -> t ...  then THS [M, t] = true else false
    If ∃ a production M -> ... N  then NTTS [M, N] = true else false

where a symbol used as a subscript refers to the entry field for that symbol (except identifier, number and string). The tables are then filled by iteratively applying the principle that if X is a head (tail) symbol of N and N is a head (tail) symbol of M, then X is a head (tail) symbol of M. These tables are completed by setting NTHS [N, N] and NTTS [N, N] equal to true for all nonterminal symbols N.

At this point the RHS's are compared in pairs and all necessary grammar revisions are made as described in section 4.5.

The BNF grammar is now in a form acceptable to the conversion algorithm and the NTHS and NTTS tables are available for use when applicable.

Before conversion actually begins, a table of pointers to each RHS occurrence of each nonterminal symbol, and a table of pointers to RHS beginnings corresponding to the use of each nonterminal symbol as a left side are constructed. These are used to eliminate excessive scanning of the entire array of RHS symbols.

6.3 Label Determination

Very little is necessary for label determination. All nonterminal symbols require an Nt group which is separated into Nta and Ntb (π, n) groups in a manner which will be further discussed later. A single scan through the BNF production table (PROTAB) is sufficient to determine which Nh groups are necessary.

No label determination is done for the t (π, n) groups. Instead, when the conversion of an FPL production to pseudo orders reaches a label for transfer to a t (π, n) group, the single FPL production of that group is generated and the pseudo orders for it are placed directly into the pseudo order stream. This saves unnecessary transfers when the pseudo orders are processed. The last production in a sequence of such generations is followed by the error pseudo order.

As the FPL productions are generated in the Nta, Ntb (π, n), and Nh groups the need for Ct (π, n), ch (m), and CNtb (m) groups arises.
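The filling of the NTHS, THS, and NTTS tables described in section 6.2 can be sketched as follows. This is only an illustration under stated assumptions: the function name `head_tail_tables` is hypothetical, Python sets stand in for the two dimensional boolean arrays, productions are `(left side, tuple of RHS symbols)` pairs, and empty productions are assumed to have been removed already.

```python
def head_tail_tables(productions, nonterminals):
    """Build head and tail symbol tables (hypothetical sketch).

    Returns three dicts mapping each nonterminal M to a set:
    nths[M] (nonterminal head symbols), ths[M] (terminal head symbols),
    ntts[M] (nonterminal tail symbols).
    """
    nths = {m: set() for m in nonterminals}
    ths = {m: set() for m in nonterminals}
    ntts = {m: set() for m in nonterminals}

    # Initialization from the first and last symbol of every RHS.
    for lhs, rhs in productions:
        first, last = rhs[0], rhs[-1]
        if first in nonterminals:
            nths[lhs].add(first)
        else:
            ths[lhs].add(first)
        if last in nonterminals:
            ntts[lhs].add(last)

    # Iterate: if X is a head (tail) symbol of N and N is a head (tail)
    # symbol of M, then X is a head (tail) symbol of M.
    changed = True
    while changed:
        changed = False
        for m in nonterminals:
            for n in list(nths[m]):
                for table in (nths, ths):
                    extra = table[n] - table[m]
                    if extra:
                        table[m] |= extra
                        changed = True
            for n in list(ntts[m]):
                extra = ntts[n] - ntts[m]
                if extra:
                    ntts[m] |= extra
                    changed = True

    # Completion: every nonterminal is its own head and tail symbol.
    for n in nonterminals:
        nths[n].add(n)
        ntts[n].add(n)
    return nths, ths, ntts


# A small arithmetic grammar, chosen only for illustration.
nts = {"E", "T", "F"}
grammar = [
    ("E", ("E", "+", "T")), ("E", ("T",)),
    ("T", ("T", "*", "F")), ("T", ("F",)),
    ("F", ("(", "E", ")")), ("F", ("id",)),
]
nths, ths, ntts = head_tail_tables(grammar, nts)
print(sorted(nths["E"]), sorted(ths["E"]), sorted(ntts["E"]))
```

A test such as `"M" in nths["N"]` then corresponds to the lookup NTHS [N, M] = true used in descriptor set generation.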
Whenever a combined group is needed, its name is simply inserted into a list, along with a pointer to the information necessary to build the group. These groups are created one at a time after the Nta, Ntb (π, n), and Nh groups have been built. The generation of a combined group may add to the list of combined groups which are needed.

6.4 Descriptor Set Generation

A descriptor will continue to be denoted by (π, n). It is, however, actually a single number which points at a word in PROTAB. The descriptor set for an Nt group is then precisely the entries for N in the table of pointers to RHS occurrences.

The descriptor set for an Nh group is generated by using information available in each word to step through the first symbol of each RHS in PROTAB. If this first symbol X is a terminal symbol and if the left side M of this production is such that NTHS [N, M] = true, then the pointer to that first symbol is a descriptor in the Nh group. The descriptor sets for the Nh groups are saved, even after the groups have been built.

The descriptor denoted (π, n + 1) is (π, n) + 1 if the symbol at (π, n) is not semantically conditioned. It is (π, n) + 2 if the symbol at (π, n) is semantically conditioned.

The single descriptor for the t (π, n) group is available when the need for that group arises. The descriptor sets for Ct (π, n), ch (m), and CNtb (m) groups are generated by a straightforward application of 4.1.3, 4.2.4, and 4.3.4. The label sets of the Ct (m) and CNt (m) groups are the information associated with names, as discussed in section 6.3.

6.5 Descriptor Set to FPL Production Mapping

The mapping of descriptors to FPL productions is very simple. The descriptor is first increased by one if the symbol pointed at is semantically conditioned. The stack comparison field of the FPL production is thirty-two (32) words long and is filled, right justified, directly from PROTAB using the descriptor.
The next word is the semantic routine to be called and is filled from the appropriate field of the last word in the stack comparison. The next word is minus one, one, or zero corresponding to <-, ->, or blank, respectively. The following word is the bar symbol to be pushed, the nonterminal to which the stack is to be reduced, or zero. The next word is one or zero for scan or no scan, respectively. This is followed by the two part numerical label name. These labels are formed similarly to the symbols in that there are five classes or types. The entry fields are filled as follows:

    Type        Entry
    Ct (m)      m
    ch (m)      m
    Nh          N
    t (π, n)    (π, n)
    DNt         N

The last word of a numerical FPL production is the revised descriptor from which it came. Much of the information in the production is redundant. This format is used because it simplifies the conversion into pseudo orders, a list of which is given in Appendix A.

6.6 Preclusion Elimination

The splitting and ordering of the Nt group is more easily described if the elimination of preclusions is explained first. Assume for this explanation that no ordering or splitting is necessary.

All of the numerical FPL productions for a particular group are generated and placed, one per line, into a table for processing. The first production is compared, one at a time, with the productions which follow it. If the stack comparison strings of the two productions are identical, then a preclusion exists and lookahead generation is necessary. The two productions are associated with two identical buffers. Each line of these buffers has two words of control information plus n additional words, where n is the maximum number of symbols of lookahead. The two productions are expanded into the first lines of the two buffers as follows:

    word one: the entry part of the left side of the BNF production which formed this FPL production.
    word two: zero

The rest of the line is filled, left justified, by the symbols of this RHS which are not in the stack comparison field (as many as will fit).

Figure 1 is the flow chart for the generation of lookahead with the variables used defined as follows:

    I, J, K are loop indices
    LK1 is the buffer associated with the first FPL production
    LK2 is the buffer associated with the second FPL production
    LL is the maximum lookahead length
    LK1PT is a pointer to the first empty line of LK1 (initially equal to two)
    LK2PT is a pointer to the first empty line of LK2 (initially equal to two)

The procedure GETSYM extends the string in line I (J) of buffer LK1 (LK2) with pointer LK1PT (LK2PT) by finding all PROTAB occurrences of nonterminal symbols which are not the last symbols [...]

[Figure 1. Flow chart for the generation of lookahead.]

[...]

    N_p -> α t ...
    N_k -> α t ...

The transferring FPL production is:

    α | | * Ct (m)

The Ct (m) group contains the FPL productions of the form

    α t ...

or combinations thereof. Again, only a check of the top of the stack for the symbol t is necessary.

All transfers to Nh or ch (m) groups are made from FPL productions of the form:

    α | <- Q | * L

where Q is N (π, n), (m), or N* (m) and L is Nh or ch (m). Therefore, stack level two must contain a bar symbol upon entry to an Nh or ch (m) group. Since all the FPL productions in these groups are of the form [...] the implicit lookback for a bar symbol is unnecessary.

All transfers to Nta groups are made from FPL productions of the form:

    α | -> N | DNt

After this reduction stack level two is some bar symbol and stack level one is N. All FPL productions of an Nta group are of the form:

    N ...

Thus no stack comparison is necessary and lookahead alone determines which FPL production applies. Using the implementation described in chapter six, the last production of an Nta group will have no lookahead and must apply.
Therefore, no error production is necessary after this group.

The configuration of the stack upon entry to an Ntb (π, n) or CNtb (m) group is the same as upon entry to an Nta group. These groups begin with the ELR productions of the form:

    N ...

which require no stack comparison. The dynamic transfer DNt is defined in such a way that the bar symbol removal production must apply if the transfer is to an Ntb (π, n) or CNt (m) group, so no stack comparison is necessary for it. This leaves only the non-ELR production(s). If the bar symbol is N or N* or (m), where N or N* ∈ (m), then it was pushed into the stack by

    α | <- N or N* | * Nh

or

    α | <- (m) | * ch (m)

In either case, this came from a BNF production of the form M -> α N ... (and possibly others of the form M -> α K ..., where [...] N). The transfer DNt, in each case, is to Ntb (π, n) or CNtb (m) which has as non-ELR production(s):

    α N [...]

which applies with no stack comparison. Again, only lookahead differentiates between the productions and no error production is necessary.

7.2 Variable Lookahead Length

The lookahead generation was described in section 6.6 as being done once for a given maximum lookahead length. A vast majority of groups for any reasonable grammar (based on actual use of a translator using this algorithm) can be built with a lookahead of zero or one symbols. All of the nontrivial grammars which have been input to the TWS have had a few groups which needed up to three symbols of lookahead. A one symbol lookahead generation, which assumes a possible maximum of three symbols, is very inefficient since extra (unnecessary) symbols must be retained at each step.

To overcome this inefficiency the algorithm without combination of productions is first applied with an assumed maximum lookahead length of one. The names of groups which cannot be built in this first pass are stored in a table.
The algorithm with combination of productions and a maximum lookahead length specified by the language designer (up to fifteen, with a default of three) is then applied. This two pass approach has proved, in practice, to increase efficiency significantly.

7.3 Nonterminal Symbol Expansion

The flow chart for lookahead generation (Figure 1) indicates a looping on the procedure NT2T. The purpose of this looping is to get all n (or fewer) symbol expansions which begin with a terminal symbol of a particular nonterminal. In this case n is the number of symbols which, beginning at word K, would fill line I of LK1. Such expansion of a nonterminal symbol may occur several times in the building of the various groups.

A special procedure has been added which uses the principle of the loop through NT2T to generate the desired expansion and save the results, or simply returns the strings directly if the expansion has been done previously. The use of this procedure replaces the two loops in Figure 1. This also significantly improves efficiency.

7.4 Pseudo Orders

The generation of pseudo orders will not be discussed in any detail since it is relatively obvious what orders are necessary. Three things, however, should be noted.

First, two different tests for terminal symbols in the stack or lookahead must exist. One, for identifiers, numbers, and strings, tests only the type, and another tests both type and entry.

Second, it was shown in section 7.1 that the entire stack comparison is unnecessary. It is, however, necessary, for productions of the form:

    α | -> N | DNt

to retain the length of the string α, because that is the number of symbols which must be replaced by N at the top of the stack.

Third, a relatively complex set of pseudo orders combined with proper ordering of the lookahead strings for a given FPL production can reduce the number of symbols which must actually be compared. A simple example is the following.
Assume the lookahead strings for a production are:

    a b c
    e f g
    a b d

If these are reordered to

    a b c
    a b d
    e f g

it is easily seen that two separate tests for ab are unnecessary.

7.5 Error Recovery

Section 7.1 shows that the bar symbols are not necessary for lookback. They continue, however, to serve two very useful functions. The first, which has already been discussed, is to act as the parameter for the dynamic transfer, DNt. Their second, and equally important, function is their use in error recovery. Error recovery is not a necessary part of the algorithm, but a translator based on the algorithm must have some form of error recovery if it is to be practically useful.

One of the tables output by the TWS for the skeleton program is a set of lists of all terminal symbols which may follow each nonterminal symbol. These lists are generated by properly initializing LK1 and then initiating a one symbol lookahead generation.

A syntactic error is found in the input string by execution of the error production at the end of some group. When this occurs, the stack is searched for the uppermost bar symbol. If this symbol is N or N*, the input stream is scanned until one of the terminal symbols t which may follow N is found. The symbols between the N or N* in the stack and the t in the input stream are deleted. The nonterminal symbol N is then pushed into the stack and the dynamic transfer to DNt is made.

If the uppermost bar symbol is (m), then the input stream is scanned for a terminal symbol t which may follow any of the nonterminal symbols M which are such that M ∈ (m) or M* ∈ (m). If t may follow only one of these, the procedure described above is applied. If it may follow several, then one is chosen arbitrarily and the procedure is applied.

This error recovery can be extremely effective or extremely ineffective depending on the state of the parse when an error is encountered. The worst case occurs when the uppermost bar symbol is S.
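The recovery procedure of section 7.5 can be sketched as follows. This is a simplified illustration, not the skeleton program's code: the `Bar` class, the function name `recover`, the list representation of the stack and input stream, and the `follow` dictionary (standing in for the TWS table of terminals which may follow each nonterminal) are all assumptions. A bar symbol N or N* is modeled as a `Bar` with one candidate, a combined bar symbol (m) as a `Bar` with several, and the arbitrary choice among several candidates is taken to be the first one listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bar:
    """A bar symbol; `candidates` are the nonterminals it may stand for."""
    candidates: tuple


def recover(stack, tokens, pos, follow):
    """Hypothetical sketch of bar symbol error recovery.

    stack  : list of symbols, bottom first, containing Bar objects
    tokens : list of input terminal symbols
    pos    : index of the next unread token
    follow : dict mapping a nonterminal to the terminals that may follow it
    Returns (new_stack, new_pos, n), where n is the nonterminal pushed and
    used for the dynamic transfer DNt, or None if recovery is impossible.
    """
    bar_positions = [i for i, s in enumerate(stack) if isinstance(s, Bar)]
    if not bar_positions:
        return None
    top = bar_positions[-1]          # uppermost bar symbol
    bar = stack[top]

    # Scan the input for a terminal that may follow one of the candidates.
    for j in range(pos, len(tokens)):
        for n in bar.candidates:     # first match: the arbitrary choice
            if tokens[j] in follow.get(n, ()):
                # Delete everything above the bar symbol and all skipped
                # input, then push the nonterminal above the bar symbol.
                return stack[: top + 1] + [n], j, n
    return None                      # input exhausted


stack = ["BOTTOM", Bar(("S",)), "begin", Bar(("E",)), "id", "+"]
tokens = ["*", ")", ";", "end"]
follow = {"E": {";", ")"}, "S": {"end"}}
print(recover(stack, tokens, 0, follow))
```

After recovery the top of the stack holds N directly above its bar symbol, which is the configuration expected on entry to an Nta group.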
If S is not recursive, recovery at S causes the entire input string to be deleted. Experience indicates, however, that the error recovery just described is, in general, about as effective as that of many hand coded compilers, particularly for languages, such as ALGOL, which are specified recursively.

8. CONCLUSION

A TWS based on the algorithm described in the preceding chapters has been fully implemented. In general, it satisfies the goals set for it in chapter 1. This chapter briefly compares it to other parsing algorithms and discusses the subclass of grammars which it will accept. There are also brief discussions of problems yet to be solved, of potential improvements, and of areas of further research.

8.1 Precedence Systems

The parsing algorithm can be compared to the precedence systems of McKeeman [1], and Wirth and Weber [2] in the sense that one of three stack actions is specified by each FPL production. The first of these is to do nothing, which is equivalent to a precedence relation of =. The second is to push a bar symbol into the stack. This is done when a precedence relation of <· exists and occurs at the top of the stack, rather than down in the stack as in the techniques of McKeeman, and Wirth and Weber, but effectively serves the same function. The third stack action is to make a reduction, which is equivalent to a precedence relation of ·> at the top of the stack.

One advantage of the conversion algorithm (and, hence, of the FPL recognizers produced) is that, with one exception, it is able to make use of more context than is employed in precedence relations in determining the bounds of a substring to be reduced.

In determining the existence of a <· relation both precedence schemes consider one symbol below it in the stack. McKeeman makes use of two symbols above it and Wirth and Weber one symbol.
The conversion algorithm uses a minimum of one symbol below it in the stack plus further information, which may be implicit in the grouping, and a lookahead (corresponding to symbols above it in the stack) of n symbols, where n is usually greater than two, to decide when it is necessary to push a bar symbol into the stack.

In determining the existence of a ·> relation at the top of the stack, both precedence schemes use a one symbol lookahead while the conversion algorithm uses n (n > 1) symbols. Wirth and Weber's system looks one symbol into the stack to determine this relation, while McKeeman's system looks at two symbols. One symbol is the minimum used explicitly by the conversion algorithm and the information available in the second symbol is, except in an Nta group, implicit in the grouping.

Another advantage of the generated FPL recognizers over the precedence schemes is that the precedence schemes do not allow two BNF productions with identical RHS's. This is particularly important if several people are involved in the design of a language, or if the BNF grammar is to be used descriptively, as in the ALGOL report.

8.2 Right Bounded Context

Let α = x_1 ... x_m ... x_t, x_i ∈ V_n ∪ V_t. Define (n, p) to be a handle of α if (for k > 0) N -> x ... x is the p-th production and x ... x is the leftmost substring of α which appears as a RHS of a production and a reduction to S could begin with a reduction of this substring to N.

Floyd [12] defines a grammar as right bounded context (m, n), denoted LR (m, n), if any handle is always uniquely determined by the m stack symbols to its left and the n input symbols to its right. The conversion algorithm clearly accepts all grammars which are LR (0, n), where n is the maximum lookahead length. It does, however, accept grammars beyond this.
The following grammar

    π
    1    S -> a c N t
    2    S -> b c M
    3    N -> d
    4    M -> d t

is LR (2, 0), since in the strings acdt and bcdt, a decision as to whether the substring d is a handle (and should be reduced to N) can be made only by looking two symbols to its left for an a or b. This problem does not occur in the FPL recognizer since the Sh group

    Sh:  a | | * c (1, 2)
         b | | * c (2, 2)

effectively separates the two cases. This example can be extended to LR (m, 0) by replacing the c in the first two BNF productions with identical terminal symbol strings m - 1 symbols long.

The grammar

    π
    1    S -> a N c
    2    S -> b M
    3    N -> K
    4    M -> K c
    5    K -> d

which is LR (1, 0), cannot be converted to a (deterministic) FPL recognizer by the algorithm because of the preclusion in the Kta group

    Kta: K | -> N | DNt
         K |      | * c (4, 2)

which can only be resolved by a one symbol lookback.

In general, the subclass of Chomsky type two grammars accepted by the BNF to FPL conversion algorithm does not fit any known classification scheme.

8.3 Problems

The present implementation of the conversion algorithm described in chapters 2 through 5 has two drawbacks. The first is that its execution consumes relatively large amounts of computer time. The second, and probably more serious, is that when a large grammar is not accepted by the algorithm, it is often difficult to see what changes should be made in the grammar which will make it acceptable without altering the language being defined.

These two problems tend to aggravate each other in the sense that the time factor makes it impractical to make several runs with tentatively changed grammars in an attempt to find an acceptable one. For this reason it is important that the language designer be relatively certain that his grammatical changes are sufficient on the first attempt at correction. This, unfortunately, is precisely what is difficult to do.
8.4 Potential Improvements

One improvement which is being implemented is a rewriting of the program with more efficient coding in an attempt to significantly decrease the execution time. There is reason to believe that [...]

    [...] α B β
    B -> γ

becomes [...]

This would speed up the parse by cutting down the number of reductions, and associated stack manipulations, necessary at parse time.

A third improvement is to keep the bar symbols in a separate stack and thus eliminate the stack manipulation necessary for bar symbol removal. This is possible since, as shown in chapter 7, the bar symbols are not actually used for lookback.

Another potential improvement is the definition of a richer metasyntactic language. As presently envisioned this would be something similar to the regular expression of a Chomsky type three grammar as the RHS of productions. This would then be mechanically converted to BNF productions by a prepass. It is believed that this would make writing the syntax easier for the language designer.

The final improvement presently being considered is a relatively complex combination of ideas to improve the automatic error recovery of compilers built by a TWS based on this parsing algorithm.

8.5 Research Extensions

Three interesting areas of research appear worthy of pursuit. The first is a new method of subclassification of Chomsky type two grammars which would clarify the properties which make a grammar unacceptable to this conversion algorithm.

The second is the investigation of the similarity of the FPL groups to the states of a finite state machine. This area is presently being investigated by F. L. DeRemer at MIT.

The final area is the application of the principles of this algorithm to Chomsky type one and type zero grammars.
APPENDIX A

THE PSEUDO ORDERS

    NOOP            dummy placeholder for array row ends

updating pointers

    LLVL            initialize lookahead test pointer
    ILVL            increment stack pointer

stack symbol tests

    XSBT (T, A)     test top stack symbol type-field for value T, false => branch to A
    XSBE (E, A)     test top stack symbol entry-field for value E, false => branch to A
    XSIT (T)        test top stack symbol type-field for value T, false => print error message and insert correct symbol
    XSIE (E)        test top stack symbol entry-field for value E, false => print error message and insert correct symbol

lookahead symbol tests

    XLAT (T, A)     test lookahead symbol type-field for value T, true => branch to A
    XLAE (E, A)     test lookahead symbol entry-field for value E, true => branch to A
    XLBT (T, A)     test lookahead symbol type-field for value T, false => branch to A
    XLBE (E, A)     test lookahead symbol entry-field for value E, false => branch to A
    XLLT (T)        test lookahead symbol type-field for value T, false => branch to next production
    XLLE (E)        test lookahead symbol entry-field for value E, false => branch to next production

stack manipulation

    TPSH            push next terminal symbol to stack
    NPOP (N)        pop N symbols from stack
    RED1 (S)        reduce top stack symbol to symbol S
    REDN (N, S)     pop N symbols from stack, then reduce top stack symbol to symbol S

bar stack manipulation

    BPSH (B)        push bar symbol B into bar stack
    BPOP            pop a bar symbol from bar stack

program control

    SETS (A)        store A as address of next production
    GOTO (A)        transfer to address A
    XBGO (E, A)     test bar symbol entry-field for value E, true => branch to address contained in bar symbol, false => branch to A
    SKIP (N)        skip next N characters of instruction

semantic calls

    EXSM (E)        execute EXEC (E)
    XTSM (E)        execute EXEC (E), test SEMANTICTEST, false => branch to next production

error

    XSLR (E)        test for lookahead symbol valid after nonterminal E, false => execute ERRR
    ERRT (E, A)     insert terminal symbol E and go to A
    ERRN (S, A)     change stack to S (reduce or push bar) by going back to address A
    ERRR            skip to first symbol that can follow bar symbol and put appropriate nonterminal in stack and branch to address in bar symbol

APPENDIX B

OUTPUT OF THE BNF TO FPL CONVERSION PROGRAM ON THE SAMPLE LANGUAGE DEMALGOL

[Program output listing.]
7 ar »- •r < ► « r> < •- IV •* « ir 77 - - o. 78 X o < o c a. c * »_ o O © (J — c. T O C O C O « o o to JO u •- o a C CL O T *r wn Z « « J _J O —I Uj _J cn* *- « _i « _j yujmyki->-ov>-ozo«ooo>- WOffOOOO — k- »- O •- »- O ~ O _l O QD O >- O .Z — _i •- fsj *-«•-« _J ^OiTDO UJ »~ i^W •«•- «^-^0 — *- C ~- O »* Z — B « 0004 »-0*-0u,©«Z0Z00©* »~ o»— ok- — koowoc tro»-ujor>ooo _j — «/> *■ fcT W »/■. *« *r. w- - h. ■ V V V V a t- y ■ —a: _i « I I h. « CO •- z >■ »• z a v> « v> _i »- o uj n»uoo«»NO- ►- o * ->Zra»-IMV>IM»-CMK'IM»- U. V> l\J _J (\J OOZOOOV>OGO«OOOOCQV>0 « o oo»-ou>o«0€SO_iOe>o»-cii-«o >o (EC »- < < a: o. a. « Z u, U. I_J ■ •- «- UJ « • a • a z W H- LJ X G X UJ U. 9 O JO>Z*>(\>UM/>(\il- ■^-nziTKC UKK* ■> ou.ooooooa:o«oa:iijCEOxo»-o_ioi-o o © ~-© OOHOOkOL C ►- «n v? M VI v. v> vi V; VI v, tr vi V V V hi O _J _> -i >- o o « o o O Z « or z OD OD CD Ul at ct o X X X X « « »- >- >- *■ z z z »- V) VI vi a «9 (9 •-I o o o — •^ «-■ o K ►- »- v> VI V) z o o O v> V> V) o is a C9 •* « « o ooo o o o ooo oo 80 • ft. tt. a. mm mm m X OOO *- a X LJ «- m m « *- ■ x o x o — —a • ►- c\* ►- »n * * «zmKKOuiM/)oj - ^ *-. " - o a •■ b« « *»«_/<->*<_>•>* « «uj**a *«-« — «»_ 81 *~ in* ■— * *> c\j « — o ~ c— — H- O L_ O ZO O «»o«o UJO -. o o o *- tv^ « n 82 O — 9 — uzo •-co (Ay O XBC X Ul X I I o •- «/> *0 4/i ►- m x o. a. o *« C i3 XuiOD Z o — c\i ■— in «- ■ o u. u. o a. o a « x Z « C Ul O »- •« _l k O X — o «-> o o at <* o O 3 K O a ui u z O - u ui I »- _. ^ _ « « «««_i*A *r x _j x x > x x x x •- -J I I □ > ^ tr >~ -10.1 o — CD >- O U. i ^ O «- ik c o ^ X OJ X x u. a x u e l. » o x o UJ c. u. tr a »r »- ♦r * *■ A * *• t t t > t t t t t tit Z> Z! => Z> Z> ■=> Z> c o_ a. a a o_ a O 2 2 2 2 Z * a. — * tf" W ■ MHM t~> I— — 8U O « « CM — OK.-* -J I I o »- _ > _«l*l *1 *l _J ^ >*/)•« H- JO.0L o ■y X _j X X XXX X *- <~ OC. K- o _ .. fcT — . iO z z c C u> _i _i a _> i a o > */) c o _» x a. 
cd •-> UJ 2 X Ul X z o — «« c «s mxuai x ui ac x •-« _■ UJ t/1 — J — i IT X _ m : J* X 5* ►- _J X X C > *r *r *- J1Q.G •— a ►- o i/; —> x *- _i i a c =» l/ - C O J»1I •- u z >« — a, — z «- UJ « cr u. v. *- u ■a X I 1 - It. ►- 5* UJ a ►- cr u cr «a o «» o « u. a UJ a uj iS> 1/) •> M •• M «_ K K »- K U- 3 3 3 3 3 3 c cl a a. a a o z z z z z a ____,- Q. - * w. *r> tr v. %r. Or •- a. - * Ifl 86 o o c o o a «■ 3 I I C *n u- _j _ — i— i— »_ j _ txa.a.o x */> _* x x x x x xu_- z O Ui C C m. *Si *f> a c tr •— ~ •»*-- — z e v O u. ■- a >-«■.—■.. — — 3 - z> = D r< j c a a a a a a 3 T Z T 7 Z J a •- •- •- •- *■ ►> a. - » i/" *■ »^ *" */ V a. — u. O _l i/ - . 3 c l. u h I Hi •— *"1 o O CM 3 o o o «/) ou. 10 • at >- A «- •- => « O V 3 i- 87 o o o o o c — m «. w w w U. I _> £ c »- o -j X I I o _l «/• => c tr c o a */■ tr «/; t- -J 0. jyx l~ X mx a. x o X •- "-ax a: >« x u- a. ►- o u. \r, _i h- i a *- ^ ■* ■ iSl i- -J -J - X */> _J X J I -J -J z c 3 C la ►- »- I *- C ►- h- c a: ►- a: a X - ui a. -3 u i^ «/s UJ >/l 3 c u 3 _ u ►- *- X ►- c X c t= ►- X o z o a. D c oo u. H UJC a *- 1 < < u. «x 1/) a. <~> « <-> ~ o X * = => => 3 => a. o.a Q. a. 2* Z Z Z Z S8 c »- — ' sf C l£ - - I L -1 - ^ ac _i — >/> _, u: x „ i _l a. _. I I o _j o. a o — a ►- o x 2 a If c c X UJ tt U2 M z • *- Z *M a cr « < a a a u. 2 ^ C un ar •-» 3 «* *- a m ■* a • a u. C X 2 _i a *- <_- ►- CD UJ UJ « CT UJ ~ in. u <- — — o o o ■s c u. 3 ? » (- «I u. •- — I UJ IT X U. _J Uj o - \ri a i t i cr ««—«•• - iii a z u z ou C ~ O I I- «« I »- •— Ul *- JT « X COO ►- • *» * o jt « >- cr < c? o •- ►- 3 u u »- a. I •— >- ►- » «> »- c o o o u. © t- t- i a. — u. 3 U U •- a. i CD t- t ar. C - — •- I o ►- */i %/) ^ Z3 7 C O UJ 90 C — O (Nj n. «/) www >- i a. c •- «/> c ir «/) x a. x X U. Z M ^2 JU1 1 *- — >_■ %r. _l X > »s oO —> i/i — z _) c cr a u. «• i a >- ►- « •- - OX < z u> -~ o <-> «-> ~ Z> O ►- C */> « u •/> a a c »- */■■ IS. t/> — t z w c u 91 ~ — c r\ ^~ o o — n o — — =r k. »- a a o - *s c c wo x a. x X U Z X X z o u.' 
a x l- X -* o •-if CO it. xua x u. a; x —I 2 > a —i ar z a ia c X 0. z c c u- a u. 2 a. a x v ■- u. _l »/) «! C U ►- ^- a: c ►- « a « a a u I- < 4/) x t/ _ u t- — «j x •- jt >- IUO a o >- c a •- u. _l */> X U u »- a. x a. o u. X C _ b. I- • — I i— u~ < *- t- 3 K cy 4 tt ►- I- Q •- a .— — *_> O -* ~ o o o (AOy o »- x> o z o lAUJlw ■ I. z c - i a ■• • a 1*. c u L- a ^ J ►- BB »— »- u. « <^ u Or »- n ■* a a *J a a t/i fa 3 7 y I r- Ul i I »- I *" o o c. UJ r »- ■ ^ *r. — • a C u. it. 2 ►- I «J ►- m »- « i/i u. »- a c »— « a a a. •1 UJ a W; 3 i. u fr- ►— v C o er h- v> V r e: X a. VI X UJ CD vi UJ X u- X Lu H- X u.. oe X X v> X UJ cr X I C L. U t-l SI L) o t- T o li- oo u. *T> m ne (5 <9 3 A t « « M « a •* i « t z o V o ui *- a. ►- » CI ►- H- ►- »- ►- t- 3 3 3 3 3 3 3 o a. a a. o. a a. o 7 Z r z z z a a. - X v) an . — O — ^ r tv o <•» -D — iTI «a ^ ^ w v* u. -1 U. I 1 1- > Jl^ -, z c > C IS _l u. X •- o: x u. I I I O a «/". i/~ »/ »- x x a. o X u cc ►- ts 1 _. o >/- c <-- Xkll u a x «- c or c a »- c: a >- or OD « c a. « o 4 1 a X a Z >• U a V U Q. « U- >/l l« »/> i/: U 1/) C U 3 c u 3 _ 1. >- I >- >- I ►- O X c •- ^- o >- *- C t- (9 X CI X ■ - «t a o — a *v o uj o z «n ■ < t i «- ■ i_> I ■ " O O O m -Jlu wi irar — c jt ~ ~ 95 o o o o I -» o V c o X UJ CD ujcr x z — o •o o a xuo LIS X I/1CC— JUJUJ»-UJUJUjX ►-_!> _J « « « « _JI/> uiiA-i-i-i-i-i-i-io. tftX-JXXXXXX»- _1 1 I -> \r is. JLL t- I nt O «/J z « tlL 0. 3 Lu >- «« I OB < UJ »- >- cc O 2 < SOL. 0- — 3 L. OUIA 3 C ►- U 1- • * I X » «• o U k. O or — a z <_■ o z o z r> ODO ^ •- UJ z z o o t t t t t UJ t t ■ i t a. « -I ►- t- ►- ►- *- 3 z> 3 3 --> CL a. a. 0. 0. z z z T z 96 O m c «- i i~ 3 ^ a — «/- C 13 1/1 X UJ CO x u. a >* x z o XU(D ui or x i- iO X i — o •— t- IT */l C O kAUHXUD « wi x u- a x X — o i/> c «s M UJ CO uj or x «\j >- o u c = tr fr- ET *i -r « ■ U- pi 3 U •s C W f- fa c ► t- I 1 « I »- Ifl *- »- ■ */■ UJ « X UJ a p- u. or « o »- « or it « u. 
a rv ►- u i/i o «s> tf> u 3 c c u I »- • t- X ►- •— */i o *- Ct 9 a. — uj O _J V. T C U U ► > I •— *r M *- » • eg « ►- CD O Z < O C U ► • — a o • — a. o u. Z VI 3 C U »-t t*t o l/> c o xua u o: x X I I D 1/5 l/> »fl ►- x a. o- a UI IE ►- «5 X I I O x a. a. o UJ 00»- O UJ — G IE C (5 x a: x *- r*l .— 2 -IX 4 QUI I- ►- 0- o »— < a c\i »- Q. Cw O Ui o z x c c u o « a in a. cm Ui o _i •/> x c u u >- I CO X — UI «* >- « O Ui >- EK O Ot « (9 vn X •- U *- «/> x •■» Ktl- o Z - CUQ z or o a x - a. o «/> x O M) •/) m « o •si a. z> « o v X o o •- a: t- « a «_• * o •- * o — 0. « r> — 98 O *- o I z o «/> c o u. or x »- «/> I X — o OC •- »/) I/! C <3 IAUI K K Ul D xin x uj a x 1- 1 — o I/O X UJ CD x u or x U, 3 — O CD wo o o i/) xuo X Uj Or X C u. a or ex o 2 u IT U. •- U ^ 1 1/ K c U- 2 »- I CV K CD O < o H U; ►- 1 V or O */0 u_ -i or m. X a I 1* O m X ►- u H- iA I •- X i^ >- X c *r Or a or C ul — K t- 2 « O u- <-> e> wo ID o m o a « a « It Uj O O uo X C I- u »- I <-> J •- IJO « »- O « 3 o o c «/> I or a cr Uj 0, Ul a O 1/0 Ul (/) 3 •- u X = u X •- O X t- —J X ►- •— « t- I— < >- X U X » > «7 X o ir o o o o o o o UJ •- UlOI- Ul o t>- t- • ►- I < u « uo u. « «/) — *- <_> -.»«_> <- m. «J «» 73 «J =1 o o o o «« o u «o i or o — => « O V O X r> • O V _l 0. - 99 o o o o U. X — c tc a _l IT I I O «/) If t~ a. a. o m >- o i i o a. a. o Sl-O D r- O ^K -W «l/\0 UK «KKK»>0 z «-> ym jKuuuiyxi h-K >4 « « « JtO« VIUJJJ JJ JX1 X «/> — I X X X X Xb.CS r u a a a « L. a. 3 z u I- U. I •- I »- t'.' I— CL t/> U ID L U i- ►» a. —i i- t/> i- a. « a: •- o at- I Ui O O U. 1/5 I/) 1/5 M M 1-4 «-* ~ 100 o o o o O CO -* >w ^ C\J «» f- <» « — o a i — o IT. *T C O X X UJ (S u uj a x z- a c o Ui (D a x uj wi j H u uy u. t/>UJ _ I -J _ I —I —I -J X«/)_IXXXXX X I I o iT \T. *T t~ x a. a. o •-i (\< •- *- X o « o u. *- i ac C U9 »- < a « a a «* u- a */• 3 ? U *- Uj 2 *- I *- IlT K u a »- u 9- U < X — * .— a <-> O ^* T o UJ O C u in «/> cr O V o _l "" Ui %j ►- ►- •-»-»- z> D 3 O 3 3 o a a o. o. 
a o T r X, X X a a. — x *> *s t/J ^> *1 CE o ■* o Ul ►- I or O */> *- •« a m. a O- « UI a I/J 3 u I. »- «n I _l »- X N- u m o o o u. o *- I * w- Ui M «. O 101 z a -j - in •n O «S »- i/> in c <9 a> •« c «s cc m UJ ¥~ X UD Ul ►- X UJ 03 X X 1/5 X u, or x - o o o z. c ^ a. I cc »- 1- i/i • \n ? • 2 •o *™ »» IX 0- ►- UJ 3 « U cimz a. «») « w O Uj CO J C _ L. »- i a x " m c »- * «v "X »-» — o O l»> 3 o o o m o u. a « a. « or u O C o o a w o Ul a « o «* a z a Z a. « u. a. « u. UJ uj m X _' u X _ u ►- o X *- o X •- o h- »-i o ►- I M X PO c l*> m o o a o o o U) o »- UJ O *■ »- I ^ 1 « 1/) U) < in Ul -.«.<-> «. o «J => u =) 102 o o o o o O O rg o o ~ — t_ c c - X •- x — c d «■ c i; ^' - _ 1. x u. or v t/)or_iujujxxc ^JKUXXXXlIt-O «/■: c X UJ uj a z c a a. u. 3 ►- L. •- OX c *•> _l X a « a « l. 3 C - L, > « • <_> • 3 COO «• O i~ t-> a o X */: u i/i • t I •« T ■ ■• ►- 1 I J- M o a: ■« a K- — m > t^ ► *- i~ 3 Z> 3 • C Q a X o X 2 ■/> UJ 3 ~s • t I I I I I C ■* I ■ * * V V' ■» u- *-*->->-*->- 3 3 3 = 333 o a a a a a a O T r T r r * * \S V *S V~ *f \S »- Uf) J J J1LQ «IX JXXIKI9 X z o m c u utx ►- on x I — o lAUH K UJ O x i/> x uj a x •- x -> o a w> d e - ox ►- •- o»- Q O a o a l_ O O V) 3 C ►- L. >- I <-> I •-• 1/5 < ►- •- OX u a. i o X «*5 l/> Uj 3 «n • t t a « X t + i- t t z •- a a: — < ►- »- ►- t- 3 3 o a a. o z z a a. - X in u- — . -a: a. ►- — — o o o ft O L. a. t- a — a. - o o o X UJ CD 1/] X UJ CD M X UJ CD 1/3 X w OD lAXUS X U. cr X X uj BE X X Ik cr X X u. a X X UJ DC >■ .. — „ — .. ./I -. — i/> — . *Si — i/> 2 Z z z z C O c c c a _j • u. O. Li o •/> • u _l I o x I " u — e u o o o «/] O Uj •— a> o «j m D DOC to c u. — = coo l/V O UJ i/ i a « */* «J <"l z> o o c */i O UJ CI ■» z> t e t; v 3 * 105 O O -" a. c - _1 I •- Ul ►- 2 ■» 3 U I- KI <- Ul »- c « c « cc a a. Ul a u. M in s I U 3 • u i- ar x ^ a x •- o »- *- o ►- ~ •— -« cj U * 3 o o o ■A O Ul MIS .-. »• o u « => o o o «n o ui « i c »- » <_• • 3* O o It a. - ct •-.-•- a. 
LIST OF REFERENCES

[1] McKeeman, W. M., "An Approach to Computer Language Design", Technical Report CS-48, Computer Science Department, Stanford University, Stanford, California, August, 1966.

[2] Wirth, N., and Weber, H., "EULER - A Generalization of ALGOL and its Formal Definition", Part I, Part II, Comm. ACM, 9 (Jan., Feb., 1966), pp. 13-25, 89-99.

[3] Ingerman, P. Z., "A Syntax Oriented Translator", Academic Press, Inc., New York, 1966.

[4] Brooker, R. A., and Morris, D., "A General Translation Program for Phrase Structure Languages", J. ACM, 9 (Jan., 1962), pp. 1-10.

[5] Trout, R. G., "A Compiler-Compiler System", Proc. ACM 22nd National Conference, 1967, pp. 317-322.

[6] Floyd, R. W., "A Descriptive Language for Symbol Manipulation", J. ACM, 8 (Oct., 1961), pp. 579-584.

[7] Evans, A., "An ALGOL 60 Compiler", Annual Review in Automatic Programming, 4 (1964), pp. 87-124.

[8] Feldman, J. A., "A Formal Semantics for Computer Oriented Languages", Carnegie-Mellon University, Pittsburgh, Pennsylvania, 1964.

[9] Chomsky, N., "Formal Properties of Grammars", Handbook of Mathematical Psychology, Vol. 2, Luce, Bush and Galanter (Eds.), John Wiley and Sons, Inc., 1963, pp. 323-418.

[10] Earley, J. C., "Generating a Recognizer for a BNF Grammar", Computation Center Report, Carnegie-Mellon University, Pittsburgh, Pennsylvania, 1965.

[11] DeRemer, F. L., "On the Generation of Parsers for BNF Grammars: An Algorithm", ILLIAC IV Document No. 199, Department of Computer Science, University of Illinois, Urbana, Illinois, 1968.

[12] Floyd, R. W., "Bounded Context Syntactic Analysis", Comm. ACM, 7 (Feb., 1964), pp. 62-67.

DOCUMENT CONTROL DATA - R&D

Security classification: UNCLASSIFIED
1. Originating activity: Department of Computer Science, University of Illinois, Urbana, Illinois 61801
2a. Report security classification: UNCLASSIFIED
3. Report title: THE GENERATION OF A DETERMINISTIC PARSING ALGORITHM
4. Descriptive notes: Research Report
5. Author: Alan James Beals
6. Report date: January 6, 1969
7a. Total no. of pages: 113
7b. No. of refs: 12
8a. Contract or grant no.: 46-26-15-305
8b. Project no.: USAF 30(602)-4144
9a. Originator's report number: DCS Report No. 304
10. Distribution statement: Qualified requesters may obtain copies of this report from DCS.
11. Supplementary notes: NONE
12. Sponsoring military activity: Rome Air Development Center, Griffiss Air Force Base, Rome, New York 13440
14. Key words: algorithm, parsing, BNF, Floyd productions, grammar, translator writing system, semantics, syntax, compiler

DD Form 1473