■ 
 
LIBRARY OF THE 
 
 UNIVERSITY OF ILLINOIS 
 
 AT URBANA-CHAMPAIGN 
 
 no 
 
 510.84 
 .74G-75I 
 
 cop. 2 
 
The person charging this material is re- 
 sponsible for its return to the library from 
 which it was withdrawn on or before the 
 Latest Date stamped below. 
 
 Theft, mutilation, and underlining of books 
 are reasons for disciplinary action and may 
 result in dismissal from the University. 
 
 UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN 
 
 9Ut 1 3 iof| 
 
 1 2 RECTI 
 
 L161 — O-1096 
 
Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/interactivecompi748tind 
 
Tlt*> 
 
 ]U>- ^/^UIUCDCS-R -75-7^8 
 
 ;/ ! tcUi 
 
 IHl LIBRARY 01 ["HE 
 
 SEP 8 1975 
 
 UNIVERSITY OF ILLINOIS 
 AN INTERACTIVE COMPILE -TIME DIAGNOSTIC SYSTEM 
 
 BY 
 
 Michael H. Tindall 
 
Report No. UIUCDCS-R-75-7 1 +8 
 
 AN INTERACTIVE COMPILE-TIME 
 DIAGNOSTIC SYSTEM* 
 
 BY 
 MICHAEL H. TINDALL 
 
 October 1975 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 61801 
 
 • 
 
 This work was supported in part by an International Business 
 Machine Corporation educational fellowship and was submitted in 
 partial fulfillment of the requirements for the degree of Doctor 
 of Philosophy in Computer Science, October 1975. 
 
ABSTRACT 
 
 Let the following be a few of the characteristics of an 
 interactive compiler system: the user's program is interactively 
 entered via a keyboard and display-screen timesharing terminal 
 (e.g., the PLATO IV CAI system); the program is scanned and syntac- 
 tically parsed as it is input by the user; programming syntax errors 
 are signalled as soon as detected by the parser, and the user must 
 correct the errors before proceeding (this implies that no right 
 context of the user program is available for examination by the 
 error analysis system). 
 
 Under these assumptions, the following interactive diag- 
 nostic system can be devised. The system is "automatic," that is, 
 it is driven by the compiler's parser tables. The error system be- 
 haves like a consultant by suggesting "possible corrections" of the 
 program to the user, and at any time the user can proceed to fix 
 the program or request further suggestions. In addition, the 
 "possible correction" diagnostic suggestions can refer to not only 
 individual "tokens" in the user's program, but also higher-level 
 constructs, such as "expressions," "array bounds," etc. 
 
 Key Phrases: Compilers, error analysis, interactive com- 
 piling, parsers, transition diagrams. 
 
m 
 
 ACKNOWLEDGEMENTS 
 
 The author wishes to express his gratitude to his thesis 
 advisor, Professor Thomas Wilcox, for his many suggestions and 
 critical guidance throughout the work on this thesis. 
 
 Thanks are also in order for Professor Alan Davis and 
 J. Mike Milner for their assistance with the implementation of the 
 prototype diagnostic system on PLATO IV. A sincere thank you is 
 also given to the other members of the thesis committee: Professors 
 J. Nievergelt, H. G. Friedman, M. D. Mickunas, and E. M. Reingold 
 for their support. Professor Mickunas in particular, as well as 
 Professor W. Hansen are thanked for their helpful comments about 
 this thesis. 
 
 Finally, my wife, Elaine, must be thanked; it was only 
 through her patience and support that this work was completed. 
 
IV 
 
 TABLE OF CONTENTS 
 
 Chapter Page 
 
 1 INTRODUCTION 1 
 
 1.1 The Problem 1 
 
 1.2 The Compiling Environment 2 
 
 1.3 Compile- time Error System 
 
 Constraints and Goals 5 
 
 1.4 Organization of the Thesis 8 
 
 2 SURVEY OF ERROR CORRECTION AND 
 
 RECOVERY TECHNIQUES 10 
 
 3 THE TRANSITION DIAGRAM PARSING MODEL 20 
 
 3.1 Introduction 20 
 
 3.2 General Characteristics of the 
 
 Transition Diagram Model 20 
 
 3.3 Formal Description of the 
 
 Transition Diagram Model 29 
 
 3.4 Description of the Implemented Parser. . . 30 
 
 3.5 Compiler Environment when Parser 
 
 Signals Error 32 
 
 4 A MODEL FOR DIAGNOSTIC INTERACTION WITH 
 
 THE PROGRAMMER 34 
 
 4.1 Goals for Diagnostic Interaction .... 34 
 
 4.2 Suggesting "Possible Corrections" for 
 
 the Programmer 35 
 
 4.3 Algorithmic Notation 42 
 
 4.4 Algorithm for Diagnostic Interaction ... 42 
 
 5 ' DETERMINING "POSSIBLE CORRECTION" SUGGESTIONS. . 44 
 
 5.1 Introduction 44 
 
 5.2 A First Model for Determining 
 
 "Possible Correction" Suggestions .... 44 
 
 5.3 Extended Suggestion Model 63 
 
 5.4 Overall Control of the Extended Model. . . 88 
 
 6 CONCLUSIONS AND FUTURE RESEARCH 93 
 
 6.1 Summary 93 
 
TABLE OF CONTENTS (Continued) 
 
 Chapter Page 
 
 6.2 Evaluation 95 
 
 6.3 Future Research 101 
 
 6.4 Conclusion 102 
 
 LIST OF REFERENCES 103 
 
 APPENDIX 
 
 I GENERAL OPTION TREE EXAMPLES 106 
 
 II EXAMPLES OF "POSSIBLE CORRECTION" SUGGESTIONS 
 
 FOR PL/I 125 
 
 VITA 166 
 
VI 
 
 LIST OF TABLES 
 Table Page 
 
 5.1 Token Options for States of Transition 
 
 Diagram El 59 
 
 5.2 Error Interpretations for Transition 
 
 Diagram El 60 
 
 5.3 Token Options for States of Transition 
 
 Diagrams E2 and E3 61 
 
 5.4 Error Interpretations for Transition 
 
 Diagrams E2 and E3 62 
 
 5.5 General Options for States of Transition 
 
 Diagrams E2 and E3 81 
 
 5.6 Extended Error Interpretations for 
 
 Transition Diagrams E2 and E3 82 
 
 5.7 General Options for Diagrams E6 - E9 86 
 
 5.8 Some Extended Error Interpretations for 
 
 Diagrams E6 - E9 87 
 
vn 
 
 LIST OF FIGURES 
 
 Figure 
 
 Page 
 
 3.1 - Simple transition diagram example 23 
 
 3.2 - PL/I "GOTO" statement transition diagram ... 23 
 
 3.3 - Invoking transition diagram example 25 
 
 3.4 - PL/I "IF" statement transition diagram .... 26 
 
 3.5 - Multiple return transition diagram example. . . 27 
 
 3.6 - PL/I conditional expression example ..... 28 
 
 3.7 - PL/I <conditional-expression> 
 
 transition diagram 29 
 
 4.1 (a)-Example of suggestion messages 36 
 
 (b)-Example 37 
 
 (c)- Example 37 
 
 (d)-Example 38 
 
 4.2(a)-"Detail" suggestion messages example 39 
 
 (b)-"Detail" example 40 
 
 (c)-"Detail" example 40 
 
 (d)-"Detail" example 41 
 
 (e)-"Detail" example 41 
 
 4.3 - Algorithm ' USE ^INTERACTION' 43 
 
 5.1 - The "transit" function . 52 
 
 5.2 - "Token" MODIFY routine 57 
 
 5.3 - REPARSE routine 58 
 
 5.4 - MODIFY routine, version 2 69 
 
 5.5 - Final USER INTERACTION routine 75 
 
VI 11 
 
 LIST OF FIGURES (Continued) 
 
 Figure Page 
 
 5.6 - Final REPARSE routine 76 
 
 5.7 - Final MODIFY routine 78 
 
 5.8 - Final MODJTREE routine 80 
 
 5.9 - Transition diagrams E6 - E9 84 
 
 5.10 - ERROR SYSTEM CONTROL routine 92 
 
1 
 
 Chapter 1 
 INTRODUCTION 
 1 .1 The Problem 
 
 Novice computer programmers, such as students taking an 
 elementary computer programming course, spend a substantial frac- 
 tion of the total time required to develop a computer program 
 removing compiler-detected syntactic errors. The compilers used by 
 these elementary programmers must provide informative and under- 
 standable diagnostic messages when compile-time errors are 
 discovered. 
 
 At the present time, there are a few compilers available 
 that operate in a batch-mode environment and provide reasonably 
 good diagnostics (for example, PL/C (Conway and Wilcox [Con., '73]) 
 and WATFIV (Cress, et al. [Cre., '70]). However, for a compiler 
 that operates in an interactive timesharing environment, that is, 
 an incremental compiler, the diagnostic assistance that can be pro- 
 vided to an elementary programmer can be much more informative than 
 is possible in a batch-oriented compiler. It is possible for an 
 interactive diagnostic system to provide various "levels" of error 
 messages about a particular error situation; some messages may sug- 
 gest fairly general ways of fixing the program, while others 
 provide very explicit detail about possible corrections of the 
 
error. Using the fact that the programmer is preparing the pro- 
 gram via an interactive computer terminal, the diagnostic system 
 can present error messages to the programmer and, based on the pro- 
 grammer's response, tailor subsequent, more detailed or refined 
 error messages to the comprehension level of the particular 
 programmer. 
 
 This thesis describes a syntactic error diagnostic system 
 that is intended for use in a highly interactive, syntax table- 
 driven compiler system. The primary goal of this diagnostic sys- 
 tem is to provide good diagnostic messages in a form and terminol- 
 ogy that an elementary programmer can understand. It is assumed 
 that the novice programmer is compiling the program on an interac- 
 tive timesharing computer and compiler; therefore, the programmer 
 can, to some extent, control the diagnostic messages that are pre- 
 pared and displayed. In addition, the error analysis algorithm 
 makes complete use of the syntax tables, and thus any error analy- 
 sis is performed "automatically" by the compiler system with very 
 little additional work required of the language designer beyond 
 specifying the syntax table itself. 
 
 1.2 The Compiling Environment 
 
 Recently, a project at the University of Illinois at 
 Urbana-Champaign has been under way to automate the teaching of the 
 basic computer science courses by using the PLATO IV computer-based 
 
education system [Nie., '74]. The curriculum that has been imple- 
 mented teaches elementary computer science programming language 
 concepts and constructs to classes of students with highly-varied 
 backgrounds. Specific detail on a variety of languages (e.g., 
 PL/I, FORTRAN IV, COBOL, BASIC) is available; students progress at 
 their own speeds through a fairly flexible course structure [Ela., 
 '75]. An important part of the system is an on-line interactive 
 compiler in which a student can easily and conveniently try out new 
 programming language constructs immediately after learning about 
 them in an instructional lesson [Wil., '73]. 
 
 A number of design criteria for a compiler system emerge 
 from examining this interactive environment. First, the compiler 
 should be as interactive as possible to utilize the PLATO IV system 
 effectively and to maintain a desirable computer-aided instruction- 
 al environment for the student. To accomplish this, the compiler 
 compiles character-by-character; that is, each single key pressed 
 by the student using the compiler is examined immediately as the 
 student types it in. Thus, the compiler keeps up completely with 
 the student's preparation of a program. 
 
 Another result of this immediate, incremental compiling 
 technique is that syntax errors are detected and signalled as soon 
 as possible by the system. The student must correct the error be- 
 fore being allowed to enter the remainder of the program. 
 
 The student is able to edit the program by moving a 
 
cursor through the program on the PLATO graphics screen. The com- 
 piler moves with the cursor, compiling when the cursor moves for- 
 ward in the program, and backing-up ("uncompiling," i.e., resetting 
 the lexical and syntactical analyzers to previous states) when the 
 cursor moves backward in the program. To facilitate the "uncompil- 
 ing" capability, a parser-action history is created and maintained 
 as the program is compiled; this history records any changes that 
 are made in the parser's environment for each token as it is parsed. 
 Then, to "uncompile" a token, it is necessary to undo only changes 
 recorded in the history for that token. This parser history is 
 also very useful for the error diagnostic system, since it contains 
 a complete description of what actions the parser system has taken. 
 
 A second design criterion for the compiler is that it be 
 multilingual. To accomplish this, the compiler is completely 
 table-driven; to allow another language to be recognized by the 
 compiler system and used by students, a language designer must 
 merely fill in a new set of tables and provide an execution super- 
 visor system for the actual interpretive execution of compiled 
 programs. 
 
 A third design criterion for the compiler is that it pro- 
 vide a high and sophisticated level of error diagnostics for the 
 student. Since the intended users of the compiler are beginning 
 student programmers, the error messages must be direct and to-the- 
 point. A system has been designed and implemented to handle errors 
 
that are detected during the interpretive execution of a student's 
 program [Dav., '75]. The present thesis is concerned with diagnos- 
 ing errors that are detected during the compiling phase of program 
 preparation. 
 
 1.3 Compile-time Error System Constraints and Goals 
 
 One very severe constraint for the diagnostic system is 
 that any messages and algorithmic analysis must reference only 
 those tokens in the program up to and including the token at which 
 an error was detected. In an interactive timesharing environment, 
 unlike any batch-oriented compiler environment, right context for 
 an error is not available for examination most of the time. This 
 is because the interactive compiler detects and flags syntactic 
 errors as soon as the programmer types in an offending token. The 
 only time when any right context for an error is available is the 
 situation where the programmer enters a part of the program into 
 the system and then moves the cursor backward in the program 
 ("uncompiling" the tokens that are backed over); if the programmer 
 then attempts to make a syntactically illegal modification some- 
 where in the middle of the program, all of the backed-over tokens 
 are available as right context for the error. However, since the 
 programmer was in the middle of making some change to the program 
 when the error was detected, it is unlikely that the right context 
 has anything to do with the newly-formed left context of the error; 
 
the input string will usually be in a transitory, incomplete form. 
 Therefore, the general, more restrictive case that must be consi- 
 dered is where no right context is available for analysis. 
 
 Another constraint is that the error system must operate 
 "automatically" and be as language independent as possible. To 
 achieve this independence, the error analysis system must be able 
 to determine any messages by examining only the input token string, 
 any compiler symbol tables, and the parser table. To keep the com- 
 piler system multilingual, no language-specific, ad hoc error anal- 
 ysis techniques can be used. 
 
 There are a number of goals for this compile-time error 
 diagnostic system. One is that all diagnostic messages be under- 
 standable to elementary student programmers. In particular, all 
 messages must be stated in the terminology that is associated with 
 the particular programming language being taught to and used by the 
 student. This means that if things like "statements," "arithmetic 
 expressions," and "array bounds" are important concepts in a pro- 
 gramming language, then diagnostic messages should refer to these 
 higher-level constructs when appropriate; however, the actual spe- 
 cific detail pertaining to the higher-level constructs should 
 always be available to the programmer if it is needed or requested. 
 
 Another goal for the diagnostic system is that it must be 
 highly interactive and responsive to the student. As previously 
 mentioned, a high level of interaction will enable the diagnostic 
 
system to be more effective for the student than would be possible 
 with a noninteractive system. In addition, all of the instructional 
 lessons that a student takes on the PLATO IV system are highly 
 interactive and require student input frequently to be completed 
 successfully. The overall compiler system that has been described 
 is very interactive and is constantly monitoring the progress of 
 the student as a program is being prepared. Therefore, the diag- 
 nostic system should also be interactive in order to conform with 
 the rest of the student's environment. 
 
 A further goal for the diagnostic system that it must be 
 "feasible" for implementation within the PLATO IV system. The 
 PLATO IV system (Alpert and Bitzer [Alp., '70]) was designed as a 
 large-scale computer-aided instruction system to support as many as 
 1000 simultaneous users. The student terminal contains a plasma 
 display panel as its main output component and a keyboard as its 
 input component. The panel permits quick display of both textual 
 information and arbitrarily complex graphic output. (This capabil- 
 ity can be used very effectively in conveying useful diagnostic 
 information about an error to a student.) However, because of the 
 large number of users allowed by the PLATO IV system, the amount of 
 central processor computing time allowed per user is very restric- 
 ted. Therefore the error analysis algorithms used to determine 
 diagnostic messages for the student must be fairly efficient and 
 not require too much central processing time. 
 
The overwhelming goal and emphasis for this diagnostic 
 system is that it behave like a "consultant," that is, be able to 
 diagnose programming errors and help the student understand the 
 errors. As much of this "consultant" as is possible should be 
 built into the compiler system and be provided automatically for 
 different programming languages. It is the activities and opera- 
 tions of this "consultant" with respect to syntactic and context- 
 sensitive semantic errors that is the topic of this thesis. 
 
 1.4 Organization of the Thesis 
 
 Chapter 1 has presented an overview of the problem of 
 providing good syntactic error diagnostics in an interactive com- 
 piling environment. The characteristics of a particular student- 
 oriented interactive compiler system were discussed and a few of 
 the goals and constraints imposed on a diagnostic system by such a 
 compiler system were mentioned. 
 
 Chapter 2 will present a survey of some of the previous 
 work that has been done on error correction and recovery techniques, 
 Chapter 3 gives a description of the transition diagram parser 
 model that is used by the compiler system and on which the proposed 
 diagnostic consultant system is based. 
 
 Chapter 4 presents a model of an interactive diagnostic 
 system that is appropriate for use in an interactive compiler. 
 Chapter 5 then describes the algorithms and operations of the 
 
proposed error analysis system, and some final concluding remarks 
 are given in Chapter 6. 
 
10 
 
 Chapter 2 
 SURVEY OF ERROR CORRECTION AND RECOVERY TECHNIQUES 
 
 This chapter will present a brief survey of a number of 
 papers in the area of syntactic error detection, correction and 
 recovery. Other good surveys appear in the papers by Rhodes [Rho., 
 '73, pp. 9-25] and LaFrance [LaF., '71, pp. 2-7], and LaFrance's 
 paper contains an unusually complete bibliography of the relevant 
 literature prior to 1971. 
 
 It is important to note that each of the following sys- 
 tems was intended for batch-oriented analysis, as opposed to an 
 interactive timesharing analysis. As a result, the goal of each 
 system is to properly diagnose the syntactic error, or at least 
 make a guess as to the cause of the error, and then attempt to 
 somehow "correct" the user's program by modifying the input string 
 or perhaps adjusting the parser's stack appropriately, and thus 
 allow the compiler to "recover" from the error and continue parsing 
 the remainder of the program. 
 
 As pointed out by Rhodes [Rho., '73, pp. 10-11], one of 
 the more common error analysis techniques that is used, particu- 
 larly in earlier systems, implements the so-called "panic mode" of 
 error recovery. With varying degrees of sophistication, the "panic 
 mode" recovery operates by popping the stack and advancing the in- 
 put until a "safe" environment is achieved. This method is simple 
 
11 
 
 to use, but the errors contained in that portion of the text which 
 is skipped are not detected. The emphasis of the "panic mode" 
 method is simply to provide a technique which allows the parser to 
 recover from the error and continue parsing more of the input 
 string. 
 
 One of the earliest papers on an automatic error system 
 came from E. T. Irons [Iro., '63]. Irons' parsing algorithm con- 
 structed all possible legal parses for the input string in parallel. 
 An error was signalled when none of the parses could be extended 
 one more symbol. In this event, Irons' error system built a list 
 of all the possible successor symbols from each parse directly be- 
 fore the error point. Then the input was discarded until a symbol 
 was found that appeared on the list. Finally, a string of symbols 
 was created that, when inserted at the error point, would allow the 
 parse to continue. This achieved the effect of local insertion, 
 deletion or replacement of symbols to repair the error. 
 
 A more sophisticated version of the "panic mode" method 
 was proposed by Leinius [Lei., '70]. This work was primarily based 
 on a simple precedence parser [Wir., '66]. The recovery system 
 attempts to find the smallest substring containing the error which 
 can be reduced, and then forces the appropriate stack reduction and 
 input scanning to occur. 
 
 Leinius also described the application of his technique 
 to other parsing systems, and a master's thesis by L. R. James 
 
12 
 
 [JaL., '72] discusses a detailed system using a bottom-up LALR(k) 
 (lookahead LR(k)) parser [DeR. , '71]. As the parse progresses, 
 "configuration sets" are produced; these configuration sets consist 
 of the right-hand sides (RHS) of productions in the grammar (see 
 [Aho, '72] for a description of grammars, productions, terminal and 
 nonterminal symbols), with the following transformation: each RHS 
 contains a distinguished symbol not appearing in the language 
 (e.g., ".") which indicates how far in the RHS the parse has pro- 
 gressed as of this configuration set. As input terminals are read, 
 the configuration set is updated by moving the "." one symbol to 
 the right in the elements which apply. When the "." is moved so 
 that it is directly to the left of a nonterminal in a production, 
 then this configuration set is pushed onto a state stack and the 
 configuration set for that new nonterminal is used. When the "." 
 reaches the far right of a RHS in the configuration set, reductions 
 on the stack can take place and the elements in the configuration 
 set which used the reduced symbol are updated. 
 
 An error occurs when no RHS in the configuration set 
 applies to the input symbol. The error algorithm attempts to re- 
 cover as follows: the parser state stack is popped and the input 
 string is skipped until a configuration set is found which is as- 
 sociated with a nonterminal that can immediately preceed the next 
 terminal symbol in the input. Finally the configuration set is 
 updated to indicate that the nonterminal has now been parsed, and 
 
13 
 
 the normal parse continues. This algorithm will always terminate 
 since the entire program can be considered as one large string 
 which reduces to the start symbol of the grammar. 
 
 In the last few years, a number of error systems have 
 been proposed that emphasize the "correction" of the error that is 
 made, as well as the recovery of the parser. The systems generally 
 attempt to examine the input tokens that are located near to where 
 the error was detected and use this "context" as the basis of the 
 correction that is made to the program. An interesting technique 
 based on a simple precedence parser is proposed by Graham and 
 Rhodes ([Gra., '73] and [Rho., '73]). As input symbols are read, 
 they are either immediately stacked, or they cause the stack to be 
 reduced and then they are stacked, depending upon the precedence 
 relation between the input symbol and the symbol on top of the 
 stack. Two types of errors can be detected in this system: either 
 no precedence relation holds between the input symbol and the sym- 
 bol on top of the stack (called a "character pair" error), or else 
 a potential RHS is found, but no matching RHS exists in the grammar 
 (called a "reduction" error). 
 
 When an error is detected, the error system begins phase 
 one (the "condensation" phase) of two phases (the other is the 
 "correction" phase). The "condensation" phase marks the location 
 of the error in the stack, for later reference, as 1 ? . Then a 
 "backward" move is initiated by assuming that a > holds between the 
 
14 
 
 symbol to the left of 1 ? and the symbol to the right of 1 ? ; this 
 causes the stack to reduce if possible. Then the assumption is 
 made that a < relation exists between the two symbols, and a "for- 
 ward" move begins by passing control back to the parser. The 
 "forward" move will end when another error is detected. (This may 
 be a new error, or the reappearance of the old error in the form of 
 a reduction error.) This new stack location is designated as 2 ? , 
 and a "backward" move stack reduction is again performed. This 
 ends the "condensation" phase. 
 
 The "correction" phase then attempts to change the par- 
 sing stack to correct the error situation. To do this, the 
 "correction" phase will either replace or delete one of three sets 
 of symbols on the stack: the set from the first < relation to the 
 left of 1 ? up to 1 ? , the set from that same < relation up to 2 ? , or 
 the set from 1 ? up to 2 ? . If it replaces one of these sets, it 
 will replace it with a complete RHS from the grammar. The actual 
 choice of what action to take is determined by a probabilistic pat- 
 tern match, along with two "cost" vectors, which give the "cost" of 
 inserting or deleting a given symbol from the input string. The 
 total cost of deleting/replacing each symbol set is calculated and 
 the cheapest is selected. (Note that the deletion/replacement must 
 in turn produce a correct RHS on top of the stack.) The actual 
 choice algorithm can be experimented with to yield the best results, 
 as well as the entries in the "cost" vectors. 
 
15 
 
 A system implemented by Lyon [Lyo., '74] operates in the 
 following way. As the parser progresses, it constructs all possi- 
 ble parses: this includes all legal parses as well as all possible 
 error parses. For example, not only is a possibly legal parse 
 maintained, but also for each input token that is read, the parser 
 carries along a parse that assumes that the token is an extra in- 
 serted symbol, as well as many parses that assume that some other 
 symbol was deleted from in front of that token. In each parse, a 
 count is kept of the number of errors that have occurred for that 
 parse. When the input is exhausted, the parse which has the smal- 
 lest error count is selected as the final parse; this algorithm is 
 known as a least-errors recognizer. The results from Lyon's analy- 
 sis indicate that his algorithm uses excessive amounts of execution 
 time and storage space, thus relegating it, in its present form, to 
 a theoretical rather than practical category. 
 
 Lyon's implementation is actually based on a system pro- 
 posed by J. P. Levy [Lev., '71]. However, Levy's proposal is 
 somewhat different from the system that Lyon developed. Levy de- 
 scribes a technique for moving back in the input a certain distance 
 when an error is discovered, and then reparsing forward towards the 
 error point as described in Lyon's algorithm; some heuristic rou- 
 tines then select one of the error reparsings to correct the pro- 
 gram. It seems likely that this technique will eliminate the 
 combinatorial problems encountered by Lyon's system. 
 
16 
 
 An error system was proposed by J. E. LaFrance in 1970 
 ([LaF., '7 0] and [LaF., '71]). His system is based on a Floyd Pro- 
 duction System parser [Flo., '63]. Briefly, this parser contains 
 groups of productions. As the parse progresses, the parser moves 
 from production group to production group; each group contains 
 references to a set of productions that should be attempted at this 
 point. If none of the productions apply, the error condition is 
 raised. LaFrance' s error system then generates a list of all pos- 
 sible two or three symbol "expected" strings from the production 
 groups, beginning at the error point. Then each of these "possible" 
 strings is compared (matched) to the next four symbols actually in 
 the input string. The matching process consists of about twenty 
 different comparisons such as "assume one symbol is missing" or 
 "assume two symbols are interchanged." If a match is found, the 
 source input string is repaired appropriately and the Floyd parser 
 restarted. If no match is found, then the input string is discar- 
 ded until a symbol is found which can follow a symbol in the stack; 
 the stack is then reduced and the Floyd parser starts up at the 
 place where it should be when the stack has just been reduced to 
 that symbol. LaFrance 's system is very good in a number of ways. 
 It is a relatively efficient technique, since suitable encodings 
 of the "possible" and the actual input strings allow the matching 
 algorithm to be implemented efficiently. Also, the results of 
 LaFrance' s work indicate that corrections that are made to the 
 
17 
 
 program are frequently the same corrections that the programmer 
 would make in a similar situation, particularly for those error 
 system corrections that were determined by a successful match 
 having been found. However, if no match is found, LaFrance's sys- 
 tem reverts directly back to the "panic mode" form of error 
 recovery, with a corresponding degradation in the appropriateness 
 of the corrections performed. Another criticism of LaFrance's 
 system is that it examines only a few of the input symbols near 
 the location of the error symbol, and ignores all the other context 
 farther away from the error symbol. Examples can be given (for 
 example, Rhodes [Rho., '73, pp. 25-26]) of error situations that 
 occur in programming languages that cannot be properly corrected by 
 LaFrance's technique. 
 
 One final system that should be mentioned is that of 
 James and Partridge [JaE., '73]. They produced a system that uses 
 many of the analysis techniques suggested by LaFrance, but with a 
 strong adaptive slant: their system is designed to be able to mod- 
 ify itself somewhat to become more efficient or more effective. 
 Their parser utilizes a binary tree structure, where each node in 
 the tree contains some information plus a success and a fail link. 
 The parser moves through the tree as the input is read, comparing 
 the input symbol with the current node information and either tak- 
 ing the success or the fail link to another node. The system can 
 restructure its own binary parse trees so that the most used path 
 
18 
 
 occurs first in the tree. An error is detected if a fail is indi- 
 cated, but the current node has a null fail link. This invokes the 
 use of a separate "strategy" tree for error analysis. The strategy 
 tree contains information nodes that suggest different matching 
 strategies (very similar to those suggested by LaFrance). The 
 system can also restructure the strategy tree so that more effective 
 match strategies come first. 
 
 In summary, all of the techniques mentioned in this chap- 
 ter were designed to operate in a batch-mode environment, where 
 tokens both to the left and to the right of an error token are 
 available for examination and analysis. Because of this, none of 
 the systems can be directly applied in the interactive compiling 
 environment described in section 1.3 of this thesis in which no 
 right context is available for analysis. The "panic mode" systems 
 are particularly ill-suited, since they operate primarily by skip- 
 ping forward in the input until a token is found that can occur some- 
 place after the actual error token. LaFrance' s system (and also 
 James and Partridge's system, since it is based on the LaFrance tech- 
 nique) is likewise poorly suited for this restrictive interactive 
 environment since it analyzes the next few tokens following the 
 error token. The works of Rhodes and Levy, however, are somewhat 
 better suited to this environment, since both techniques begin by 
 examining a substantial portion of the left context before the cor- 
 responding right context is analyzed. The work of Levy, which had 
 
19 
 
 a large influence on the current thesis, is further described in 
 section 5.2.1 . 
 
 This concludes a look at some of the previous work in the 
 field of error correction and recovery systems. 
 
20 
 
 Chapter 3 
 THE TRANSITION DIAGRAM PARSING MODEL 
 
 3.1 Introduction 
 
 The syntactic analyzer used in this interactive compiler 
 is based on the transition diagram systems first introduced by 
 Conway [Con., '63], and later formalized by Lomet [Lorn., '73]. 
 The second section of this chapter will describe the most important 
 characteristics and operations of a transition diagram parsing 
 model; the third section will present a brief, more formal descrip- 
 tion of the parsing model defined by Lomet; the fourth section will 
 discuss the actual parser table implementation that is used in the 
 compiler system and which is referred to by the automatic error 
 diagnostic system; the fifth and final section will summarize the 
 parsing environment that exists when a syntactic error is detected. 
 For more detailed information, the reader is referred to the above 
 papers by Conway and Lomet, as well as a paper by Tindall [Tin., 
 •75]. 
 
 3.2 General Characteristics of the Transition Diagram Model 
 
 A transition diagram parser can be thought of as an expan- 
 ded (stack-oriented) version of a simple finite state machine (fsm), 
 A fsm maintains a "node" or "state" pointer, that is, at any time, 
 
21 
 
 the fsm is said to be in some "state" waiting for the next input 
 symbol to come in. When the next input symbol has been read, 
 based on the attributes of the new symbol, the fsm then makes a 
 "state transition" to a new state and waits for the next input sym- 
 bol to be read. An entire input string is accepted if the input is 
 exhausted and at the same time the fsm is in one of a set of special 
 "final" states. 
 
 Similarly, the action of a transition diagram parser is 
 to examine the "possible" or "acceptable" parsing options (deter- 
 mined by the current STATE and the transition diagrams) along with 
 the current input token. Based on this information, the parser 
 will accept the token by updating the STATE information and asking 
 for a new input token, or reject the current input token and signal 
 a syntactic error if the token does not satisfy any of the available 
 parsing options. 
 
 Note that for the transition diagram model being de- 
 scribed, all of the parsing options are defined in terms of 
 "tokens" only, that is, the tokens that are accumulated and output 
 from a lexical analyzer. This convention will remain the same for 
 all of the work that is reported in this thesis unless otherwise 
 stated. The parsing of input strings described in the rest of this 
 chapter, as well as all of the error analysis manipulations dis- 
 cussed later in the thesis are all based on a "tokenized" input 
 string. 
 
22 
 
 The actions associated with this transition diagram model 
 can be conveniently shown graphically as follows: Let 
 
 denote STATE "SI"; each branch out of a STATE corresponds to a pos- 
 sible syntax parse option for that STATE: these branches will 
 always be labeled with their particular syntactic option require- 
 ments. 
 
 Figure 3.1 is an example of a simple transition diagram 
 El. This is interpreted as: if the parser is in STATE SI and the 
 current input token is "a," make the state transition to STATE S2. 
 Otherwise, if the input token is "b," make the transition to STATE 
 S7. Otherwise, an error has been detected, and the parsing error 
 condition should be signalled. STATE S6 is a "final" state for 
 diagram El, as indicated by the null branch leading out of the S6 
 node: if the parser is in STATE S6 and the input is exhausted, 
 then the entire input string is acceptable to the parser. The ac- 
 ceptable input strings for this transition diagram are "abcde," 
 "ade," and "bfe." 
 
El 
 
 23 
 
 ^S)J 
 
 Figure 3.1 - Simple transition diagram example 
 
 A simple practical example is a form of a PL/I "GOTO" 
 statement, which is shown in transition diagram form in Figure 3.2. 
 Note that an input token can be referred to either literally or as 
 some other combination of attributes such as [label -name], where 
 [label -name] has an appropriate meaning in the parser environment. 
 
 "GOTO" 
 
 'GO 
 
 ■^y 
 
 TO" 
 
 [label-name] 
 
 <H>^-©- 
 
 Figure 3.2 - PL/I "GOTO" statement transition diagram 
 
24 
 
 So far in this discussion, all of the examples have been 
 completely parsable within the framework of a simple finite state 
 machine. However, the actual transition diagram model allows the 
 label on a branch out of a node to refer to not just a single input 
 token, but also a group of tokens that are specified by some other 
 transition diagram within the parser system. 
 
 Figure 3.3 is an example of this more complex form of a 
 transition diagram. In this case, if the parser is in STATE S2 in 
 transition diagram E2, then the parser will save STATE S2 on a 
 stack in the parser's environment and refer to diagram E3 to deter- 
 mine the attributes of the next acceptable input token. When the 
 "final" state for diagram E3 is reached, the parser will then have 
 completely accepted the input string specified by E3, and there- 
 fore, upon popping STATE S2 off of the parser's stack, will be able 
 to legitimately make the state transition to STATE S3 and continue 
 to parse as normal. 
 
25 
 
 E2 
 
 E3 
 
 Figure 3.3 - Invoking transition diagram example 
 
 Although not explicitly illustrated by Figure 3.3, the 
 "called" transition diagram (E3 in Figure 3.3) could refer to other 
 transition diagrams in the system, or could in fact refer to itself. 
 (However, E3 cannot be referred to from STATE S6, that is, "left 
 recursion" must be avoided.) 
 
 A practical example that requires the invoking of other 
 transition diagrams is the PL/I "IF" statement, shown in Figure 
 3.4. Note also that this transition diagram system can accept an 
 entire "IF" statement either with or without the "ELSE" clause. 
 
26 
 
 cT^ "IF" ( £2 ) conditional -expr> 
 
 S3) jaaaL: « 
 
 ^ <statement> /^TN "ELSE" /^N <statement> 
 
 €H 
 
 Figure 3.4 - PL/I "IF" statement transition diagram 
 
 One more example is needed to illustrate another impor- 
 tant feature of transition diagrams: a transition diagram system 
 which has been invoked by another transition diagram has the capa- 
 bility of returning to one of a number of possible STATES in the 
 invoking transition diagram. Figure 3.5 is an example of this 
 situation. If the parser is in STATE SI, then diagram E5 is in- 
 voked normally using the parser's stack, and either input "a" or 
 "c" is accepted, moving to STATE S7 or S8 respectively. Note that 
 the parser is now in a "final" state for diagram E5; however, the 
 small number given with the final state indicator specifies that 
 it is possible to return to two different STATES in the invoking 
 
27 
 
 diagram E4. These two return points are specified by the Rl and 
 R2 labels following the invoking of diagram E5. When the parser 
 returns from diagram E5, it will either return to STATE S2 or STATE 
 S4, depending on which return is specified. Thus, transition dia- 
 gram E4 accepts both "ab" and "cd" as valid input strings, but does 
 not accept either "ad" or "cb." 
 
 E4 
 
 E5 
 
 SI 
 
 <E5> 
 
 "c" 
 
 Rl 
 
 S7 
 
 S8 
 
 S2 
 
 S3 
 
 ' s4 V^ 1 S5 
 
 Figure 3.5 - Multiple return transition diagram example 
 
 A good example of where this multiple return feature is 
 necessary arises in trying to parse a (simplified) PL/I conditional 
 expression as shown in Figure 3.6. 
 
28 
 
 simple expression parentheses 
 
 ( ( ( I + 100 ) * J ) = K ) 
 
 conditional expression parentheses 
 
 Figure 3.6 - PL/I conditional expression example 
 
 When the initial left parentheses "(" are examined, it is not known 
 whether they are part of the overall conditional expression or are 
 part of the inside simple expression on the left side of the rela- 
 tional operator. A transition diagram for a <conditional- 
 expression> is shown in Figure 3.7. Note that the way in which 
 this <conditional-expr> has been drawn corresponds to assuming that 
 the initial "(" belongs to the overall conditional expression; if 
 it turns out they actually belonged to the first simple expression, 
 then the parse is resumed at point 6, which accepts the ")" and 
 continues parsing the simple expression. 
 
29 
 
 conditional -expr> 
 
 Rl 
 
 R2 
 
 €y^-€y 
 
 €> 
 
 ii \ ii 
 
 rrelatiQnal-oper]^c6) ^xprpssi^/^o/Y-, 
 
 Figure 3.7 - PL/I conditional -expression> transition diagram 
 
 3.3 Formal Description of the Transition Diagram Model 
 
 As a more formal description, a transition diagram system 
 consists of a set of nested push-down automata (NPDA) that have the 
 capability of invoking one another. Each NPDA is capable of read- 
 ing a portion of the input string and accepting or rejecting it. 
 Lomet [Lorn., '73] calls the NPDA that are capable of being invoked 
 "submachines"; the initial STATE of a submachine is known as its 
 "entry" STATE and a submachine is invoked by the use of its entry 
 STATE number by another NPDA. This results in the invoking STATE 
 being saved at the top of a parser stack and the parse resumed at 
 the new entry STATE. Each submachine also contains one or more 
 
30 
 
 "exit" STATEs; when an exit STATE is reached, the top of the stack 
 together with the particular exit STATE determine the STATE in the 
 original invoking NPDA with which to continue the parse. An error 
 in the parse is detected if a NPDA reads a token from the input 
 string for which there is no corresponding STATE transition that 
 the NPDA can make. The reader is referred to the discussion by 
 Lomet [Lorn., '73] for further technical details about transition 
 diagrams and the class of languages accepted by transition diagrams 
 
 3.4 Description of the Implemented Parser 
 
 A table-driven parser system has been designed that in- 
 corporates the actions described in the preceding sections of this 
 chapter. The syntax table is actually an encoding of the transi- 
 tion diagrams into a small, interpretable instruction set. The 
 syntax parser of the compiler consists of a routine that interprets 
 this instruction set in an appropriate way. (This routine can be 
 called the "table interpreter" or "table-driven" routine.) 
 
 The syntax parser has control over and maintains certain 
 tables and data structures within the compiler. All of these 
 tables and structures are located in a region of memory that is 
 referred to as the parser storage area. 
 
 One particular variable that is maintained is the parser 
 STATE variable. This STATE variable corresponds directly to the 
 nodes or states that have been described for a transition diagram. 
 
31 
 
 As each token is accepted by the parser, it is associated with the 
 particular STATE that the parser was in. If a token must be "un- 
 compiled," then the parser is reset to the associated STATE for 
 that token, that is, as if the token has not yet been parsed. The 
 value of this STATE variable always points to some instruction in 
 the syntax table. As the input string is parsed, this STATE vari- 
 able is updated (i.e., moved to point to a different instruction in 
 the table) to reflect the current state of the parse. 
 
 The parser maintains a regular parsing return-state stack 
 which is used when one transition diagram invokes another. Entries 
 on this stack all point to the location in the parser table where 
 the nested transition diagram was invoked. Any changes in the 
 parser stack are recorded in the parser-action history for the 
 token that is currently being parsed, allowing the stack to be pro- 
 perly reestablished when a token is "uncompiled." 
 
 The preceding discussion in this chapter describes a 
 parser model that is sufficiently powerful to recognize all deter- 
 ministic context-free languages in time proportional to the length 
 of the input string [Lorn., '73]. A few extensions to the model 
 have actually been included in the compiler's parser system to en- 
 able the handling of certain context-sensitive semantic require- 
 ments such as proper and consistent declaration of attributes for 
 an identifier, consistent references to declared array identifiers, 
 etc. These extensions include allowing auxiliary memory variables 
 
32 
 
 to be utilized by the parser (for operations such as counting the 
 number of subscripts in an array reference), allowing the labels on 
 any branch in a transition diagram to refer to any of these auxil- 
 iary variables or any symbol table field, and allowing special, 
 language-dependent semantic-subroutines to be invoked at appropri- 
 ate times during the operations of the parser system. Any changes 
 that are made to any of these auxiliary variables or symbol table 
 fields are recorded in the parser-action history for the token that 
 is being parsed. 
 
 A complete description of the design and operations of 
 the actual implemented parser can be found in [Tin., '75]. 
 
 3.5 Compiler Environment when Parser Signals Error 
 
 As stated previously, a syntax error is detected by the 
 parser if the attributes of the current input token do not match 
 any of the labeled branches for a STATE. When this happens, the 
 following environment is assumed by the error diagnostic system 
 that is described in the remainder of this thesis. 
 
 The intermediate text generated by the compiler is \/ery 
 simply a tokenized version of the source text. The advantages of 
 this are that only one copy of the program need be stored, "reverse 
 compilation" or "uncompiling" can be implemented easily without 
 complicated reversal of code generation, and the original program 
 can be displayed at any time simply by examining the intermediate 
 
33 
 
 text. As each token is output from the lexical analyzer, it is 
 entered into the rightmost end of the intermediate text. Thus, if 
 a syntax error is detected, the last token in the intermediate text 
 is always the illegal one that the parser could not accept. In 
 addition, no right context is available for examination; only the 
 intermediate text entries up to and including the error token are 
 available. 
 
 The parser-action history is a vector that is associated 
 with the intermediate text. Each entry in the history records some 
 change that occurred in the parser environment. A correlation is 
 maintained between entries in the history and tokens in the inter- 
 mediate text. 
 
 When a syntax error is detected, the current parser STATE 
 is available, as well as the history of all STATE changes that have 
 occurred for the previous tokens in the intermediate text. Simi- 
 larly, the current parser return-state stack is available, as well 
 as the history of all stack changes that have occurred. Finally, 
 the current symbol table and the auxiliary parser storage variables 
 that are used by the parser for the particular language are avail- 
 able; all changes to these data areas are recorded in the history. 
 
 This concludes the summary of the parser environment that 
 the error analysis routines can work with. 
 
34 
 
 Chapter 4 
 A MODEL FOR DIAGNOSTIC INTERACTION WITH THE PROGRAMMER 
 4.1 Goals for Diagnostic Interaction 
 
 This chapter describes a useful model of the interaction 
 that can occur between a programmer who has attempted to enter a 
 syntactically illegal token into the compiler and the error diag- 
 nostic system that is trying to describe the error to the 
 programmer. Because the programmer is assumed to be relatively 
 inexperienced at writing and debugging problems, the diagnostic 
 system must be very simple to use. All of the actions taken by the 
 diagnostic system must be transparent to the programmer. In par- 
 ticular, any messages that are displayed for the programmer must be 
 simple to understand and must be stated using the terminology that 
 the programmer is familiar with. Concepts such as "statements," 
 "subscripts," "expressions," etc., which are important in the pro- 
 gramming language and which are presumably recognized by the 
 programmer should be referred to when appropriate by the diagnostic 
 system. 
 
 In addition, the diagnostic system should make use of the 
 fact that the programmer is at a timesharing terminal and thus can 
 be used to control some aspects of the error analysis and recovery 
 operations. In particular, the programmer should have some control 
 over the level of detail that is given in diagnostic messages. If 
 
35 
 
 a "numeric expression" is referred to in a message, for example, 
 the programmer may or may not be interested in more detail about 
 the form of a "numeric expression"; while the more detailed infor- 
 mation should always be available, it should not always necessarily 
 be presented. 
 
 Likewise, as opposed to most batch-oriented compilers, 
 the interactive compiler's diagnostic system needs to make no 
 attempt to actually "correct" the error or to modify the program 
 and "recover" by itself. Since the programmer is directly avail- 
 able, the system should attempt to explain the error to him, re- 
 quiring however that he be the one to actually back up, edit, and 
 repair the program. 
 
 4.2 Suggesting "Possible Corrections" for the Programmer 
 
 To satisfy the goals for diagnostic interaction just 
 mentioned, we propose a diagnostic system that interacts by pre- 
 senting various suggestions of "possible corrections" of the 
 program to the programmer. After a suggestion has been presented, 
 the programmer then has control to either request that a different 
 or more detailed suggestion be made by the system or else to return 
 from the diagnostic system back to the editor system and proceed to 
 modify the input and attempt to correct the error. This technique 
 is y/ery simple for the programmer to use since the only interaction 
 that is required from him is that he examine the suggestions that 
 
36 
 
 are made and either request other suggestions or request the return 
 to the editor system. 
 
 The example shown in Figures 4.1(a) - (d) illustrates the 
 types of suggestions that can be made by the diagnostic system. It 
 is an example of a short PL/ 1 program segment in which the illegal 
 token "THEN" has been entered at the end of an assignment statement 
 by the programmer. Each successive display (a) - (d) was requested 
 by the programmer via the "HELP" key on the PLATO keyboard. In all 
 cases, upon detecting the "HELP" request, the diagnostic system 
 erased the current suggestion message that was being displayed and 
 presented the next message to the programmer; the program itself 
 was always in view on the screen, with only the diagnostic messages 
 being changed. 
 
 FILE = workspace PL/I WORKSPACE (1 4 -new) SFBCE » 260 
 
 y v v \r r v w " p l v w r v r r v r w — r~ — t 
 
 .TEST10: .PROC; 
 
 DECLfiRE. (I, J) .FIXED; 
 
 .LP1: GET. LIS T (I, J ) ; 
 
 I .-.J. . ffiHENJ 
 
 ****#**#**ERROR* ********* [back] or (twisi) to fix. 
 
 This reserved word is not permitted here. 
 Press jhelp) for more information. 
 
 Figure 4.1(a) - Example of suggestion messages 
 
37 
 
 FILE = ujo rk sf ace PL/ 1 WORKSPACE (1 4 - new) SPACE = 260 
 
 " '' " • r ' r w r r v r r w r r v ' r • r r r v v lC= y 
 
 TESTIS: .PROC; 
 DECLARE . (I,. J) .FIXED; 
 
 LPl: . . . .GET, LIS T (I, J) ; 
 I.-.J. . ffHETl 
 
 ********** Possible Correction ********** (back) or (ehhse) to fix. 
 Replace | J with an arithmetic operator. 
 Press (heTJp) for a different suggestion. 
 
 Figure 4.1(b) - Example (continued) 
 
 FILE = workspace PL/I WORKSPACE (1 4-new) SPACE =2 60 
 
 TEST10: .PROC; 
 
 DECLARE. (I, J) FIXED; 
 
 .LPl: GET LIS T (I, J) ; 
 
 I.-.J. . [fHENJ 
 
 ********** Possible Correction ********** (back) or (ebhse) to fix. 
 
 Replace P~] wi i h _"i"l 
 
 Press (help) for a different suggestion. 
 
 Figure 4.1(c) - Example (continued) 
 
38 
 
 FILE * workspace PL/ 1 WORKSPACE (1 4-neui) SPACE = 260 
 
 111 " r — r r r ' r . w v w r • w • w w w • r — r r r i> I 
 
 TESTIS: PROC; 
 DECLARE . (I, J) .FIXED; 
 
 LPl: GET. LIST (I, J) ; 
 
 g.-.J. . .THEN 
 
 ********** Possible Correction ********** [Back] or Ekhse) to fix. 
 Insert a reserved word "IF" in front of | | . 
 
 Figure 4.1(d) - Example (continued) 
 
 Note also that all of the available options for the 
 programmer are displayed with each message: in this example, the 
 "HELP" key is active for each message as well as the regular com- 
 piler editing "BACK" and "ERASE" keys which allow the programmer 
 to modify and correct the program. 
 
 A further example of the control that the programmer has 
 over the suggestions is given in Figures 4.2(a) - (e). In this 
 PL/I example, the programmer has attempted to declare identifier 
 "J" with conflicting attributes. (The "VARYING" attribute is 
 associated with character strings in PL/I, not decimal variables.) 
 The suggestion given in Figure 4.2(b) includes a different option 
 for the programmer: if another suggestion is needed, the 
 
39 
 
 programmer can either request to see more detail about "attributes" 
 that are consistent with "DEC" (the detailed suggestions are shown 
 in Figures 4.2(c) and (d)), or else can request that some other 
 different suggestions be made (Figure 4.2(e)). Note also that if 
 the programmer elects to see more detail, then following the sugges- 
 tion in Figure 4.2(d) (or by requesting a different suggestion in 
 Figure 4.2(c)), the next suggestion is the one shown in Figure 
 4.2(e); therefore the programmer will not miss seeing any sugges- 
 tions because of electing to see more detail on an earlier 
 suggestion. 
 
 Further examples are given in Appendix II. 
 
 FILE * workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 276 
 
 1 r r== r r r r- r 7 ? r v r w v r 9 r v t f 
 
 .TEST.: . .PROC; 
 
 DCL . I .PLORTj 
 
 DCL . .B DEC JVRRYINGl 
 
 ♦ ♦#**♦** #*ERROR#********* • ( back) or (erhse) to fix. 
 
 This attribute is not permitted here. 
 Press (help) for more information. 
 
 Figure 4.2(a) - "Detail" suggestion messages example 
 
40 
 
 FILE = workspace PL/I WORKSPACE (14 -new) SPACE = 276 
 
 TEST.: . .PROC; 
 
 DCL . I .FLOAT; 
 
 DCL . .B DEC . fWRYINGi 
 
 ********** Possible Correction ********** (back] or [erhse) to fix. 
 Replace | | with an attribute. 
 
 Press (help) for a d i f f erent suggest i on . 
 Press EEfrHELP) to see a legal attribute. 
 
 Figure 4.2(b) - "Detail" example (continued) 
 
 FILE = workspace PL/1 WORKSPACE (1 4 -new) SPACE = 276 
 
 — r r r y >.' r r ' r r ] r l w • w r r r v w w i 
 
 TEST.: . .PROC; 
 
 DCL . I .FLOAT; 
 
 DCL . .B DEC j/ARYINGl 
 
 ********** Possible Correction ********** @ or [erhse] to fix, 
 Rep lace _Q ] with_an attribute 2 F t°5 T _' , _ 
 Press (help) for a d i f f erent suggest i on . 
 Press SHDldlkD to see another legal attribute. , 
 
 Figure 4.2(c) - "Detail" example (continued) 
 
41 
 
 FILE ■ wo rks pace PL/I UIORKSPflCE (14-new) SPRCE * 276 
 
 TEST.: . .PROC; 
 
 DCL . I .FLORTj 
 
 DCL . .B DEC . jVRRYINq 
 
 ********** Possible Correction ********** (brcx) or [mat] to fix. 
 Replace P~] with an attribute "FIXED". 
 Press IHEifl for a different suggestion. 
 
 Figure 4.2(d) - "Detail" example (continued) 
 
 FILE « workspace PL/I UIQ >^<5PRCE(14-new) SPACE = 276 
 
 * ¥ * 4 r r r 
 
 TEST.: ..PROC; 
 
 DCL . I FLOAT; 
 
 DCL . B DEC . yflRYINGj 
 
 ********** Possible Correction ********** §55*1 or E^ to fix 
 RyUce j _ J wi th • , " . 
 
 Press £E£ for a different suggestion. 
 
 Figure 4.2(e) - ■Detail* example (continued) 
 
42 
 
 4.3 Algorithmic Notation 
 
 Short algorithms will be presented throughout this thesis. 
 The notation that will be used in stating these algorithms will be 
 a readable, Algol-like language that contains whatever language 
 constructs are necessary to most clearly illustrate the particular 
 algorithm being presented. The left arrow "«-" is the assignment 
 operator. Typical control constructs include an "if - then - else" 
 statement (which is terminated with a "fi" in all cases to avoid 
 ambiguity), a looping construct (generally "loop. . .while. . .re- 
 peat"), and a procedural "begin - end" construct. Also, in many 
 cases, certain operations in the algorithm will not be refined ex- 
 licitly when the algorithm is given; these undetailed operations 
 include both references to other procedures in the error analysis 
 system (denoted with double slashes "//" around a description of 
 the procedure) and also local operations that must be performed 
 (denoted with single slashes "/" around a description of the effect 
 of the local actions). 
 
 The notation used in these algorithms is not intended to 
 be rigorously defined; the goal is for the operation and effect 
 of the algorithm to be stated as clearly as possible. 
 
 4.4 Algorithm for Diagnostic Interaction 
 
 An algorithm that provides a suitable model for the 
 diagnostic interaction described in this chapter is given as 
 
43 
 
 Algorithm 'USER_INTERACTION' in Figure 4.3. This algorithm is 
 purposely simple and nondetailed; Chapter 5 will more fully de- 
 scribe a detailed implementation of this algorithm. 
 
 begin USER_INTERACTION 
 
 /initialize error system environment/ 
 //display: signal error to user// 
 loop : 
 
 //prepare new suggestion message// 
 comment "Read- edit- key" waits until an edit key or 
 the "HELP" key is pressed; the key-type is 
 returned in "Read- key." 
 /Read-edit-key/ 
 while Read-key = "HELP": 
 
 //erase: current message on screen// 
 //display: new message on screen// 
 repeat : 
 
 /reestablish program editing environment/ 
 /return to program editing mode/ 
 end USERJNTERACTION 
 
 Figure 4.3 - Algorithm 'USERJNTERACTION' 
 
44 
 
 Chapter 5 
 DETERMINING "POSSIBLE CORRECTION" SUGGESTIONS 
 
 5.1 Introduction 
 
 The largest problem to be solved in algorithm 
 "USER_INTERACTION" (shown in Figure 4.3) is that of determining 
 what "possible correction" suggestions can be made to the program- 
 mer as diagnostic help is requested. This chapter first describes 
 a preliminary model of an algorithm for determining suggestions 
 that is similar to the algorithm suggested by Levy [Lev., '71]. 
 Then a more sophisticated algorithmic model is given and shown to 
 produce more comprehensive suggestions than the preliminary model. 
 
 5.2 A First Model for Determining "Possible Correction" Suggestions 
 
 5.2.1 J. P. Levy 
 
 A thesis by J. P. Levy [Lev., '71] defined and studied 
 a number of problems associated with automatic error correction 
 techniques. If the "error token" is defined to be the input token 
 at which a syntactic error is detected, then the "actual syntactic 
 error" that has occurred is not necessarily located at the error 
 token, but may in fact be located someplace to the left of the 
 error token. Levy proved (Theorem, section I I I. 3) that the actual 
 error may not even be located "close" to the error token, but could 
 have occurred an unbounded distance to the left of that token. 
 
45 
 
 However, for any particular error situation, Levy showed 
 it is possible to find a location in the input string such that the 
 actual syntactic error could not have occurred before that location: 
 this location is defined as the "left context" of the error. 
 (Locating this place in the input string, which is called a 
 "beacon," is a difficult problem and Levy devotes much of his the- 
 sis to its solution.) The fact that such a "left context" location 
 exists is trivially true, since the extreme limit for its location 
 is the beginning of the input string; however, Levy shows that it is 
 possible for the left context location to be located much closer to 
 the error token than the beginning of the input string in many 
 situations. 
 
 Then Levy defines "error interpretations" of the input 
 string to be syntactically correct strings that are similar to the 
 original input string, but contain a few changes or "corrections"; 
 if the number of changes made to the original string to arrive at 
 an interpretation is K, then the original string is said to have K 
 errors. He shows that all interpretations with K changes will be- 
 come "equivalent" at some location to the right of the error point 
 if it is possible to interpret the string correctly with only K 
 changes; this equivalence means that the parser has settled into a 
 common state for all of the interpretations at that point, and thus 
 the parse for the rest of the string is similar for all interpre- 
 tations. 
 
46 
 
 Levy defines three operations that are used in building 
 the error interpretations: "R," "A," and "S." The "R" operation 
 specifies that, at each point in the input string, alternate 
 parses (interpretations) should be created by replacing the actual 
 input token with each of the tokens that would be syntactically 
 acceptable at that point. The "A" operation specifies that, at 
 each point, alternate parses should be created by adding to the in- 
 put string each of the tokens that would be syntactically accept- 
 able. Finally, the "S" operation specifies that, at each point, an 
 alternate parse should be created by suppressing from the input 
 string the actual input token that was there. 
 
 These three operations can be conveniently expressed as 
 follows: if the input string at some point is expressed as 
 
 xa 
 where x is the first token not read by the parser and a is a string 
 containing the remainder of the input tokens, then if t is an 
 acceptable token at this point, the "R" operation changes the input 
 to 
 
 R t (xa) = ta, 
 the "A" operation changes the input to 
 
 A t (x a ) = tx a , 
 and the "S" operation changes the input to 
 
 S t (xa ) = a . 
 
47 
 
 Levy proves that all syntax errors can be corrected by some combi- 
 nation of these three operations. 
 
 Finally, Levy describes an error correction algorithm 
 that operates as follows: following the detection of an error, 
 back up in the input string until the left context location is 
 reached; then reparse the input string from this point forward cre- 
 ating all possible modified interpretations of the input string 
 with N or fewer changes (for some fixed N) until all of the inter- 
 pretations are equivalent, that is, until the right context for the 
 error is found; then select one of the interpretations with the 
 smallest number of changes as the "correction" for the program. 
 
 5.2.2 Modifications to Levy's Analysis 
 
 The algorithm described by Levy can be modified to make 
 a good preliminary model for determining "correction" suggestions 
 when used with a transition diagram parsing system. However, a 
 number of important changes must be made to apply Levy's basic 
 ideas in an interactive environment. For one thing, in Levy's 
 system, all of the interpretations had to be initially computed for 
 the error situation, and then a "best" interpretation selected and 
 used to correct the program. However, in an interactive environ- 
 ment, it is not necessary for the error system to guess which error 
 interpretation is the "best" interpretation; the interpretations 
 can be given successively to the programmer, and the programmer can 
 
48 
 
 then select the appropriate interpretation and use it to correct 
 the program. In this way, the error system is acting not so much 
 like an automatic error correction and recovery system as like an 
 automatic error "consultant." When an error is detected by the 
 compiler, the error system then provides the programmer with the 
 different possibilities for correcting the error. 
 
 Since the different error interpretations are going to 
 be given successively to the programmer, it is not necessary for 
 all of the interpretations to be computed immediately upon the de- 
 tection of an error by the compiler. In an interactive environment, 
 it is important to try to spread the computing load out over a 
 period of time if possible, and to avoid heavy bursts of calcula- 
 tions in short periods of time. Also, since the programmer will 
 frequently examine more than one error interpretation for an error 
 situation, it is desirable that the suggested corrections to the 
 program be given to the programmer in an organized order. 
 
 It is proposed that instead of the error system initially 
 moving back in the input string to the left context location and 
 then starting the computations of the error interpretations, the 
 error system should compute the interpretations in the reverse 
 order. Initially, the error system will compute all error inter- 
 pretations assuming that the entire input string up to the error 
 token is correct and the error token itself is in error. As dif- 
 ferent interpretations are found, they are presented to the 
 
49 
 
 programmer as in algorithm "USER_INTERACTION" (Figure 4.3). When 
 no more interpretations can be found from this input location, 
 then all interpretations (if any) will be computed from one token 
 farther back in the input string, and so on. At each of these new 
 locations the algorithm makes the assumption that the error token 
 itself is correct and that the "actual syntax error" occurred some- 
 place back in the input string. 
 
 The backup process should terminate when a token is 
 reached in the input string where it is certain that the error act- 
 ually occurred after that token. Levy defines this left context 
 location as a "beacon," and he thoroughly discusses the problems 
 associated with locating the beacon for a given error situation. 
 The reader is urged to consult this work for the details; this 
 current thesis simply assumes that such a "beacon" location can be 
 found. 
 
 This algorithm is intuitively good from the programmer's 
 point of view, since the more "local" corrections are suggested 
 first, followed later by more "global" suggestions. From a compu- 
 tational point of view, this algorithm is also by far the best 
 approach. It is wery easy to compute the initial error interpreta- 
 tions, but much more time-consuming to calculate interpretations 
 with modifications that occur far back in the input string. The 
 hope is that usually the initial interpretations and suggestions 
 will satisfy the programmer and, while always available, only occa- 
 sionally will more complex interpretations need to be determined. 
 
50 
 
 Finally, note one more major departure from Levy's 
 system. In the interactive compiling environment that was de- 
 scribed in the previous chapters, there is no right context avail- 
 able when an error is detected and analyzed. Therefore, although 
 the input context to the left of the error token is still available 
 and important, there is no context to the right of the error token 
 in the input string. Because of this, the error interpretations 
 as described by Levy are "equivalent" in the interactive environ-, 
 ment when all of the input string up to and including the error 
 token has been considered. Another way of describing this is to 
 say that the error system can compute error interpretations of the 
 input string that just consider the left context and the actual 
 error token; it is then the responsibility of the programmer to 
 (mentally) supply the intended right context (that has not yet been 
 entered into the program) to decide if a particular error interpre- 
 tation constitutes the appropriate correction to the program. Thus, 
 the overall diagnostic system in an interactive environment re- 
 quires the joint participation of both the analysis system's inter- 
 pretation routine and the analysis of the programmer to 
 satisfactorily diagnose the syntax error. 
 
 5.2.3 Token Option Lists 
 
 Before the preliminary model of the "suggestion builder" 
 routine can be given, one more important routine must be mentioned. 
 
51 
 
 It is possible, for each state in a transition diagram, to con- 
 struct a list containing all of the tokens that are acceptable to 
 the parser if it is in that state. This list corresponds to what 
 Griffiths [Gri., '74] has called a "director symbol set" in LL(k) 
 terminology; it will be called the "token" option list in this 
 thesis, and the routine that builds the option list for a state 
 will be called the OPT routine. 
 
 It is very important to realize that the acceptable 
 token list for a state depends not only on just the transition 
 diagrams themselves, but also on a particular complete parser en- 
 vironment; the token list for a state will vary depending on other 
 elements of the parser's environment. If all of the branches in 
 the transition diagram that leave state S are labelled with just 
 token specifications, then those tokens comprise the entire list 
 for state S. However, if one of the branches invokes another 
 transition diagram (with entry state Q), then any simple tokens 
 that are specified for the original state, S, must be added to the 
 list, and then the entry state Q for the invoked transition diagram 
 must be examined as above. Likewise, if state S is a final state 
 for some transition diagram, then any simple tokens that are spec- 
 ified for state S must be added to the list, and then the current 
 transition diagram exited according to the parser's stack at that 
 time . Eventually, all of the possible tokens will be resolved and 
 added to the list, since the parser accepts only actual tokens in 
 the input string. 
 
52 
 
 begin transit (T, S) 
 
 comment T is an input token, and S is a parser state. 
 
 comment transit is a function that returns a value. 
 The value is a new state number if a parser 
 state transition from STATE S with token T can be 
 made. 
 Otherwise the value is "null." 
 
 comment The function is not detailed here since the explicit 
 actions depend directly upon the compiler's imple- 
 mentation. 
 
 end transit 
 
 Figure 5.1 - The "transit" function 
 5.2.4 The Preliminary Model 
 
 Given that this token option list can be determined for 
 any state, the preliminary algorithm consists of three routines: 
 "MODIFY," "REPARSE," and "USERJNTERACTION." MODIFY will make a 
 modification to the input string and invoke REPARSE to determine 
 whether the corrected string is then acceptable. If so, REPARSE 
 will invoke USER_INTERACTION; otherwise it will recursively invoke 
 MODIFY to make further modifications if the current interpretation 
 is still not acceptable. 
 
 Both MODIFY and REPARSE use a function "transit." A 
 
53 
 
 description of the effect of the transit function is given in 
 Figure 5.1. The function accepts two parameters: 
 
 T, an input token, and 
 
 S, a state of the parser. 
 The function checks to see if a legal parser state transition can 
 be made from STATE S with the token T. If so, then transit returns 
 the value of the new state number; otherwise transit returns a 
 "null" value. 
 
 The MODIFY routine has three input parameters: 
 
 L, a pointer to some location in the input string 
 (indexed from the beginning of the string), 
 
 Q, the corresponding parser state that is expecting the 
 token at location L, and 
 
 K, a counter indicating the number of changes that have 
 currently been made to the input string. 
 When the MODIFY routine is activated, first the token option list 
 for state Q is constructed. Then, for each token option in the 
 list, two modifications can be made: Add the token option to the 
 input string in front of the token at location L, or Replace the 
 token at location L with the token option. After the above modifi- 
 cations have been tried for each token option, a final modification 
 consisting of Suppressing the actual token at location L from the 
 input string can be tried. 
 
 Further describing these operations, if the MODIFY 
 
54 
 
 routine tries an Add operation, it will try to add the token option 
 to the input string in front of the token at location L; if the 
 original input string is ><a, and the token option is t, then the 
 modified input string is txa, starting in parser state Q. However, 
 this can be viewed with the same effect as the modified input string 
 xa starting in parser state CL, where Q t is the state of the parser 
 following the parsing of token t when in state Q (that is, Q t = 
 transit (t, Q)). Said differently, the Add operation is 
 
 A. ( xa) = txa = xa, 
 
 t (Q) (Q) (Q t ) 
 where the lower line indicates the state of the parser at that 
 location in the input string. 
 
 Similarly, the Replace operation can be specified as 
 
 R + ( x a ) = ta = a, 
 1 (Q) (Q) (QJ 
 
 and the Suppress operation can be specified as 
 
 S. ( xa) = a. 
 * (Q) (Q) 
 
 For each of the modifications described above, the re- 
 mainder of the modified input string must then be reparsed to see 
 if it now comprises a correct, syntactically legal string. The 
 reparsing controller, called REPARSE, has the same formal input 
 parameter list as does the MODIFY routine: 
 
55 
 
 LR, a pointer to some location in the input string 
 (indexed from the beginning of the string), 
 
 QR, the corresponding parser state that is expecting the 
 token at location LR, and 
 
 KR, a counter indicating the number of changes that 
 have currently been made to the input string. 
 REPARSE is a highly self-recursive routine. When invoked, the 
 routine first checks to see if the current LR input location has 
 reached the end of the input string. If so, then the reparse is 
 successful and a correct interpretation has been found. Whatever 
 modifications were made to the input string can be used as a "pos- 
 sible correction" suggestion by the error system routine 
 "USER_I INTERACTION," since making those corrections to the input 
 string yields a syntactically correct interpretation of the input. 
 
 If the input is not yet exhausted for the current invok- 
 ing of REPARSE, then the routine will first extend the current 
 reparse attempt by one more token if the token at location LR is 
 acceptable to the parser from STATE QR. The extension operation 
 consists of recursively invoking the REPARSE routine with the input 
 pointer and corresponding parser state advanced by one token. 
 
 Finally, the REPARSE routine will recursively invoke the 
 MODIFY routine at the current input location if the maximum number 
 of modifications have not already been made in the current reparse 
 attempt. The MODIFY routine will then make some modifications at 
 the new location in the input and again initiate the reparsing. 
 
56 
 
 The effect of the operation of the MODIFY and REPARSE 
 routines is to construct all possible alternate legal interpreta- 
 tions of the input string with N or fewer modifications. Further- 
 more, since the REPARSE routine tries to extend the current reparse 
 attempt before forcing another modification to be made, legal 
 interpretations first are found with modifications closest to the 
 original error token location ("local" modifications), with later 
 legal interpretations containing modifications farther away from 
 the error token ("global" modifications). 
 
 Algorithms for the token MODIFY and the REPARSE routines 
 are given in Figures 5.2 and 5.3, respectively. In the REPARSE 
 algorithm, "error_token_location" refers to the location of the 
 original token that caused the parser to signal an error; this lo- 
 cation is the location of the rightmost token in the input string. 
 Also in the REPARSE algorithm, the action "success" activates the 
 "USER_INTERACTION" algorithm with a "possible correction" sugges- 
 tion; for clarity, this linkage is not given in Figure 5.3. 
 
 In the MODIFY algorithm, notice the three different 
 references to the REPARSE algorithm. In all three, the change 
 count field KR is incremented by 1 to indicate that a new modifi- 
 cation has been made. The specification of a REPARSE LR location 
 of L + 1 indicates that the reparse attempt is to begin with the 
 next token in the input string following the one at location L. 
 This corresponds to the preceding discussion on the effect of the 
 Add, Replace, and Suppress operations. 
 
57 
 
 begin MODIFY (L, Q, K) 
 
 comment L is the input location pointer (indexed from the 
 beginning of the input string), 
 Q is the parser state for token at location L, 
 K is number of modifications that have been made 
 prior to current activation. 
 // OPT : build token o ption list for state Q// 
 for all te token_option_l ist loop : 
 
 // REPARSE (L, transit (t, Q), K + 1)// comment Add 
 
 operation. 
 // REPARSE (L + 1, transit (t, Q) , K + 1 )// comment 
 Replace operation. 
 repeat 
 
 // REPARSE (L + 1, Q, K + 1) // comment Suppress operation. 
 end MODIFY 
 
 Figure 5.2 - "Token" MODIFY routine 
 
58 
 
 begin REPARSE (LR, QR, KR) 
 
 comment LR is reparse start input location index, 
 
 QR is reparse start state for location LR, 
 
 KR is number of modifications made to input string. 
 
 comment Is reparse attempt successful? 
 
 if LR > error_token_location then "success"; 
 
 else 
 
 comment Extend reparse attempt 1 token if next token 
 
 is acceptable. 
 
 if transit (token (LR), QR) f "null" then 
 
 //REPARSE (LR + 1, transit (token (LR), QR) , KR)// 
 
 II 
 
 comment Make another modification at location LR, if 
 
 possible, to build all legal interpretations. 
 
 if KR < N then 
 
 //MODIFY (LR, QR, KR)// 
 
 fi 
 
 II 
 
 end REPARSE 
 
 Figure 5.3 - REPARSE routine 
 
 5.2.5 Examples using the Preliminary Model 
 As an example of the analysis performed by the preliminary 
 model that has been presented, consider the transition diagram El 
 
59 
 
 shown in Figure 3.1. For transition diagram El, Table 5.1 shows 
 the token options that can be computed for each of the different 
 states. 
 
 A few specific unacceptable input strings, along with 
 some possible interpretations for these strings, are shown in Table 
 5.2. This table gives the original token input string and then a 
 number of different "corrected" strings; for each "corrected" 
 string interpretation, the number of modifications required to ob- 
 tain the error interpretation is given, as well as a sample "sug- 
 gestion" message that could be used to describe the modifications 
 that were made. 
 
 STATE 
 
 TOKEN 
 
 OPTIONS 
 
 
 "b" 
 
 "b," 
 
 "d" 
 
 "c" 
 
 
 "d" 
 
 
 "e" 
 
 
 none 
 
 
 ii .pi 
 
 
 SI 
 S2 
 S3 
 S4 
 S5 
 S6 
 S7 
 
 Table 5.1 - Token Options for States of Transition 
 
 Diagram El 
 
60 
 
 E 
 rO 
 
 o 
 
 S- 
 4- 
 
 <D 
 
 CD 
 ■=C 
 
 co 
 co 
 
 CO 
 
 LU UJ 
 _J CD 
 Q_ CD 
 
 2: z=> 
 
 <C CO 
 
 co = 
 
 
 
 
 3 
 
 
 - 
 
 
 z 
 
 
 
 z 
 
 
 -a <+- 
 
 
 
 
 Q. 
 
 
 
 
 
 z 
 
 
 "O 
 
 
 e o 
 
 
 
 
 E 
 
 
 4- 
 
 
 4- 
 
 • 
 
 
 z 
 
 
 ro 
 
 z 
 
 - 
 
 - 
 
 •r- 
 
 z 
 
 o 
 
 z 
 
 O 
 
 o 
 
 z 
 
 
 
 4-> 
 
 • 
 
 • 
 
 • 
 
 
 • 
 
 
 • 
 
 
 z 
 
 • 
 
 T3 
 
 
 z E 
 
 CD 
 
 _Q 
 
 o 
 
 ai 
 
 M- 
 
 +-> 
 
 a> 
 
 +-> 
 
 
 jQ 
 
 £ 
 
 
 x> o 
 
 = 
 
 = 
 
 = 
 
 x: 
 +-> 
 
 = 
 
 c 
 o 
 
 = 
 
 o 
 
 +-> 
 
 - 
 
 rO 
 
 
 z S_ 
 4- 
 
 .E 
 
 .e 
 
 x: 
 
 
 sz 
 
 s- 
 
 x: 
 
 t_ 
 
 •r- 
 
 X 
 
 z 
 
 
 X: 
 
 +-> 
 
 -t-> 
 
 +-> 
 
 E 
 
 +-> 
 
 4- 
 
 +-> 
 
 4- 
 
 2 
 
 +-> 
 
 #i 
 
 
 -4-> E 
 
 • ^ 
 
 • r™ 
 
 • r~ 
 
 O 
 
 •r- 
 
 
 •^ 
 
 
 
 •^ 
 
 O 
 
 
 •i— "i — 
 
 2 
 
 s 
 
 s 
 
 S- 
 
 3 
 
 e 
 
 s 
 
 c 
 
 z 
 
 2 
 
 z 
 
 
 2 _ 
 
 
 
 
 4- 
 
 
 •r— 
 
 
 •r™ 
 
 4- 
 
 
 
 
 = 4- 
 rO - 
 
 u 
 
 -a 
 
 4- 
 
 z 
 
 u 
 
 — 
 
 ■b 
 
 z 
 
 
 ~ai 
 
 n 
 
 
 z 
 
 z 
 
 z 
 
 rO 
 
 z 
 
 ro 
 
 z 
 
 r0 
 
 CL) 
 
 z 
 
 XJ 
 
 z 
 
 z 
 
 
 
 
 z 
 
 
 z 
 
 
 z 
 
 O 
 
 
 z 
 
 • 
 
 +J 
 
 at 
 
 CD 
 
 a> 
 
 
 cu 
 
 
 a; 
 
 
 rO 
 
 OJ 
 
 
 <d 
 
 CD S_ 
 
 u 
 
 o 
 
 a 
 
 ai 
 
 u 
 
 +-> 
 
 o 
 
 +J 
 
 n— 
 
 o 
 
 +-> 
 
 z 
 
 o a» 
 
 rO 
 
 rO 
 
 ro 
 
 > 
 
 ro 
 
 S- 
 
 rO 
 
 S- 
 
 Q. 
 
 ro 
 
 S_ 
 
 
 rO CO 
 
 ^~ 
 
 ^~ 
 
 ^~ 
 
 o 
 
 i^ 
 
 <u 
 
 ^— 
 
 O) 
 
 CD 
 
 1— 
 
 <U tj- 
 
 r— E 
 
 Q. 
 
 Q- 
 
 Q. 
 
 E 
 
 Q. 
 
 CO 
 
 Q. 
 
 co 
 
 a: 
 
 a. 
 
 (/) 
 
 O 
 
 Q.H- 1 
 
 CD 
 
 01 
 
 a) 
 
 CD 
 
 a> 
 
 E 
 
 ai 
 
 e 
 
 
 <D 
 
 c 
 
 
 a; 
 
 Cd 
 
 ai 
 
 cc 
 
 cc 
 
 ai 
 
 i— i 
 
 a; 
 
 1— « 
 
 
 o; 
 
 1— 1 
 
 
 ct 
 
 rO 
 
 S- 
 en 
 ro 
 
 Q 
 
 O 
 •r— 
 ■4-> 
 •r" 
 
 CO 
 
 E 
 fO 
 
 s- 
 
 o 
 
 Q 
 
 a. r^ 
 o 
 oo 
 
 a; lu 
 
 UJ CD 
 
 CM 
 
 CO 
 
 CM 
 
 to 
 
 E 
 
 O 
 •r- 
 +J 
 
 rO 
 4-> 
 
 CD 
 
 s_ 
 
 Q. 
 
 S- 
 <D 
 
 +-> 
 
 i. 
 o 
 
 s- 
 
 o_ 
 a: a: 
 
 O LxJ 
 OL \— 
 
 a: : 
 
 
 
 
 
 
 
 
 
 
 CD 
 
 
 
 
 
 
 
 
 
 -o 
 
 
 T3 
 
 
 CD 
 
 U 
 
 o 
 
 
 
 O 
 
 Cut 
 
 o 
 
 
 O 
 
 CD 
 
 "O 
 
 J3 
 
 JD 
 
 4- 
 
 4- 
 
 .a 
 
 4- 
 
 JD 
 
 XI 
 
 XI 
 
 4- 
 
 fO 
 
 fO 
 
 fO 
 
 JD 
 
 J2 
 
 ro 
 
 J3 
 
 ro 
 
 ro 
 
 ro 
 
 XI 
 
 I 
 
 CM 
 LO 
 
 0) 
 
 x> 
 ro 
 
 CD 
 
 t— 
 CO 
 
 o ; 
 
 u 
 
 A3 
 
 X) 
 rO 
 
 U 
 XJ 
 
 CD 
 ro 
 
61 
 
 STATE TOKEN OPTIONS 
 
 51 "a," "b" 
 
 52 "f," "b" 
 
 53 "d" 
 
 54 "e" 
 
 55 "c" 
 
 56 "f," "b" 
 
 57 "g" 
 
 58 "c" 
 
 59 "d" 
 SIO none 
 
 Table 5.3 - Token Options for States of Transition 
 Diagrams E2 and E3 
 
 As a second example of the analysis performed by the 
 diagnostic model, consider the more complex transition diagrams E2 
 and E3 shown in Figure 3.3. Table 5.3 shows the different token 
 options that can be computed for each state in the diagrams; notice 
 that to obtain the options for state S2 requires examining the 
 token options for the entry state for diagram E3, state S6, and 
 similarly to obtain the options for state S9 requires returning 
 from diagram E3 back to state S3, and then examining the token op- 
 tions for state S3. 
 
 A few unacceptable input strings for the diagrams E2 and 
 E3, along with some possible error interpretations for these 
 strings, are shown in Table 5.4. 
 
62 
 
 co 
 CO 
 
 LlJ 
 
 CO 
 
 —I C3 
 
 ct CO 
 CO = 
 
 
 
 o 
 
 u 
 
 
 
 s_ 
 
 = 
 
 
 
 4- 
 
 4- 
 
 z 
 
 - 
 
 c 
 
 O 
 
 • 
 
 « 
 
 •r— 
 
 
 «f- 
 
 x) 
 
 
 ■M 
 
 — 
 
 — 
 
 s 
 
 C 
 
 
 
 en 
 
 o 
 
 _sr 
 
 .c 
 
 r 
 
 S- 
 
 4-> 
 
 +j 
 
 
 4- 
 
 *r" 
 
 •^ 
 
 •a 
 
 
 s 
 
 3 
 
 c 
 
 c 
 
 
 
 fO 
 
 •f"" 
 
 <D 
 
 
 
 
 
 ■o 
 
 
 
 
 
 E +J 
 
 
 
 
 
 (O £= 
 
 — 
 
 
 
 — 
 
 O 
 
 • 
 
 
 • 
 
 • 
 
 = s- 
 
 _Q 
 
 
 +-> 
 
 Ol 
 
 X) 4- 
 
 — 
 
 
 3 
 
 r 
 
 z 
 
 
 
 Q. 
 
 
 C 
 
 4- 
 
 
 C 
 
 4- 
 
 4- -r- 
 
 O 
 
 = 
 
 •r- 
 
 O 
 
 O _ 
 
 4-> 
 
 ■o 
 
 <L> 
 
 +J 
 
 +-> T3 
 
 C 
 
 s 
 
 .c 
 
 C 
 
 C = 
 
 o 
 
 
 +-> 
 
 o 
 
 o 
 
 i- 
 
 .c 
 
 
 S- 
 
 s_ -o 
 
 4- 
 
 +J 
 
 E 
 
 4- 
 
 4- C 
 
 
 •r- 
 
 o 
 
 
 rO 
 
 c 
 
 2 
 
 s- 
 
 C 
 
 c 
 
 •r - 
 
 
 4- 
 
 •r— 
 
 •r— — 
 
 rO 
 
 <o 
 
 0) 
 
 (0 
 
 u 
 
 o = 
 
 ro 
 
 0) 
 
 CL) 
 
 Ol 
 
 -o 
 
 
 CD 
 
 <u 
 
 
 01 
 
 
 
 S- 
 
 o 
 
 o 
 
 +j = 
 
 +-> 
 
 O 
 
 u 
 
 +-> 
 
 o 
 
 o> 
 
 -t-> 
 
 4-> <D 4- 
 
 ro 
 
 <o 
 
 S- 
 
 S- 
 
 rO 
 
 re 
 
 S- 
 
 ro 
 
 > 
 
 s- 
 
 S- w o 
 
 
 
 <D 4- 
 
 (U 
 
 
 ^~ 
 
 <u 
 
 !— 
 
 o 
 
 o> 
 
 (1) c 
 
 Q. 
 
 Q. 
 
 W O 
 
 (/> 
 
 Q. 
 
 o. 
 
 in 
 
 Q. 
 
 E 
 
 </) 
 
 (/> i— i 
 
 Ol 
 
 <D 
 
 c 
 
 c 
 
 CD 
 
 (U 
 
 c 
 
 0J 
 
 O) 
 
 e 
 
 c 
 
 cc 
 
 cc 
 
 1— 1 
 
 1— 1 
 
 cc: 
 
 cc 
 
 ►—i 
 
 ct: 
 
 o; 
 
 t— 1 
 
 l-H 
 
 CO 
 LU 
 
 •a 
 
 c 
 re 
 
 CM 
 
 LlJ 
 
 in 
 
 E 
 ro 
 
 s- 
 cn 
 
 c 
 o 
 
 in 
 
 c 
 
 rO 
 
 i. 
 
 o 
 
 Q 
 <=C 
 
 LU 
 
 O 
 CO 
 
 CC LU 
 
 LU C3 
 
 CQ 
 
 i— r— C\J 
 
 
 in 
 
 c 
 o 
 
 •r— 
 +J 
 rO 
 +■> 
 <L) 
 S- 
 Q. 
 S- 
 
 a> 
 +-> 
 
 c 
 
 cc 
 a. 
 
 cc cc 
 
 O lu 
 
 cc I 
 
 cc 
 
 
 
 
 
 
 
 
 
 
 
 Ol 
 
 
 
 T3 
 
 
 
 
 -b 
 
 -b 
 
 
 
 XJ 
 
 
 
 cn 
 
 ~o 
 
 
 ~a> 
 
 o 
 
 o 
 
 ~CD 
 
 ~<U 
 
 o 
 
 4- 
 
 lb 
 
 4- 
 
 XI 
 
 ~o 
 
 o 
 
 .a 
 
 -Q 
 
 O 
 
 O 
 
 J=> 
 
 ra 
 
 fO 
 
 f0 
 
 rO 
 
 XI 
 
 XI 
 
 rO 
 
 ro 
 
 -Q 
 
 X) 
 
 fO 
 
 o 
 
 LO 
 
 CD 
 
 ^- 
 X) 
 
 ro 
 
 CJ3 
 
 cc 
 
 \- 
 co 
 
 i^ CU 
 
 o z: 
 
 -o 
 
 fO 
 
 
 T3 
 
 CU 
 
 o 
 
 
 a 
 
 O 
 
 JD 
 
 ~0) 
 
 ro 
 
 XJ 
 
 « 
 
 X) 
 
63 
 
 5.3 Extended Suggestion Model 
 
 5.3.1 Using Nonterminal Information 
 
 One of the objectives of an interactive diagnostic 
 system is to provide for the programmer as much information about 
 the state of the parse and the internal state of the compiler 
 as is possible. The preliminary suggestion model presented 
 earlier in this chapter (Figures 5.1 and 5.2) is able to provide 
 much information about modifying individual tokens in the input 
 string to correct the error. However, no information is available 
 in the preliminary diagnostic model about higher-level language 
 features such as "expressions," "subscripts," "statements," etc., 
 even though the transition diagram parser system that is being 
 used in the compiler provides and uses this information. 
 
 In Chapter 3 of this thesis, the action of a transition 
 diagram parser is described as making a state transition from a 
 current state to some next state, depending on the input token and 
 the labeled branches for the current state. A regular parser stack 
 is used to allow the invoking of "sub" transition diagrams when 
 specified on a branch; this operation corresponds to attempting 
 to parse a "nonterminal" in the programming language, and these 
 nonterminals typically represent higher-level language constructs 
 in the language. 
 
 > 
 
 Therefore, for any state in the parser, it is possible 
 
64 
 
 to determine not only what actual tokens are acceptable parsing 
 options, but also what nonterminals , if any, are being sought. It 
 is possible, for any state, to build a list containing all token 
 options and all intermediate nonterminals that are acceptable to 
 the parser. The algorithm for this is almost the same as the al- 
 gorithm for the token option builder that was described in section 
 5.2.3; the only difference is that when a branch for a state refers 
 to a nonterminal, then first that actual nonterminal is added to 
 the list, and then the nonterminal's entry state is examined. This 
 new list will be called the general option list . 
 
 It is also possible to view the input string either as 
 the string of actual tokens that have been input, or else as a 
 string of tokens and nonterminals that have been parsed. Using the 
 example E3 given in Figure 3.3, the input string "abed" can be 
 viewed either as 
 
 "a b c d" 
 or as 
 
 "a E3 d." 
 
 To use these new ways of viewing the input string and 
 the requirements of the parser for any state, it is necessary to 
 change the activities of the MODIFY routine. In the preliminary 
 version of MODIFY the idea was, for each token option in the token 
 option list for a state, to create alternate interpretations of 
 the input by applying the "A" and the "R" operations, and 
 
65 
 
 subsequently the "S" operation, with respect to the next token 
 in the input string. In the new version, we want to create alter- 
 nate interpretations of the input by applying, for each general 
 option in the general option list , the "A" and the "R" operations, 
 and subsequently the "S" operation, with respect to both the next 
 token and the next nonterminal (if any) in the input string. 
 
 More formally, if the input string is 
 
 x 3 a 
 and the nonterminal B is 
 
 B = x 3, 
 
 then the input can be viewed as 
 
 x 3 a or B a, for state Q. 
 The "A" operation, for some general option , g, changes the input to 
 
 A ( x3a)= gx ea 
 9 (Q) (Q) 
 
 x 3 a, 
 
 <V 
 
 or 
 
 A ( B a ) = gBa= Ba, 
 9 (Q) (Q) (Q g ) 
 
 where Q is the parser state following the parsing of option g 
 from state Q. This operation, then, specifies that a general op- 
 tion (that is, some acceptable token or some acceptable nonterminal) 
 be added to the input string in front of the remainder of the 
 
66 
 
 string. Note that for the "A" operation, it makes no difference 
 how the remainder of the input is viewed, since the operation 
 simply inserts an option in front of it. 
 
 For the "R" and "S" operations, however, viewing the in- 
 put string as a simple string of tokens as opposed to a string of 
 tokens and nonterminals makes a big difference. The "R" operation, 
 for some general option, g, changes the input to either 
 
 R_ ( x 3 a) = g e a 
 9 (Q) (Q) 
 
 3 a, 
 
 or 
 
 R ( B a) = g a = a, 
 
 9 (Q) (Q) (Q g ) 
 
 depending on how the input is viewed. This operation implies the 
 replacement of both the token and the nonterminal (if any) in the 
 input string, with each general option for state Q (which may it- 
 self be either a token option or a nonterminal option); thus, all 
 replacement combinations of tokens and nonterminals are performed. 
 Similarly, the "S" operation changes the input to either 
 
 S_ ( x g a) = ga, 
 9 (Q) (Q) 
 
67 
 
 or 
 
 S. ( B a) = a, 
 9 (Q) (Q) 
 
 depending on how the input is viewed. This operation implies the 
 suppression of both the token and the nonterminal (if any) in the 
 input string. 
 
 The second version of the MODIFY routine is given in 
 Figure 5.4; this version incorporates the general option concept. 
 The REPARSE routine is the same for this version of MODIFY as 
 
 begin MODIFY (L, Q, K) 
 
 comment L is the input location pointer (indexed from the 
 beginning of the input string), 
 Q is the parser state number for token at location L, 
 K is number of modifications that have been made 
 prior to current activation. 
 //OPT : build general option list for state Q// 
 /Lnt «- location in input following the nonterminal that 
 
 begins with token at location L; if no nonterminal begins 
 there, set Lnt «- 0/ 
 for all ge general_option_list loop : 
 
 comment REPARSE routine's 4th parameter is set to 
 
 1) 'unsuccessful' : if previous modification did 
 not completely "correct" input string, 
 
68 
 
 2) 'help' : if previous modification "corrected" 
 
 input string, but programmer requested 
 another suggestion 
 upon returning to MODIFY. 
 comment Try Add operation. 
 
 //REPARSE (L, transit (g, Q), K + 1, last_try)// 
 comment Prune special case for Replace operation, 
 rf L f error_token_location or last_try = 'unsuccessful' 
 then 
 
 comment Token Replace operation. 
 //REPARSE (L + 1, transit (g, Q), K + 1, last_try)// 
 comment Does nonterminal exist? 
 rf Lnt f then 
 
 comment Nonterminal Replace operation. 
 //REPARSE (Lnt, transit (g, Q), K + 1, 
 last_try)// 
 
 II 
 
 f_i 
 
 repeat 
 
 comment Prune special case for Suppress operation. 
 
 rf L f error_token_location then 
 
 comment Token Suppress operation. 
 
 //REPARSE (L + 1, Q, K + 1, last_try)// 
 
 comment Does nonterminal exist? 
 
69 
 
 if Lnt t then 
 
 comment Nonterminal Suppress operation. 
 //REPARSE (Lnt, Q, K + 1, last_try)// 
 
 11 
 
 11 
 
 end MODIFY 
 
 Figure 5.4 - MODIFY routine, version 2 
 
 it was in Figure 5.3, except that it returns whether or not the re- 
 parse attempt was successful. Also, the transit function now 
 accepts a general option as its first parameter, instead of just 
 a simple token option. 
 
 Two other changes have been made to this MODIFY routine; both 
 of these changes have to do with special cases of applying the 
 "A," "R," and "S" operations. The "S" operation restriction is 
 ^ery easy to understand. Since no right context is available when 
 an error is originally detected by the parser, it is true that a 
 wery trivial (though "possible") way of correcting the program is 
 to suppress the last token that was input; however, this does not 
 provide any more information for the programmer than was already 
 known—that is, the last input token is unacceptable. To avoid 
 this trivial suggestion, the "S" operation is not applied if the 
 input location L (where the correction is to be made by the MODIFY 
 routine) is the same as the location of the actual original error 
 
70 
 
 token. This restriction prevents the diagnostic system from "find- 
 ing" this "possible correction." 
 
 The second change has to do with the "A" and the "R" op- 
 erations. Assume that the input location L is the same as that of 
 the actual original error token. Then note that if an "A" opera- 
 tion is successful (that is, if some option can be successfully 
 added to the input string in front of the error token), then it is 
 also true that the "R" operation will succeed (replacing the error 
 token with the option will yield a successful interpretation); this 
 is again because of no right context being available for analysis. 
 Therefore, to avoid nearly duplicate suggestions from being compu- 
 ted by MODIFY, if the "A" operation is successful for some general 
 option with respect to the error token, then the "R" operation for 
 that general option is not performed. 
 
 In summary, both of these special cases occur because of 
 no right context being available in the input string. This forces 
 slightly different considerations to be given to any changes that 
 are attempted with respect to the location of the actual original 
 error token in the input string. 
 
 5.3.2 Providing Programmer Control of Messages 
 
 A typical technique used to construct the general option 
 list for a given parser state is to use a normal depth-first back- 
 tracking algorithm. This algorithm will begin by starting to 
 examine each labeled branch for the original parser state. If the 
 
71 
 
 label on a branch is a simple token specification, then that token 
 can be added to the general option list, and the next branch 
 examined. However, if the label is a nonterminal specification, 
 then first that nonterminal is added to the general option list 
 and then the branches for the entry state of the nonterminal are 
 each examined in a similar fashion. When all of the branches for 
 a particular state have been examined, the algorithm must backtrack 
 to the state and branch where that entry state was invoked. The 
 algorithm terminates when all of the branches for the original 
 state have been examined. 
 
 If this depth-first technique is used, it is possible to 
 provide some organization for the entries in the general option 
 list. In particular, it is possible to view the general option 
 list as a tree structure; the root node of the tree represents the 
 original parser state Q; each son of the root node represents a 
 different parser option for the state Q; finally, each of these 
 sons that represent a nonterminal option as opposed to a simple 
 token option is the root of a subtree that contains the further re- 
 fining detail for that higher-level option. All of the leaves of 
 the tree are simple token option nodes and all of the internal nodes 
 in the tree are nonterminal nodes. This tree will be called the 
 general option tree . 
 
 Using this general option tree, the MODIFY routine has a 
 choice about which general option to use for its next loop 
 
72 
 
 iteration: it can either proceed breadth first (across the initial 
 sons of each node in the general option tree), or depth first (ex- 
 hausting each son and corresponding subtree in the general option 
 tree before proceeding to the following son). To avoid confusing 
 the programmer, it seems intuitively best to use the depth-first 
 order of selecting the next option. This has the effect of provid- 
 ing suggestions initially about a higher-level construct (nonter- 
 minal), with successive suggestions providing more and more detail 
 about the high-level construct, until the detail reaches the actual 
 token level. Note that since each node in the tree is actually a 
 particular parsing option (all that the tree structure does is 
 relate some of the options together), even if a modification at- 
 tempt fails using a nonterminal option at a node high in the tree, 
 an attempt with a descendent of that node may succeed; it is neces- 
 sary to examine all the nodes in the tree to determine all possible 
 corrections to the program. When all of the detail farther down 
 the tree has been examined for one option (son) of a node, then 
 another, different option (son) can be selected for that node and 
 the depth-first examination resumed. When all of the options have 
 been examined for a node, then the control moves one node farther 
 back up the tree and the process is repeated. 
 
 With the general option tree structured in this way, it 
 is possible for the programmer to control the amount of detail that 
 is presented in some of the suggestion messages. If a message is 
 
73 
 
 presented to the programmer that suggests that Adding or Replacing 
 something in the program with a nonterminal will yield a correct 
 interpretation, then the programmer can request that more detail be 
 presented about the same suggestion message (if available); on the 
 other hand, the programmer may indicate that no more detail is 
 wanted about that suggestion message, but instead some other unre- 
 lated suggestion should be presented (if possible). As long as 
 the programmer requests the more detailed suggestion messages, the 
 MODIFY routine can just proceed with trying general options in the 
 normal depth-first order. However, if a request for a "different" 
 suggestion message is made by the programmer and the leaf (token) 
 node has not been reached for the current path, then the depth- 
 first order must be abandoned, and instead a new option (son) for 
 the current node examined (i.e., move breadth-first instead of 
 depth-first). 
 
 To handle the depth-first tree examination properly, the 
 MODIFY routine has been divided into two routines: the new MODIFY 
 routine creates the general option tree, invokes the new M0D_TREE 
 routine to try the Add and the Replace operations in a depth-first 
 order for the general option tree, and finally tries the Suppress 
 operations. The M0D_TREE routine will move depth-first through the 
 option tree unless the programmer requests that the detail about 
 the current suggestion be restricted. 
 
 The final forms for the USERJNTERACTION, REPARSE, MODIFY 
 and M0D_TREE routines are given in Figures 5.5, 5.6, 5.7, and 5.8, 
 
74 
 
 respectively. These algorithms incorporate the changes to provide 
 the programmer with control over the amount of detail that is pre- 
 sented in successive suggestions. 
 
 begin USERJNTERACTION (request) 
 
 comment request is an output parameter. It is set to either 
 'successful-provide detail' or 'successful-restrict 
 detail, 1 depending upon the programmer's keypress 
 response. 
 
 comment USER_I INTERACTION displays a new suggestion message on 
 the screen and then waits for the programmer to respond. 
 If the programmer requests another suggestion message, 
 then "request" is set and the routine returns normally. 
 If the programmer responds with any other compiler edit- 
 ing key, then this routine forces the termination of the 
 error analysis process, and exits directly into the com- 
 piler's editor system. 
 
 comment Display new message. 
 
 //erase : current message on screen// 
 
 //display : new message on screen// 
 
 comment "Read-edit-key" waits until an editing key or one of 
 the "help request" keys is pressed; the key- type is 
 returned in "Read-key." 
 
 /Read-edit-key/ 
 
 if Read-key = Compiler-editing-key then 
 
 /reestablish program editing environment/ 
 
75 
 
 /exit from error system to the compiler's editor/ 
 
 11 
 
 comment The programmer has requested another suggestion. 
 
 The normal request is the 'HELP' key, which requests that 
 more detail be provided (if possible). The alternate 
 request is to restrict detail in the next message. 
 
 request +■ if Read-key = 'HELP' then 'successful -provide 
 detail' 
 else 'successful-restrict detail' fi_ 
 
 end USERJNTERACTION 
 
 Figure 5.5 - Final USERJNTERACTION routine 
 
 begin REPARSE (LR, QR, KR, user_request) 
 
 comment LR is reparse start input location index, 
 
 QR is reparse start state for location LR, 
 
 KR is number of modifications made to input string, 
 
 user_request is an output parameter that specifies 
 
 whether the reparsing is successful or not. 
 
 /userjrequest -*- 'unsuccessful'/ 
 
 comment error_token_location is input location of the 
 
 originally detected error token. 
 comment Is reparse attempt successful? 
 11 LR > error_token_location then 
 if KR = N then 
 
 comment Reparsing was successful. Invoke routine 
 USER INTERACTION, which will either return 
 
76 
 
 with user_request set indicating the amount 
 
 of detail the programmer wants, or else 
 
 will cause the error system process to be 
 
 terminated without returning normally. 
 
 //USERJNTERACTION (user_request)// 
 
 f_l 
 
 else 
 
 comment Extend reparse attempt 1 token if next token 
 
 is acceptable. 
 
 if transit (token (LR), QR) f "null" then 
 
 //REPARSE (LR + 1, transit (token (LR), QR), KR, 
 
 user request)// 
 
 fi 
 
 comment Make another modification at location LR, if 
 possible, to build all legal interpretations, 
 if KR < N then 
 
 //MODIFY (LR, QR, KR)// 
 fj_ 
 fi. 
 end REPARSE 
 
 Figure 5.6 - Final REPARSE routine 
 
77 
 
 begin MODIFY (L, S, K) 
 
 comment L is the input location pointer (indexed from the 
 beginning of the input string), 
 Q is the parser state number for token at location L, 
 K is the number of modifications that have been made 
 prior to current activation. 
 comment OPT routine constructs the general_option_tree. 
 
 It is a tree structure, with descendents of a node repre- 
 senting further refinements or detail for the node. 
 //OPT: build general_option_tree for state Q with root 
 
 'root'// 
 /Lnt *■ location in input following the nonterminal that 
 begins with token at location L; if no nonterminal 
 begins there, set Lnt «- 0/ 
 comment Try Add, Replace operations for sons of root of tree. 
 //M0D_TREE (L, Q, K, Lnt, root)// 
 comment Prune special case for Suppress operation. 
 if L f error_token_location then 
 
 comment Token Suppress operation. 
 //REPARSE (L + 1, Q, K + 1, last_try)// 
 comment Does nonterminal exist? 
 if Lnt f then 
 
 comment Nonterminal Suppress operation. 
 //REPARSE (Lnt, Q, K + 1, last try)// 
 
78 
 
 f± 
 
 end MODIFY 
 
 Figure 5. 7 - Final MODIFY routine 
 
 begin M0D_TREE (LD, QD, KD, LntD, Droot) 
 comment LD is the input location pointer, 
 
 QD is the parser state number for token at location 
 
 LD, 
 KD is number of previous modifications, 
 LntD is location in input following nonterminal or 
 
 if no nonterminal exists, 
 Droot is the root of the (sub) tree in general_ 
 option_tree. 
 comment M0D_TREE tries Add and Replace (token and nonterminal) 
 
 operations for each son (option) of Droot. 
 comment REPARSE routine's 4th parameter is set to 
 
 1) 'unsuccessful' : if previous modification did 
 
 not completely "correct" input string, or 
 
 2) 'successful-provide detail ' or 
 'successful -restrict detail' : if previous 
 
 modification "corrected" input string, but 
 programmer requested another suggestion. 
 for all g e sons (Droot) loop : • 
 
79 
 
 comment Add operation. 
 
 //REPARSE (LD, transit (g, QD), KD + 1, last_try)// 
 
 comment Is more detail requested? 
 
 if last_try = 'successful -provide detail' then 
 
 comment Give more detail: move depth-first through 
 
 tree. 
 //M0D_TREE (LD, QD, KD, LntD, g)// 
 f_l 
 
 comment Prune special case for Replace operation. 
 if (LD f error_token_location or last_try = 'unsuccess- 
 ful ' ) and last_try f 'successful-restrict detail' 
 then 
 
 comment Token Replace operation. 
 //REPARSE (LD + 1, transit (g, QD), KD + 1 , 
 
 last_try)// 
 comment is more detail requested? 
 if last_try = ' successful-provide detail' then 
 comment Give more detail: move depth-first 
 
 through tree. 
 //M0D_TREE (LD, QD, KD, LntD, g)// 
 fi_ 
 comment Does nonterminal exist? 
 
80 
 
 if LntD f and last_try f 'successful -restrict 
 detail' 
 then 
 
 comment Nonterminal Replace operation. 
 //REPARSE (LntD, transit (g, QD), KD + 1, 
 
 lastjry)// 
 comment Is more detail requested? 
 if last_try = 'successful-provide detail' then 
 comment Give more detail: move depth- 
 first through tree. 
 //M0D_TREE (LD, QD, KD, LntD, g)// 
 
 11 
 
 11 
 
 11 
 repeat 
 
 end M0D_TREE 
 
 Figure 5.8 - Final M0D_TREE routine 
 
 5.3.3 Example using the Extended Model 
 
 As a comparative example of the error interpretations 
 that can be determined using the Extended Diagnostic Model that has 
 been described rather than the preliminary Token Model, consider 
 again the transition diagrams E2 and E3 given in Figure 3.3. Table 
 5.5 shows the different general option trees that can be computed 
 
81 
 
 for each state in the diagrams. Notice the nonterminal reference 
 for state S2, followed by the more detailed token options for that 
 state. 
 
 STATE GENERAL OPTION TREES 
 
 SI 
 
 S2 
 
 53 "d" 
 
 54 "e" 
 
 55 "c" 
 
 56 ^ "f" 
 
 - „ b „ 
 
 57 "g" 
 
 58 "c" 
 
 59 "d" 
 
 S10 none 
 
 Table 5.5 - General Options for States of Transition 
 Diagrams E2 and E3 
 
 Table 5.6 shows a few unacceptable input strings for the 
 
 diagrams E2 and E3, along with some possible error interpretations 
 
 for those strings and a comparison indicator for the preliminary 
 Token Model . 
 
82 
 
 o 
 
 CO 
 CO 
 
 CO 
 UJ LiJ 
 _I CD 
 Q. CD 
 
 s: =3 
 «=C co 
 co = 
 
 UJ >- _l 
 CD CO UJ 
 
 <=t Q 
 CO Q O 
 
 co z s: 
 
 UJ Z3 
 
 s o 2: 
 
 U- UJ 
 
 <uo 
 
 UCQh 
 
 LxJ 
 
 O 
 
 «=c 
 u_ s 
 o 
 
 CO 
 
 a: uj 
 
 UJ CD 
 
 Si < 
 
 z o 
 
 +-> 
 
 c 
 o 
 
 s- 
 
 co 
 
 s- 
 
 O) 
 
 to 
 
 c 
 
 c 
 o 
 
 en 
 j= j= = 
 
 ■4-> -M 
 
 •r- •«- T3 
 
 2 * S: 
 
 CD 
 U 
 fO 
 
 Q. 
 CD 
 
 O) 
 
 u 
 
 Q. 
 
 = -o 
 4- = 
 
 4- 
 -»-> o 
 
 s~ 
 CD 
 to 
 
 c 
 
 
 *» Z 
 
 
 
 • 
 
 
 
 = 
 
 
 CO 
 
 4- JO 
 
 
 UJ 
 
 O = 
 
 — 
 
 
 4-> CD 
 
 J3 
 
 m 
 
 C -O 
 O 
 
 = 
 
 JZ 
 
 s- - 
 
 J= 
 
 4-> 
 
 4- CO 
 
 -»-> 
 
 •r- 
 
 UJ 
 
 •C" 
 
 5 
 
 c - 
 
 •r— 
 
 s 
 
 3 
 
 C7t 
 
 z 
 
 O 
 
 = C 
 
 <o 
 
 = 
 
 
 = 
 
 CD 
 
 to 
 
 CD 
 
 <J 
 
 +-> E 
 
 O 
 
 rO 
 
 s- 
 
 «d 
 
 p— 
 
 CD 
 
 r~" 
 
 Q. 
 
 to 
 
 Q. 
 
 CD 
 
 c 
 
 CD 
 
 a: 
 
 t— 1 
 
 oc 
 
 
 J3 • 
 
 
 = 
 
 
 4- JD 
 
 = 
 
 O = 
 
 cu 
 
 +j 0) 
 
 
 C -Q 
 
 
 O 
 
 JZ 
 
 $- - 
 
 -t-> 
 
 4- co 
 
 •r— 
 
 UJ 
 
 2 
 
 c - 
 
 
 •r— 
 
 - 
 
 O) 
 
 ■0 
 
 = c 
 
 
 (O T" 
 
 
 = -^ 
 
 CD 
 
 fO 
 
 U 
 
 +-> E 
 
 <a 
 
 s_ 
 
 1— 
 
 CD 
 
 Q- 
 
 «/) 
 
 CD 
 
 c 
 
 cc 
 
 ►—I 
 
 o 
 
 c 
 
 to to to 
 CD CD CD 
 
 >> >» >> 
 
 (0 
 
 O 1- 
 C E 
 
 CD 
 
 c/1 
 
 S- 
 
 to 
 
 VI 1 — 
 
 CD t- 
 
 >> E 
 
 •p- 
 
 CO 
 UJ 
 
 •o 
 
 c 
 
 (O 
 
 CM 
 UJ 
 
 to 
 
 E 
 as 
 i- 
 CJ> 
 
 to 
 
 c 
 res 
 
 s- 
 o 
 
 1— 1— r- CVJ 
 
 i— CM 
 
 UJ 
 
 - 
 
 
 
 T3 
 
 
 
 
 
 -O 
 
 CtL 
 Q- 
 
 ce: a: 
 
 UJ 
 
 T3 
 
 CO 
 
 UJ 
 
 4- 
 
 Xi 
 
 4- 
 
 CO 
 
 UJ 
 
 
 ~o 
 
 ~<D 
 O 
 
 O 
 
 on ^— 
 en z: 
 
 fO 
 
 fO 
 
 (O 
 
 (O 
 
 CO 
 
 (O 
 
 J3 
 
 J3 
 
 <a 
 
 cd 
 
 1—4 
 
 \— 
 
 CO 
 
 o z: 
 
 to 
 
 c 
 o 
 
 •r- 
 
 +J 
 (O 
 
 +-> 
 CD 
 
 $- 
 Q. 
 S- 
 CD 
 •P 
 C 
 
 s_ 
 o 
 
 t- 
 s- 
 
 UJ 
 
 -a 
 
 CD 
 
 •o 
 
 c 
 
 CD 
 +-> 
 X 
 UJ 
 
 CD 
 
 -Q 
 «t5 
 
 -a 
 
 o 
 
 CO 
 
 O 
 
 J3 
 
83 
 
 As another example of some of the suggestions that can be 
 made using the extended suggestion model, consider transition dia- 
 grams E6 through E9, shown in Figure 5.9. The acceptable input 
 strings for these diagrams are: 
 
 "a c b a e" 
 
 "a c d e" 
 
 "a d e" 
 
 "b a c e b a" 
 
 "b a c e d" 
 
 "d c e b a" 
 
 "d c e d" 
 
 The general option trees that can be computed for each 
 state in the diagrams are shown in Table 5.7. Note that for some 
 of the states in this table, particularly states S16 and S17, 
 the general option tree depends on not only the transition diagram 
 itself, but also on the parser's stack at the time when the tree 
 is constructed. Each of these states is a final state for one of 
 the transition diagrams, and therefore it is necessary to know from 
 where that diagram was invoked in order to build the tree properly. 
 
 Table 5.8 shows a few unacceptable input strings for the 
 diagrams E6 - E9, along with some possible error interpretations 
 for those strings and some sample suggestion messages for these 
 interpretations. 
 
84 
 
 <E6> 
 
 "*" ^ <E9> . Kl rS4Vl^ fs7^ " e " fSIO 
 
 Rl 
 
 ^S5 ^ " e " 
 
 -MS> 
 
 "e" 
 
 S6 
 
 <E8> 
 
 <S)_ 
 
 <E7> 
 
 <E8> 
 
 sn >- (si 2 
 
 S13 
 
 E8> 
 
 S15 
 
 S17 
 
 <E9^ 
 
 "d" 
 
 S16 
 
 S19 
 
 S20 
 
 Figure 5.9 - Transition diagrams E6 - E9 
 
85 
 
 STATE 
 
 GENERAL OPTION TREES 
 
 SI 
 
 E7> - <E8>< 
 
 S2 
 
 S3 
 S4 
 
 <E9> 
 
 "e" 
 
 <E8> 
 
 "c" 
 
 "b" 
 "d" 
 
 S5 
 S6 
 
 S7 
 
 S8, S9, SIO 
 
 Sll 
 
 SI 2 
 S13 
 
 S14 
 
 S15 
 S16, S17 
 
 "e" 
 
 <E8> 
 
 "e" 
 
 none 
 <E8> 
 
 "c" 
 
 "e" 
 
 "b" 
 "d" 
 
 "a" 
 
 "e" 
 
 none 
 
 "c" 
 
 "b" 
 "d" 
 
 "b" 
 
 "d" 
 
 called from state SI 
 
 : called from state S4. 
 : called from state S6. 
 : called from state Sll 
 
86 
 
 STATE 
 
 GENERAL OPTION TREES 
 
 SI 8 
 
 "c" 
 
 "d 1 
 
 S19 
 
 <E8> <^ : called from state S2, 
 ^ "d" 
 
 S20 
 
 "e" 
 
 : called from state S2. 
 
 Table 5.7 - General Options for Diagrams E6 - E9 
 
87 
 
 CU 
 
 A 
 CT> 
 
 CJ3 
 
 <c 
 
 co 
 co 
 
 CO 
 
 LU LxJ 
 
 _i as 
 
 O. CJ 
 
 <: 00 
 10 = 
 
 o 
 o 
 
 CX UJ 
 
 qq : 
 2: < 
 
 rs in 
 2: o 
 
 *4- 
 
 o 
 
 o 
 
 O) 
 
 4- 
 
 A 
 
 4-> 
 
 
 4- 
 
 
 O 
 
 1^- 
 
 C 
 
 s 
 
 O 
 
 e 
 
 
 UJ 
 
 O 
 
 • 
 
 
 ■1" 
 
 4-> 
 
 V 
 
 i. 
 
 -Q 
 
 +J 
 
 
 C 
 
 
 4- 
 
 z 
 
 e 
 
 r 
 
 O 
 
 
 
 
 
 
 to 
 
 s- 
 
 -C 
 
 C 
 
 x 
 
 S- 
 
 a 
 
 4- 
 
 +-> 
 
 •r— 
 
 4-> 
 
 4- 
 
 
 
 • ^ 
 
 
 •r— 
 
 
 •O 
 
 E 
 
 3 
 
 z 
 
 s 
 
 e 
 
 e 
 
 •1— 
 
 
 A 
 
 
 •1— 
 
 (O 
 
 
 z 
 
 CO 
 
 z 
 
 
 
 z 
 
 U 
 
 LU 
 
 u 
 
 z 
 
 z 
 
 (O 
 
 = 
 
 V 
 
 s 
 
 "O 
 
 X 
 
 
 O) 
 
 
 O) 
 
 
 
 •M 
 
 
 
 •4-> 
 
 O 
 
 +J 
 
 +-> 
 
 S- 
 
 <o 
 
 J- 
 
 to 
 
 S- 
 
 &_ 
 
 O) 
 
 t— 
 
 (U 
 
 r— • 
 
 <v 
 
 CD 
 
 to 
 
 Q. 
 
 (/) 
 
 Cl 
 
 CO 
 
 to 
 
 c 
 
 0) 
 
 c 
 
 Oi 
 
 e 
 
 e 
 
 t-H 
 
 cx 
 
 1— 1 
 
 cx 
 
 h— 1 
 
 t-H 
 
 CM 
 
 4- 
 
 
 
 
 a) 
 
 
 «o 
 
 
 
 
 4- 
 
 = 
 
 s 
 
 = 
 
 ■M 
 
 
 
 4- 
 
 T3 
 
 4- 
 
 E 
 
 s 
 
 cz 
 
 O 
 
 z 
 
 O 
 
 O 
 
 • 
 
 •r— 
 
 
 
 
 S- 
 
 X 
 
 
 4-> 
 
 X 
 
 ■«-> 
 
 4- 
 
 z 
 
 z 
 
 E 
 
 4-> 
 
 E 
 
 
 
 to 
 
 O 
 
 •^ 
 
 O 
 
 e 
 
 JZ 
 
 z 
 
 J- 
 
 3 
 
 s- 
 
 •r— 
 
 +J 
 
 
 4- 
 
 
 4- 
 
 
 •^ 
 
 ■O 
 
 
 z 
 
 
 S 
 
 2 
 
 E 
 
 E 
 
 A 
 
 C 
 
 A 
 
 
 »a 
 
 •^ 
 
 tr> 
 
 •^ 
 
 CO 
 
 z 
 
 
 
 LU 
 
 
 LU 
 
 <v 
 
 z 
 
 z 
 
 V 
 
 z 
 
 V 
 
 - 
 
 X 
 
 ■O 
 
 = 
 
 X 
 
 
 <u 
 
 
 
 CD 
 
 
 +J 
 
 u 
 
 +J 
 
 4-> 
 
 
 
 +-> 
 
 i. 
 
 <a 
 
 S- 
 
 s- 
 
 <o 
 
 S- 
 
 CD 
 
 r— 
 
 <u 
 
 CD 
 
 r— 
 
 CU 
 
 to 
 
 Q. 
 
 to 
 
 to 
 
 Q. 
 
 to 
 
 e 
 
 CU 
 
 £ 
 
 c 
 
 cu 
 
 E 
 
 1— 1 
 
 cx 
 
 »— 1 
 
 ^^ 
 
 cx 
 
 t—t 
 
 c— C\J 1— 
 
 4-> 
 
 c 
 o 
 
 S- 
 
 <u 
 
 o = 
 
 A 
 +J CT> 
 
 E LU 
 
 O V 
 
 $- = 
 
 4- 
 
 T3 
 E E 
 
 •1- fO 
 
 «J 
 
 ■M 4-> 
 
 S- S- 
 
 <L) CU 
 
 to to 
 
 E C 
 
 C\J 
 
 CT> 
 
 to 
 
 LU 
 
 
 o 
 
 -p 
 cu 
 
 Q. 
 
 s- 
 cu 
 +-> 
 
 E 
 
 S- 
 
 o 
 
 s- 
 
 s- 
 
 LU 
 
 -o 
 
 CU 
 
 -o 
 
 E 
 (U 
 +-> 
 X 
 
 CX 
 
 D_ 
 
 CX CX 
 
 O LU 
 
 CX I 
 
 cx : 
 
 o 
 
 A 
 1^ 
 
 LU 
 V 
 
 U 
 A 
 
 CO 
 
 o 
 ■o 
 
 u 
 
 to 
 
 XI 
 
 <u 
 
 
 
 CU 
 
 
 
 
 A 
 
 
 
 z 
 
 
 z 
 
 00 
 
 
 to 
 
 CD 
 
 
 CU 
 
 LU 
 
 z 
 
 
 
 z 
 
 
 V 
 
 XI 
 
 XI 
 
 -O 
 
 <U 
 
 
 
 CJ 
 
 U 
 
 
 
 O 
 
 ■O 
 
 (TJ 
 
 to 
 
 to 
 
 to 
 
 to 
 
 to 
 
 
 0) 
 
 
 to 
 
 
 X 
 
 <D 
 
 
 
 A 
 
 O 
 
 CT> 
 
 
 LU 
 
 (0 
 
 V 
 
 to 
 
 <u 
 
 o 
 
 CO 
 
 CO 
 LO 
 CU 
 X 
 
 to 
 
 o 
 
 cx 
 co 
 
 o ; 
 
 CU 
 
 o 
 
 (0 
 
 CU 
 
 to 
 
 X 
 
88 
 
 5.4 Overall Control of the Extended Model 
 
 One final problem concerning the overall control and 
 operation of the error system has yet to be completely discussed. 
 The MODIFY routine, which essentially controls the entire error 
 analysis process, is always defined in terms of some starting input 
 token location and corresponding parser state. The REPARSE routine, 
 if it recursively invokes the MODIFY routine, does so with the 
 location and state parameters updated for the location of the next 
 modification that must be attempted. However, the overall error 
 system's initial invoking of MODIFY, although briefly mentioned in 
 section 5.2.2, is as yet unspecified. 
 
 The algorithm that is proposed is to make the initial 
 activation of the MODIFY routine at the actual original error 
 token location and the state of the parser where the original error 
 was detected. This corresponds to the following: 
 
 //MODIFY (error_token_location, corresponding_state, 0)//. 
 
 This initial call assumes that the error token itself is 
 in error, and the MODIFY routine's analysis will generate sugges- 
 tions about exactly what the parser was expecting when the error 
 was detected. Since all modifications will be tried at the very 
 end of the input string, any Replace operation that is tried will 
 yield a "corrected" interpretation immediately. It is useful to 
 
89 
 
 have the first "possible correction" suggestions provide informa- 
 tion about the state of the parse when the error was detected. 
 
 After all possibilities have been tried from this 
 initial location, the MODIFY routine will return to the error_- 
 system_controller. Then the controller should reinvoke the MODIFY 
 routine from a location and corresponding parser state that is one 
 token back in the input string. Likewise, each time that the 
 MODIFY routine exhausts all possibilities from one location, the 
 controller should move back another token and reactivate the MODIFY 
 routine. Each of these reactivations makes the assumption that the 
 original error token is itself "correct," and that the actual syn- 
 tax error occurred someplace back in the input string. All of the 
 modifications are made to things located before the original error 
 token, which is not disturbed, but is left intact at the right end 
 of the input string. 
 
 As mentioned in section 5.2.2, the backup process should 
 terminate when the left context "beacon" is reached. 
 
 There is one more important parameter that must be con- 
 trolled by the error analysis system: that is the upper limit of 
 modifications, N, that can be made while constructing any error 
 interpretation. This limit is checked in the REPARSE routine when 
 the decision about allowing another modification to be made must be 
 resolved. There are two approaches that can be taken to control N. 
 One is to compute all suggestion messages from one location with 
 
90 
 
 1, 2, . . ., N modifications allowed (for some predetermined N), 
 and then to move back a token and repeat the process. The other 
 approach is to compute all error interpretations of the input 
 string with one modification, then compute all interpretations 
 with two modifications, etc. 
 
 This thesis proposes that the second approach be utilized 
 for the following reason. In the interactive compiling environment 
 described in Chapters 1 and 3, it was noted that as the programmer 
 enters the program, it is syntactically parsed, and errors are 
 signalled as soon as detected. This implies that it is likely that 
 the actual input string is fairly "close" to being correct when an 
 error is detected. Therefore, it seems better to compute the 
 "close" error interpretations first (i.e., the N limit equalling 
 one, then two, etc.), before attempting to find an interpretation 
 with a larger number of required modifications. 
 
 Figure 5.10 gives the ERR0R_SYSTEM_C0NTR0L routine. This 
 routine is the error system module that is invoked by the parser 
 system (SYNA) when the original error is detected. It controls the 
 number N of modifications that are allowed in any reparse attempt 
 and also where each reparse attempt originates. Then each reparse 
 attempt that is originated is successful only if the input string 
 can be corrected with exactly N modifications; the MODJFREE, MODIFY, 
 REPARSE, and USER_INTERACTION routines are still as given in 
 Figures 5.8, 5.7, 5.6, and 5.5, respectively. 
 
91 
 
 begin ERROR_SYSTEM_CONTROL 
 
 comment This routine is invoked by SYNA when a parser syntax 
 error is originally detected. The routine provides 
 control over the initiation of reparse attempts, and 
 the number of modifications allowed in any interpre- 
 tation. 
 /maximum_N «- predetermined maximum number of allowed changes 
 
 in any interpretation/ 
 /establish the error environment/ 
 /beacon_location + farthest token in the input that will be 
 
 backed-up to/ 
 loop for N = 1 step 1 to maximum_N : 
 
 comment set up initial invoking locations. 
 
 loc *■ error_token_location 
 
 state +■ parser state associated with error_token_location 
 
 loop : 
 
 comment Initiate reparse attempt. 
 //MODIFY (loc, state, 0)// 
 while loc > beacon_location : 
 
 comment Initiate reparse attempt from new location 
 /back up 1 token in the input string/ 
 /loc «- new input location/ 
 /state <- corresponding state/ 
 
 repeat 
 
92 
 
 repeat 
 
 comment No more help is available. 
 
 /Report to user that no more help is available/ 
 
 /reestablish the compiler's editing environment/ 
 
 end ERR0R_SYSTEM_C0NTR0L 
 
 Figure 5.10 - ERROR SYSTEM_C0NTR0L routine 
 
93 
 
 Chapter 6 
 CONCLUSIONS AND FUTURE RESEARCH 
 6.1 Summary 
 
 This thesis describes an interactive compiler diagnostic 
 system. It is assumed that the compiler will be used by elementary 
 programmers, and the responsibility of the diagnostic system is to 
 behave like a "consultant" to these programmers with respect to 
 syntactic errors. It is important that this consulting system be 
 very easy to use and interact with so as to avoid confusing or 
 frustrating the programmer. In addition, it is important that any 
 diagnostic messages that are given to the programmer be very 
 clearly stated using terminology with which the programmer is 
 familiar. 
 
 The diagnostic consulting system communicates with a 
 programmer about an error by suggesting "possible, corrections" that 
 can be made to the program to yield a valid syntactic prefix in 
 the programming language. These suggestions are determined by 
 examining the program input string, the parser's internal state 
 information, and the parser's table; this implies a programming- 
 language-independent diagnostic system. The only information re- 
 quired by the diagnostic system in addition to this normal compiler 
 information is an "English phrase" table which provides readable 
 
94 
 
 English character strings that can be used to describe various 
 pieces of parser information to the programmer. 
 
 The system operates by successively moving farther and 
 farther to the left of the position in the input string where the 
 original error was detected; at each successive position, a 
 modification to the input string is made (replacement, addition, or 
 suppression ) andthe remainder of the input up to and including 
 the original error token is reparsed. If no new errors are de- 
 tected during this reparsing, then the modified input string is 
 syntactically correct and a "possible correction" suggestion mes- 
 sage can be given to the programmer that simply describes what 
 modifications were made. If a new error is detected during the 
 reparsing, then additional or different modifications must be made 
 to the input string and the reparsing attempted again. 
 
 Unlike previous systems, the modifications made to the 
 input string are concerned not only with simple tokens, but also 
 with any nonterminals that are acceptable at that position in the 
 parse. These nonterminals generally represent higher-level con- 
 structs in the programming language, and thus suggestion messages 
 can be given using this higher-level terminology. A further advan- 
 tage of these higher-level suggestions is that successive 
 suggestions will frequently provide more and more detail about the 
 particular nonterminal until the detail reaches the actual "token" 
 level; thus, the suggestions tend to at first provide higher-level 
 
95 
 
 information, followed by successively more refined and detailed 
 information for the programmer. It is very easy, in addition, to 
 provide the programmer with some control over the amount of detail 
 that is presented; the programmer can, at any time, request that 
 the successively refined suggestions not be presented, but that 
 other, different suggestions be given instead (if possible). 
 
 6.2 Evaluation 
 
 6.2.1 Implementation Performance 
 
 As mentioned in Chapter 1, one of the key objectives of 
 this research was to create the first interactive diagnostic con- 
 sulting system by actually implementing on the PLATO IV system the 
 algorithms that have been described. The entire diagnostic system 
 was written in TUTOR [She., '74], a Fortran-level language sup- 
 ported by PLATO. TUTOR is a very good language for input/output: 
 it contains very powerful facilities for accepting and examining 
 keyboard input from the interactive terminal and for preparing 
 arbitrarily complex graphics displays on the terminal's screen. 
 However, it is fairly primitive with respect to program control 
 constructs and memory management; in particular, it does not sup- 
 port dynamic storage allocation, and therefore recursion is not 
 available. Also the subroutine return-address stack is only ten 
 levels high, a limit that is easy to exceed in a complex TUTOR 
 
96 
 
 program. The implementation was completed, however, in approxi- 
 mately 6 months, of which about 3 months were full-time (summer) 
 and the remainder part-time. 
 
 Because of the nonrecursive limitation of TUTOR, the 
 actual prototype system was restricted somewhat from the algorithms 
 described in Chapter 5. The most important restriction is that N, 
 the number of modifications allowed in any interpretation, is 
 limited to the value of N = 1 . This means that after the first 
 modification has been made in a possible interpretation, the re- 
 mainder of the input string must be acceptable without any changes 
 for the reparse attempt to succeed. This removes the recursion 
 requirement for the MODIFY routine immediately. It also allows 
 the REPARSE routine to be written without recursion, since then 
 its only responsibility is to simply check whether any new errors 
 are signalled if the parser is started on the remainder of the 
 input string. If no error is signalled, the reparsing succeeds; 
 otherwise the single modification that was made did not correct 
 the string and REPARSE must return to MODIFY so that a different 
 modification can be tried. 
 
 The only disappointment has been the fairly slow response 
 time of the diagnostic system when another suggestion is requested. 
 Since determining suggestions requires examining the parser's table 
 and repeatedly reparsing the modified input string, the diagnostic 
 system is generally compute-bound. Unfortunately, the PLATO IV 
 
97 
 
 system is designed for input/output bound programs with time for 
 relatively few calculations. 
 
 The response time problem is not limited to the diagnostic 
 system, however; the entire compiler system exceeds the currently 
 suggested cpu usage limits of 2 - 4 ms/sec. The most recent 
 timing estimates indicate that the overall compiler runs wery well 
 at 7 - 8 ms/sec, and marginally well at 5 - 6 ms/sec. In compari- 
 son, the diagnostic system runs very well at 10 - 12 ms/sec, and 
 acceptably at about 8-10 ms/sec; the performance rapidly 
 deteriorates below these levels, however, and subjective tests in- 
 dicate that the diagnostic system is too frustrating and slow to be 
 useful with a cpu limit below 5-6 ms/sec. 
 
 An important observation about the operation of the diag- 
 nostic system is that the initial suggestion messages that are 
 given by the system are generally quite easy to determine and com- 
 pute since \/ery little context is examined (i.e., only the actual 
 token that originally caused the error to be signalled). There- 
 fore, the first few "possible correction" messages can be computed 
 for the programmer with a small cpu limit (around 4-5 ms/sec), 
 so that at least some information is available even when the PLATO 
 IV system is fairly heavily loaded. It is mostly when suggestions 
 concerning modifications farther back in the input string must be 
 computed and PLATO IV is heavily loaded that intolerable response 
 delays may occur. 
 
98 
 
 6.2.2 Effectiveness 
 
 At the present time, because of the relatively slow 
 response times mentioned above, the diagnostic system has not been 
 used by elementary programmers; therefore it is impossible to 
 report on the actual demonstrated effectiveness of the system. 
 It is possible, however, to compare a few of the general charac- 
 teristics of the consulting system with other diagnostic systems. 
 
 The diagnostic systems for the PLATO IV compilers were 
 originally designed as a set of ad hoe, hand-coded, one-line error 
 messages, determined by a specific error number that was obtained 
 from the parser's table. One specific error message was displayed 
 upon detection of an error in the user's program, and this message 
 usually consisted of describing the most likely token (in the 
 language implementor's opinion) that would be acceptable at that 
 point in the parse. The diagnostic system proposed in this thesis 
 is language independent (it does not use any special error numbers, 
 but determines messages by examining the productions in the parser's 
 table), and the first few messages to the programmer always men- 
 tion the tokens or nonterminals that could be used immediately at 
 the location of the error in the program. These initial messages 
 therefore will usually provide not only the same information that 
 was available using the hand-coded technique, but actually all of 
 the correction possibilities. Since these first few messages are 
 
99 
 
 easy to compute, as mentioned previously, the consulting diag- 
 nostic system seems to provide at least as effective (frequently 
 more effective) assistance as did the ad hoo systems. 
 
 To determine how many immediate correction possibilities 
 there are for a particular error situation, it is necessary to 
 examine the general option tree for the state in which the parser 
 detected the error. A study of the general option trees for vari- 
 ous parser states was made to discover some of the characteristics 
 of these trees. The small subset of PL/I that has been implemented 
 (assignment, DECLARE, GOTO, IF, DO, list I/O, CALL, and RETURN 
 statements) was used for the study. The detailed results are 
 given in Appendix I. It was found that about 65 percent of the 
 trees for the possible parser states contained only one or two 
 option nodes. About 25 percent of the trees contained nonterminal 
 option nodes, and the maximum height of the trees was four (indi- 
 cating up to three nonterminal option nodes in a tree). Roughly 
 65 percent of the trees had only one or two token option nodes; 
 however, there were also a number of states with eight to ten token 
 options (primarily states accepting numeric expressions) and 
 eighteen to twenty token options (for the beginning of a new state- 
 ment). 
 
 In interpreting this information, it is important to 
 remember that the statistics are a measure of the trees for most of 
 the possible parser states that can occur in the PL/I subset. 
 
100 
 
 However, in any actual PL/I program, some of the possible states 
 occur much more frequently than others. In particular, states 
 corresponding to the start of a new statement or states accepting 
 numeric expressions occur frequently. Therefore, the trees for 
 these frequently-encountered states will be used most often. Since 
 the trees for statements and expressions are among the largest 
 and highest for PL/I, the consulting system also has the most in- 
 formation available for these states. 
 
 If the consulting system is compared to the system of 
 Levy [Lev, '71], it is interesting to note that near the original 
 error token the consulting system can provide outstanding sugges- 
 tions. Not only are simple "token" modifications suggested (as is 
 done by Levy), but also appropriate nonterminal modifications are 
 given. Furthermore, these nonterminal messages are then successive- 
 ly refined down to the actual token level, with the programmer 
 having some control over this refinement process. These suggestions 
 seem to provide much better assistance to a programmer than just 
 the simple token messages provided by Levy. 
 
 As suggestions are determined about modifications farther 
 back in the input string, however, the two systems appear to pro- 
 duce quite similar suggestions. These suggestions concern only 
 simple token modifications; the consulting system is seldom able to 
 make suggestions involving nonterminals farther back in the input. 
 This can be explained by realizing that acceptable strings in a 
 
101 
 
 programming language are usually syntactically "far apart" or 
 dissimilar. If a nonterminal is recognized in the parsing of a 
 string, there is a high probability that that nonterminal was in- 
 tended by the programmer and that it is formed and located cor- 
 rectly. Therefore, if an error is encountered to the right of a 
 recognized nonterminal, it will seldom be possible to modify that 
 nonterminal and thereby obtain a legal interpretation. However, 
 modifying a single token before the nonterminal may cause the 
 tokens comprising the original nonterminal to be reparsed differ- 
 ently, yielding a legal interpretation. 
 
 6.3 Future Research 
 
 There are a number of areas that can be suggested for 
 future research. One project is to continue to work on and opti- 
 mize the TUTOR-coded implementation on PLATO IV. The implemented 
 system would be much more useful if it were a factor of two or 
 three times more efficient. Work is already underway by the PLATO 
 systems staff to modify some of the TUTOR commands to allow a much 
 more efficient implementation. There is no doubt, also, that the 
 current program code could be improved substantially. 
 
 Another area of research that definitely could be ap- 
 proached is to actually test the usefulness and effectiveness of 
 the diagnostic system with programmers. The system could be 
 statistically compared with other existing systems to determine the 
 
102 
 
 relative effectiveness of the systems for elementary programmers. 
 It would also be yery interesting to determine the usefulness of 
 the consulting system for advanced programmers. 
 
 A final research suggestion is to look at the problem of 
 explaining the effect of a "possible correction" suggestion on the 
 program. The proposed system is very good at describing different 
 ways to correct the program, but it relies on the programmer to 
 examine the suggestions and determine what the ramifications of 
 making a particular correction will be on the program, Further 
 work on the characterization and description of context-sensitive 
 semantic requirements would also be very useful in attempting to 
 explain the effect of a particular correction on a program. 
 
 6.4 Conclusion 
 
 The use of interactive computing systems is steadily 
 increasing in the computing world. It is anticipated that inter- 
 active, incremental compilers will become standard features of 
 future systems. This thesis has described one possible approach 
 to the handling of diagnostic assistance in this environment. It 
 is hoped that this work will help stimulate research in this area. 
 
103 
 
 LIST OF REFERENCES 
 
 [Aho., '72] : Aho, A. V. and Ullman, J. D. , The Theory of Parsing 
 Translation, and Compiling , Volume 1, Prentice-Hall, 
 Inc., 542 pp., 1972. 
 
 [Alp., '70] : Alpert, D. and Bitzer, D., "Advances in Computer 
 
 Based Education," Science , Vol. 167 (20 March, 1970), 
 pp. 1582-1590. 
 
 [Con., '63] : Conway, M. , "Design of a Separable Transition-Diagram 
 Compiler," CACM, Vol. 6 (July 1963), pp. 396-408. 
 
 [Con., '73] : Conway, M. , and Wilcox, T., "Design and Implementa- 
 tion of a Diagnostic Compiler for PL/I," CACM, 
 Vol. 16, No. 3 (March 1973), pp. 169-179. 
 
 [Cre., '70] : Cress, P., Dirksen, P., and Graham, J., Fortran IV 
 with Watfor and Watfiv , Prentice-Hall, Inc., 1970. 
 
 [Dav., '75] : Davis, A., An Interactive Analysis System For 
 
 Execution-time Errors , Ph.D thesis, Department of 
 Computer Science, University of Illinois, January 
 1975, Report # UIUCDCS-R-75-695. 
 
 [DeR., '71] : DeRemer, F. L., "Simple LR(k) Grammars," CACM, Vol. 
 14, No. 7 (July 1971), pp. 453 - 460. 
 
 [Ela., '75] : Eland, D. , Forthcoming Ph.D thesis on the GUIDE 
 
 information-retrieval system, to be published summer 
 1975. 
 
 [Flo., '63] : Floyd, R. W. , "Syntactic Analysis and Operator 
 Precedence," JACM 10 (July 1963), pp. 316 - 333. 
 
 [Gra., '73] : Graham, S. and Rhodes, S., "Practical Syntactic 
 
 Error Recovery in Compilers," Conference Record of 
 the ACM Symposium on the Principles of Programming 
 Languages, Boston, Mass., October 1973, pp. 52-58. 
 
 [Gri., '74] : Griffiths, M. , "LL(1) Grammars and Analyzers," 
 Compiler Construction , Bauer, F. and Eickel, J. 
 (eds), Springer-Verlag, New York, 1974, pp. 57-83. 
 
 [Iro., '63] : Irons, E. , "An Error Correcting Parse Algorithm," 
 CACM, Vol. 6 (Nov. 1963) pp. 669-673. 
 
104 
 
 [JaE., '73] : James, E. , and Partridge, D., "Adaptive Correction 
 of Program Statements," CACM, Vol. 16, No. 1 
 (Jan. 1973), pp. 27-37. 
 
 [JaL. , '72] : James, L., A Syntax-directed Error Recovery Method , 
 Masters thesis, Computer Systems Research Group, 
 University of Toronto, May 1972, # CSRG-13. 
 
 [LaF., '70] : LaFrance, J., "Optimization of Error Recovery in 
 Syntax-directed Parsing Algorithms," Sigplan 
 Notices 5 (Dec. 1970) pp. 2-17. 
 
 [LaF., '71] : LaFrance, J., Syntax-directed Error Recovery for 
 Compilers , Ph.D thesis, Department of Computer 
 Science, University of Illinois, 1971, # 459. 
 
 [Lei., '70] : Leinius, R., Error Detection and Recovery for Syntax 
 Directed Compiler Systems , Ph.D thesis, Computer 
 Science Department, University of Wisconsin, 1970. 
 
 [Lev., '71] : Levy, J., Automatic Correction of Syntax Errors 
 in Programming Languages , Ph.D thesis, Computer 
 Science Department, Cornell University, Dec. 1971, 
 Technical Report # TR 71-116. 
 
 [Lorn., '73] : Lomet, D. , "A Formalization of Transition-diagram 
 Systems," JACM, Vol. 20, No. 2, (April 1973), pp. 
 235-257. 
 
 [Lyo., '74] : Lyon, G., "Syntax-directed Least-errors Analysis for 
 Context-free Languages: A practical Approach," 
 CACM, Vol. 17, No. 1, (January 1974), pp. 3-14. 
 
 [Nie., '74] : Nievergelt, J., Reingold, E., and Wilcox, T., "The 
 
 Automation of Introductory Computer Science Courses,' 
 A. Gunther, et al . (eds), International Computing 
 Symposium 1973, North-Holland Publishing Co., 1974. 
 
 [Rho., '73] : Rhodes, S., Practical Syntactic Error Recovery for 
 Programming Languages , Ph.D thesis, Department of 
 Computer Science, University of California at 
 Berkeley, June 1973, Technical Report 15. 
 
 [She., '74] : Sherwood, B., The TUTOR Language , Computer-based 
 Education Research Laboratory and Department of 
 Physics, University of Illinois, Urbana, Illinois, 
 1974. 
 
105 
 
 [Tin., '75] : Tindall, M. , An Interactive Table-driven Parser 
 System , Masters thesis, Department of Computer 
 Science, University of Illinois, August 1975. 
 
 [Wil., '73] : Wilcox, T. , "The Interactive Compiler as a Consultant 
 in the Computer Aided Instruction of Programming," 
 Proceedings of the Seventh Annual Princeton Confer- 
 ence in Information Sciences and Systems, March 1973. 
 
 [Wir., '66] : Wirth, N. , and Weber, H., "EULER-A Generalization of 
 ALGOL and its Formal Definition," CACM, Vol. 9, 
 No. 1 (January 1966) pp. 13 - 25, and Vol. 9, No. 2 
 (February 1966) pp. 89 - 99. 
 
106 
 
 APPENDIX I 
 GENERAL OPTION TREE EXAMPLES 
 
107 
 
 This appendix contains the results of a study of the size 
 and shape of the general option tree for many of the possible pars- 
 ing states. The analysis was performed on the small subset of the 
 PL/I language that has been implemented on the PLATO IV system. 
 
 In each case, the information is presented as a particu- 
 lar input string and the corresponding option tree that is accept- 
 able at that point in the parse of that string. Obviously, only 
 some of the possible input strings could be examined, but a repre- 
 sentative sample was selected and analyzed. In all, a total of 61 
 possible input strings were examined. 
 
 To simplify the form of the option trees, two special 
 subtrees will be used: one for the start of a numeric expression 
 and the other for the start of a new statement. Whenever an expres- 
 sion or a statement is specified as a nonterminal in an option 
 tree, the respective special subtree will be indicated. 
 
108 
 
 Special Subtrees 
 Numeric Expression = expr 
 
 ID Numeric Array + 
 
 simple, Constant declared, 
 declared, numeric, 
 numeric. 
 
 SIN . . . LOG LENGTH . . . INDEX 
 
109 
 
 Statement = Stmt 
 
 ID 
 
 numeric, or 
 character. 
 
 DECLARE 
 DCL 
 
 GO GOTO IF STOP DO PUT GET CALL ENTRY RETURN 
 
no 
 
 INPUT 
 
 <null program> 
 
 OPTION TREE 
 
 © 
 
 ID 
 
 undeclared 
 
 TESTIO 
 
 TESTIO: 
 
 © 
 
 PROCEDURE 
 PROC 
 
 T : PROC 
 
 ( OPTIONS RETURNS 
 
 T : PROC ( 
 
 ID 
 
 undeclared 
 
Ill 
 
 INPUT 
 
 OPTION TREE 
 
 T : PROC (A, 
 
 T : PROC (A, 
 
 © 
 
 ID 
 
 undeclared 
 
 T : PROC (A) 
 
 RETURNS 
 
 T : PROC OPTIONS, 
 
 T : PROC OPTIONS ( 
 
 MAIN 
 
 T : PROC OPTIONS (MAIN, 
 
112 
 
 INPUT 
 
 T : PROC OPTIONS (MAIN) A 
 
 OPTION TREE 
 
 T : PROC RETURNS , 
 
 T : PROC RETURNS ( 
 
 FIXED FLOAT 
 
 CHARACTER 
 CHAR 
 
 T : PROC RETURNS (CHAR) 
 
 T : PROC ; 
 
 END 
 
 Stmt 
 
113 
 
 INPUT 
 
 OPTION TREE 
 
 ASSIGNMENT STATEMENTS 
 
 ID 
 
 numeric, or a 
 character 
 
 © 
 
 ID 
 
 numeric A 
 
 ID 
 
 character A 
 
 ID String Array "SUBSTR" ( 
 char Constant char 
 
114 
 
 INPUT 
 
 OPTION TREE 
 
 DCL A(10); 
 A A 
 
 DCL A(10); 
 A (5) A 
 
 ID 
 
 = 10, 
 
 numeric 
 
 binary 
 operator 
 
 LABELS 
 
 ID 
 
 label . ref , or A 
 undeclared 
 
115 
 
 INPUT 
 
 OPTION TREE 
 
 DECLARE STATEMENTS 
 
 DCL 
 
 DCL ( 
 
 undeclared 
 
 undeclared 
 
 DCL (A, 
 
 FIXED 
 
 CHARACTER 
 CHAR 
 
116 
 
 INPUT 
 
 OPTION TREE 
 
 DCL (A( 
 
 array 
 bound 
 
 expr 
 
 DCL (A) 
 
 FIXED FLOAT 
 
 CHARACTER 
 CHAR 
 
117 
 
 INPUT 
 
 OPTION TREE 
 
 GO Statements 
 
 GO 
 
 GOTO 
 
 TO 
 
 ID ID ID 
 undeclared Label label -reference 
 
 GOTO LI 
 
118 
 
 INPUT 
 
 IF Statements 
 
 OPTION TREE 
 
 IF, 
 
 IF I = 1 
 
 binary 
 operator 
 
 conditional 
 operator 
 
 THEN 
 
 IF I = 1 THEN 
 
119 
 
 INPUT 
 
 OPTION TREE 
 
 IF I = 1 THEN STOP: 
 
 A 
 
 STOP Statement 
 
 ELSE END 
 
 STOP 
 
 RETURN Statement 
 
 RETURN 
 
 RETURN ( 
 
120 
 
 INPUT 
 
 OPTION TREE 
 
 DO Statement 
 
 DO, 
 
 DO I 
 
 ID Array WHILE 
 numeric numeric 
 
 DO I = 
 
 DO I = 1 
 
 A 
 
 \ 
 
 binary 
 operator 
 
 TO BY WHILE 
 
121 
 
 INPUT 
 
 DO WHILE, 
 
 OPTION TREE 
 
 © 
 
 DO WHILE ( 
 
 PUT Statement 
 
 PUT, 
 
 PUT SKIP, 
 
 SKIP 
 
 PAGE 
 
 LIST 
 
122 
 
 INPUT 
 
 OPTION TREE 
 
 PUT PAGE, 
 
 LIST 
 
 PUT LIST 
 
 A 
 
 PUT LIST ( 
 
 Array 
 
 ID Constant Array 
 
 numeric, or numeric, numeric, or 
 
 character. or char 
 
 char \. 
 
 ( Numeric 
 
 Built-in 
 
 Functio 
 
 SIN 
 
 LOG 
 
 INDEX . 
 
 SUBSTR 
 
123 
 
 INPUT 
 
 OPTION TREE 
 
 PUT LIST (I 
 
 binary 
 operator 
 
 PUT LIST (I) 
 
 T 
 
 GET Statement 
 
 GET 
 
 LIST 
 
 GET LIST 
 
 GET LIST ( 
 
 GET LIST (I 
 
 ID 
 
 Array 
 
124 
 
 INPUT 
 
 CALL Statement 
 
 OPTION TREE 
 
 CALL 
 
 ID 
 
 undeclared 
 
 ID 
 
 external 
 
 CALL B. 
 
 CO 
 
 CALL B ( 
 
 G 
 
 \ 
 
 expr 
 
 CALL B (I 
 
 / \ 
 
 \ 
 
 CALL B (I) 
 
 binary 
 operator 
 
 i.O 
 
125 
 
 APPENDIX II 
 EXAMPLES OF "POSSIBLE CORRECTION" SUGGESTIONS FOR PL/I 
 
126 
 
 This appendix contains a number of examples of the 
 "possible correction" suggestions that are generated by the diag- 
 nostic system that has been implemented on PLATO IV. All of the 
 examples are taken from the subset of the PL/I language that is 
 in use by the interactive computer. 
 
 Each example is presented as a number of consecutive 
 diagrams or "frames." The person using the diagnostic system 
 would actually see only one frame at a time; each successive 
 frame would be displayed as the user pressed the appropriate 
 "HELP" or "Shifted-HELP" Key. The different frames for a par- 
 ticular error situation are presented on the same page in this 
 appendix only as a convenience to the reader. 
 
127 
 
 FILE « workspace PL/ 1 WORKSPACE (14- new) SPftCE = 284 
 
 ■' ,l • ■ r r r w r r r w r ' r w " r ■ r t ' k w r r r* 
 
 .ft: PPOC; 
 
 DCL.I .FIXED. . 
 
 ********* *ERROR** ******** (back) or (ehhse) to fix. 
 
 This undeclared variable is not permitted here. 
 Press (help) for more information. 
 
 FILE = workspace PL/ 1 UIORKSPftCE (1 4 -new) SPftCE = 284 
 
 v r f r v m r r y r v r v y r r v " r r» 
 
 A: .... . .PROC; 
 
 DCL.I .FIXED. . | 
 
 ********** Possible Correction ********** (back) or (erhse) to fi>. 
 
 Replace | | with an attribute. 
 
 Press (help) for a different suggestion. 
 Press GED^IkP) to see a legal attribute. 
 
 FILE = workspace PL/ 1 UORKSPftCE (14-new) SPftCE = 28 4 
 
 11 , 1 v v v r v . r r • v ? w •' r • r r r v " r r Ei 
 
 .ft: PROC; 
 
 DCL.I .FIXED. . § 
 
 ********** Possible Correction ********** (back) or (cruse) to fix. 
 RepUce J~3~n with_an attribute "DECIMALS _ 
 Press (help) for a different suggestion. 
 Press tima iHELP) to see another legal attribute. 
 
128 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4- new) SPACE = 28 4 
 
 ' y ' * r'r. y r » r ' v r ' r t * " r v r 9 r ' -y 
 
 A: PROC; 
 
 DCL.I .FIXED. . .g 
 
 ♦»♦♦»♦♦»* ♦ Pos sible Correction ********** (Sack) or (Ebbs?) to fix. 
 
 Replace | 1 with an attribute "DEC". 
 
 Press (hIIp) for a different suggestion." 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPACE = 284 
 
 r • r r w r . r r v ' r r p ' v v v r v r r> 
 
 .A: PROC; 
 
 DCL.I .FIXED. . @ 
 
 ********** Possible Correction ********** (back) or [erbse) to fix. 
 Insert "," in front of | [ . 
 
 Press (help) for a d i f f erent suggest i on . 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPACE = 284 
 
 1 ' " r v r r ' w v ' r r r r r r v r v v r r 1 
 
 .A: PROC; 
 
 DCL.I .FIXED. . g 
 
 ********** Possible Correction ********** (back) or [erhse] to fix. 
 Insert ";" in lront of 1 | . 
 
 Press (help) for a different suggestion. 
 
129 
 
 FILE = workspace PL/I WORKSPACE (14-new) SPACE * 28 4 
 
 r r f r r.r r w r t r ' r ' » r ' r v r ■ r " ' " r 1 
 
 A: PROC; 
 
 DCL.I . riXECj . . .C 
 
 ********** Possible Correction ********** (back) or [erase] to fix. 
 RepUce_l_^_J with_\ ^ 
 
 Press (help) for a different suggestion. 
 
130 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPACE = 263 
 
 v v r r r ; w v r w v r v v v v v r r ' 1 
 
 B: PROC; 
 
 DCL.I, . A(. .Q 
 
 ********* *ERROR#********# (back) or (errse) to fi'x. 
 
 This punctuation symbol is not permitted here. 
 Press (help) for more information. 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPACE = 263 
 
 6: PROC; 
 
 DCL.I, . .A(. .0 
 
 ********** Possible Correction ********** (sack) or ( erase) to fix, 
 Insert an array bound in* front of 
 
 Press (help) for a different suggestion. 
 Press SQB^IkP) to examine an array bound, 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPACE =■ 283 
 
 " " ' ' ' r r r r ' y, v v v r r r r ' r r v - r r r ' i 
 
 B: PROC; 
 
 DCL.I, . A(. .Q 
 
 ********** Possible Correction ********** (back) or (erase) to fix. 
 Insert a numeric expression in front of . 
 
 Press (help) for a different suggestion. 
 
 Press h:h^ help) to examine a numeric express /on. 
 
131 
 
 FILE « workspace PL/I WORKSPfiCE U 4 -' new) SPACE » 283 
 
 • ' w r r r 1 "r. r rr r r r ' r r r r rr i> » 
 
 B: PROC; 
 
 DCL.I, . .fl(. .0 
 
 ********** Possible Correction ********** (back) or jjjjJE) to fix. 
 Insert a numeric operand* in front of | | . 
 
 Press (help) for a d i f f erent suggest i on . 
 Press mmiHELPl to examine a numeric operand* 
 
 FILE = wo rkspace PL/I UIORKSPRCE (1 4-r.ew) SPACE = 26 3 
 
 ' " v r v v • v r r '*> w v ' v ' r y r y r r ri 
 
 6: PROC; 
 
 DCL.I, . .A(. .0 
 
 ********** Possible Correction ********** (back) or (cruse) to fix. 
 Insert a declared variable ( numeric) in front of | | . 
 Press (help) for a different suggestion. 
 
 FILE = workspace PL/I WORKSPACE (14-new) SPRCE = 283 
 
 9 9 9 v ' v . w r v r v v v r y v v r ' r J=x= y 
 
 B: PROC; 
 
 DCL.I, . .A(. .0 
 
 ********** Possible Correction ********** (back) or (erhse) to fjx. 
 Insert a numeric constant in front of | 1 . 
 Press ^Kp) for a different suggestion. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPRCE = 28 3 
 
 r r r r ' y r r ■ r r r ^ ' m - r r r r r r i 
 
 B: PROC; 
 
 DCL.I, . .R(. .Q 
 
 ********** Possible Correction ********** (back) or (erbseJ to fix. 
 Replace | | with a declared array ( numeric) . 
 Press (help) for a d i f f erent suggest i on . 
 
132 
 
 FILE = workspace PL,' I WORKSPACE (1 4 -new) SPACE = 283 
 
 B: PROC; 
 
 DCL.I, . .AC .0 
 
 *♦♦»♦♦»*» » Pos s i b 1 e Correction ********** (back) or (erbse) to fix. 
 Replace | | with an arithmetic operator. 
 Press (help) for a different suggestion. 
 Press GUBhelp) to see a legal arithmetic operator. 
 
 FILE = workspace PL/I WORKSPACE (1 4 -new) SPACE « 283 
 r v r r " v , r r v r w r v v r~ v ~ r r r » 
 
 B: PROC; 
 
 DCL.I, . .AC .0 
 
 ********** Possible Correction ********** (back) or (erbse) to fj*. 
 
 Replace | | with an arithmetic operator " + ". 
 
 Press (help) for a d i f f erent suggest i on . 
 
 Press ffiiffiHELPl to see another legal arithmetic operator. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE » 283 
 
 B: PROC; 
 
 DCL.I, . AC .0 
 
 ********** Possible Correction ********** (back) or (errse) to fi>. 
 Replace | | with an ar it hmet ic operator "-". 
 Press (help) for a different suggestion. 
 
133 
 
 FILE ■ workspace PL/ 1 WORKSPACE (14- new) SPRCE - 26 3 
 
 ■■■'■ > r w ' ' r 11 ] r ' r ' r ' " r - t r " ' r - ' r ■ l ^^^ M *g^™T F r " ' r * " r* 
 
 B: PROC; 
 
 DCL.I, . R(. .0 
 
 »»»♦**♦»» » Pos sible Correction ********** [back] or (cruse) to fix. 
 
 Replace j | with " (". • 
 
 Press (help) for a different suggestion. 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPRCE = 283 
 
 ' r r 9 r r r r " r ' r r ' r ' ' r r y r ' r v r i 
 
 B: PROC; 
 
 DCL.I, . .R(. .0 
 
 ********** Possible Correction ********** (back) or (eriise) to fix. 
 Replace [ j with a numeric built-in function. 
 Press (help) for a d i f f erent suggest i on . 
 
 Press fliHiiHELPl to see a legal numeric built/ in function. 
 
 FILE = workspace PL/ 1 WORKSPACE ( 1 4 - ■new) SPRCE = 283 
 
 v v w v ?. * r r y Y r " r v v w • r r w m t 
 
 B: PROC; 
 
 DCL.I, . R(. .0 
 
 ********** Possible Correction ********** [back ) or (erhse) to fix. 
 Replace | | with a numeric built-in function "RBS". 
 Press (help) for a d i f f erent suggest i on . 
 Press Q2Ete§LP) to see another legal numeric/built-in function. 
 
134 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4-new) 5PRCE » 283 
 
 1 - ' 7 r t r v . w • r ' r r ' r ' r "' r r r T r r ^ " i 
 
 B: PROC; 
 
 DCL.I, . R(. .0 
 
 ********** Possible Correction ********** (back) or [erhse) to fix. 
 Replace | | with a string built-in function. 
 Press (help) for a different suggestion. 
 Press GEBtHkEl to see a legal string built-rn function. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 - new) SPACE = 28 3 
 
 • v '- r r — w ' v v r ' w v r v ' v v ' v r v r r j 
 
 B: PROC; 
 
 DCL.I, . R( Q 
 
 ********** Fossible Correction ********** (back) or (erhse) to fix. 
 Replace | | with a string built-in function "LENGTH". 
 
 Press (help) for a different suggestion. 
 
 Press HiHiiHELPl to see another legal string built-in function. 
 
135 
 
 FILE = workspace PL/ 1 WORKSPACE (14-nem) SPACE * 26 2 
 
 ' ' r r r r r j r ^'^T r r r"7 r r • i r r r p 1 i 
 
 C: PROC; 
 
 DCL.I, .A (10) DEC; 
 
 GET. LIST ( A ) ; 
 
 I.-.R.H 
 
 ***#******ERROR* ********* (ffick) or (erme) to fix. 
 
 This punctuation symbol is not permitted her.e. 
 Press (help) for more information. 
 
 FILE = workspace PL/I WORKSPACE (1 4-' new) SPACE = 2 62 
 
 f ' y ' r » 9 9 * > 9 > » > » r r r F 
 
 C: PROC; 
 
 DCL.I, A (10) DEC; 
 
 GET. LIST (.A ) ; 
 
 I. -.fi.fi] 
 
 ********** Possible Correction ********** (gfiCkl or (erhse] to fix. 
 Insert a subscript list in front of 
 
 Press IhIlp) for a different suggestion. 
 Press GEBhIlp) to examine a subscript list. 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPACE » 262 
 
 J * =ac== ■ ■ ■ ( r i r r r r } r ' r r r r ' t r ' r r " v • r v v i 
 
 C: PROC; 
 
 DCL.I, .A (10) DEC; 
 
 GET. LIST (A.) ; 
 
 I.-.R.0 
 
 ********** Possible Correction ********** [back) or (Ewiisi) to fix. 
 Replace | | with " (". 
 
 Press JHELP) for a d i f f erent suggest i on . 
 
136 
 
 FILE - w orkspace PL/ 1 WORKSPACE (14- new) SPACE = 262 
 
 y " * * r . * r " ' r r " r r ' y » r r r r — *— i 
 
 C: PROC; 
 
 DCL.I, .A (10) DEC; 
 
 GET. LIST ( A ) ; 
 
 i.-.g.; 
 
 *♦»»»»♦♦* * Pos sible Correction ********** (brck) or (t*m\ to fix. 
 Replace | | with a numeric expression. 
 Press Ihjelp) for a different suggestion. 
 
 FILE « workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 262 
 
 v v v w ' 7 , v r ' V v — v ' r v v y v v r r i 
 
 C: PROC; 
 
 DCL.I, .A (10) .DEC; 
 
 GET. LIST ( A.) ; 
 
 I. -a..s 
 
 ********** Possible Correction ********** (buck) or (raisi) to fix. 
 Replace [ 1 with a numeric operand. 
 Press (help] for a different suggestion. 
 
 FILE = workspace PL/I WORKSPACE (M -new) SPACE » 262 
 " • ' v v v r ' r / v r r r v r v ' r ' ' " r ' r y r ' r' y 
 
 C: PROC; 
 
 DCL.I, A (10) DEC; 
 
 GET. LIST ( A ) ; 
 
 I.-.0.; 
 
 ********** Possible Correction ********** (back) or (jlhrse) to fix. 
 
 Replace with a declared variable ( numeric) 
 
 Press ( help) for a dif f erent suggest i on . 
 
137 
 
 FILE ■ workspace PL/ 1 UJORKSPRCE (1 4 -new) SPRCE = 26 2 
 
 ^— = r v r r ' r , r " r " r " ^' r w w • r v y ' r r " r *"l 
 
 C: PROC; 
 
 DCL.I, .A (10) DEC; 
 
 GET. LIST (R.) ; 
 
 I.-.0.J 
 
 ********** Possible Correction ********** fgcjj) or (crwj?) to fix. 
 Replace | | with a numeric constant. 
 
138 
 
 FILE = wor kspace PL/I WORKS PACE (1 4 -new) SPACE = 264 
 
 .0: PROC; 
 
 DCL.I, ,A(10) ; 
 
 GET LIST. (A) ; 
 
 I-.ft(.fl 
 
 **********ERROR********** ©or@to fix. 
 
 This punctuation symbol _is not_ permitted here. 
 Press (help) for more information. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 26 4 
 
 r r v r v ; r r 'v v v r v v r v v r r — > 
 
 .D: PROC; 
 
 DCL.I, A(10) ; 
 
 GET LIST. (A) ; 
 
 I-.AC.0 
 
 ********** Possible Correction ********** (Back) or (JRJse) to fix, 
 Insert a numeric expression in front of •. 
 
 Press (help) for a different suggestion. 
 
 Press l-Ii^ HELP j to examine a numeric expression. 
 
139 
 
 FILE - workspace PL/I WORKSPACE ( 1 4 -' new) SPfiCE -26 4 
 
 r f r r r/ ' r 1 r ' V w ' w " w ' w ' r r r " ' r r r » 
 
 D: PROC; 
 
 DCL.I, R(10) ; 
 
 GET. LIST. (R) ; 
 
 I-.RC.0 
 
 ********** Possible Correction ******##*# [back] or (ermeI to fix. 
 Insert a numeric operand in front of | | . 
 
 Press (help) for a d i f f erent suggest i on . 
 Press SISJ(helpJ to examine a numeric operand. 
 
140 
 
 FILE = workspace PL/ 1 WORKSPACE (14- new) SPftCE = 262 
 
 '"' v y y v 7 , v v v v v v ' r y v v y y r =3== i 
 
 E: PROC; 
 
 DCL.I, .ft (10) ; 
 
 GET. LIST, (ft) ; 
 
 I-.AC.i . .0 
 
 ******* ***ERROR* w ******** (gflCK) or IIrhse) to fix. 
 
 This punctuation symbol is not permitted here. 
 Press [ help) for more i n format i on . 
 
 FILE = workspace PL/ 1 WORKSPACE (M-new) SPftCE = 26 2 
 
 ' v y y y r r r r r r " r r r ' y v y r r ' i 
 
 .E: PROC; 
 
 DCL.I, .ft (10) ; 
 
 GET. LIST, (ft) ; 
 
 I=.ft(.l .0 
 
 ********** Possible Correction ********** [back] or (erjjse) to fix. 
 Replace | | with an arithmetic operator. 
 Press (help) for a different suggestion. 
 
141 
 
 FILE « workspace PL/ 1 WORKSPACE (1 4 -new) 5PRCE » 262 
 
 ' r r r r r ^ r r ,m, w "" r " ' r m r ' ' r "T ' r ' r r r " " i 
 
 .E: PROC; 
 
 DCL.I, A (10) ; 
 
 GET. LIST. (A) ; 
 
 I-.RC.l . .0 
 
 ********** Possible Correction ********** (back) or Ictme) to fix. 
 Insert ") M in front of I I. 
 
142 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE * 25 6 
 
 ■ i n y y r v v v r w w r y v — v v r v r r r 
 
 F: PROC; 
 
 DCL.I, A( 5, 10) .; 
 
 GET. LIST. (A) ; 
 
 I=.A. (.1 .0 
 
 ******* ♦♦•ERROR* ********* (sock) or (erhseI to fix. 
 
 This punctuation symbol is not permitted here._ 
 Press (help) for more i n format i on . 
 
 FILE = workspace PL/ 1 WORKSPACE ( 1 4 -new) SPACE * 258 
 
 1 1 r v w r r . \r w r w w 7 v r v r v r v v 
 
 F: PROC; 
 
 DCL.I, .A (.5, 10) .; 
 
 GET. LIST. (A) ; 
 
 i-.n. (.1 .Q 
 
 ********** Possible Correction ********** (back) or (ERJjffl to fix. 
 Replace | [ with an arithmetic operator. 
 Press (help] for a different suggestion. 
 
 FILE = workspace PL/ 1 WORKSPACE (M-new) SPACE = 25 8 
 
 F: PROC; 
 
 DCL.I, A( 5, 10) .; 
 
 GET. LIST. (A) ; 
 
 I* .A. (.1 .Q 
 
 ********** Possible Correction ********** (back ) or (erSse) to fix. 
 Replace | _ j with_" , " . 
 
 Press (help) for a different suggestion. 
 
143 
 
 FILE = uiorkspace PL/ 1 WORKSPACE (1 4-neui) SPftCE = 25 6 
 
 •r r r • r w m r r m " r [l r " r r ' r ' r ■ r ■ y ■ r r' r i 
 
 F: PROC; 
 
 DCL.I, .A (.5, 10) . ; 
 
 GET. LIST, (ft) ; 
 
 I-.ft. (.0.) 
 
 ********** Possible Correction ********** [back] or (raws?) to fix. 
 Insert " (" in front of PH - 
 
144 
 FILE - workspace PL/I WORKSPACE ( 1 4 - new) SPACE = 25 4 
 
 G: PROC; 
 
 DCL.AU0) ,D(10, 10) ,C; 
 
 DO. ,C=1 TO. 10; 
 
 fi(C) .- D.g 
 
 **********ERROR*v ******** (BACkj or (ERH5?) to fix. 
 
 Th i s arit rime t i c oper a t or is not perm i 1 1 ed her e . 
 Press Ihelp) for more information. 
 
 FILE - workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 254 
 
 G: PPOC; 
 
 DCL.AO0) ,D(10, 10) ,C; 
 
 DO. C=l TO. 10; 
 
 fl(C).vD.g 
 
 ********** Possible Correction ********** [back) or (rail!) to fix, 
 Insert a subscript list in front of 
 
 Press I.help) for a dif f erent suggest i on . 
 Press PTOji hELPl to examine a subscript list. 
 
 FILE = workspace PL/ I WORKSPACE (1 4 -new) SPACE = 25 4 
 
 7 r r v v r v r v ' v v v v v v v r 'nP 
 
 G: PROC; 
 
 DCL.A(10) ,D(10, 10) ,C; 
 
 DO. C=l TO. 10; 
 
 A(C) =.D0 
 
 ********** Possible Correction ********** (back) or (jreili) to fix. 
 RepUce j _ j with " (". 
 
 Press j help) for a different suggestion. 
 
 FILE = ujoHcspacg__ PL/I WORKSPACE (1 4- ne w) SPACE = 25 4 
 
 G: PROC; 
 
 DCL.A(10) ,D(10, 10) ,C; 
 
 DO. C=l TO. 10; 
 
 A(C) .= |.* 
 
 ♦♦»»»»*** » Pos sible Correction ********** {mck} or {Erhse) to fix. 
 
 R f p I a £ e J— -J with a numeric operand 
 
 Press IHELP) for a different suggestion. 
 
145 
 
 FILE - workspace PL/ 1 WORKSPACE (1 4- new) SPACE ~ 254 
 
 - -■ ■ r r r ' r . ' r r " r " ' v f " r* ' r r > r ■ r r~ r — > 
 
 G: PROC; 
 
 DCL.AU0) ,D(10, 10) ,C; 
 
 DO. C=l TO. 10; 
 
 A(C) .-.§.* 
 
 ****»****» Possible Correction ********** [back) or (erase) to fix. 
 Replace | J with a declared variable ( numeric) . 
 Press [help] for a d i f f erent suggest i on . 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE » 25 4 
 
 " ' " "" ' ' r r r ' r t v w ' '> " f ' r " 9' f ' r ■ r v v r - r " r " 
 
 G: PROC; 
 
 DCL.AU0) ,D(10, 10) ,C; 
 
 DO. C=l TO. 10; 
 
 A(C) .-.§.* 
 
 ********** Possible Correction ********** [back] or (erase) to fix. 
 Replace | | with a numeric constant. 
 
146 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4- new) SPACE = 2 76 
 
 v r r v w , v ' r r r y y r y r ? i > r r"™ 35 ! 
 
 H: PPOC; 
 
 DCL.J.CHRR(l) ; 
 
 DO..0 
 
 **********ERROR***#****** (back) or (erhIe) to fix, 
 
 This declared variable is not permitted here. 
 Press I .help) for more information. 
 
 FILE = workspace PL/ I WORKSPACE (14- new) SPACE = 276 
 
 7 v r v r~ v — r F' y'r v v v v — v v v • r i 
 
 H: PPOC; 
 
 DCL.J CHRR(l) ; 
 
 DO.. a 
 
 ********* * Pos s i b 1 e Correction »*****#**# (brckJ or [erase] to fix. 
 Replace | | with a declared variable ( numeric) . 
 Press [help) for a different suggestion. 
 
 FILE = workspace PL/ 1 WORKSPACE ( 1 4 -new) SPACE = 27 8 
 
 .H: PROC; 
 
 DCL.J.CHAR(l) ; 
 
 DO.. | 
 
 ********** Possible Correction ********** (back) or [erhse] to fix. 
 
 Replace [ J with a declared array ( numeric) . 
 
 Press (help) for a d i f f erent suggest i on . 
 
 FILE = workspace PL/I WORKSPACE (1 4 -new) SPRCE = 278 
 
 r ' r 
 
 H: PROC; 
 
 DCL.J.CHRR(l) ; 
 
 DO.. a 
 
 ********** Possible Correction ********** (back) or [erase] to fix, 
 
 Replace with "WHILE" 
 
 Press I .help) for a d i f f erent suggest i on . 
 
 
147 
 
 FILE « workspace PL/ 1 WORKSPACE (1 4 -new) SPACE - 278 
 
 »^— — " r r r r r \ r 'r ' r ' r rr'r r r~ r r r r r 
 
 .H: PROC; 
 
 DCL.J.CHAR(l) ; 
 
 DO . .0 
 
 ********** Possible Correction ********** (back) or [eruse] to fix. 
 Insert ";" in front of | | . 
 
 Press [help) for a d i f f erent suggest i on . 
 
 FILE* workspace PL/I WORKSPACE (M-new) SPACE ■ 278 
 
 •r r f r ,, f /r r , r ' , r w • w r r ' r ' r F r r ~~r 
 
 .H: PROC; 
 
 DCL.J.CHAR(l) ; 
 
 J 
 
 ********** Possible Correction ********** [back] or (erhse) to fi>. 
 Replace | | with a statement. 
 
148 
 
 FTI F worksMC- PL/ 1 WOR KSPACE (M-neuO SPftCE = 26 7 
 
 h ILL - W'..<1 f-apa^- ' 
 
 .1: PROC; 
 
 DCL.fi, B; 
 
 . GET, LIST ( ft, B) ; 
 IF. .R+B. .[THEM 
 
 **********ERROR********** ® or S to f1 *' 
 This reserved word i s jiot jperm i t_ted _bere . _ 
 
 Press I help ) for more i n for mat i on . 
 
 FILE . ■.■^^ce PL/I UORKSPflG E (14-reui) SPfiCE - 267 
 
 I: PROC; 
 
 . .DCL.ft,B; 
 GET .LIST ( ft, B) ; 
 IF ft+B.filHENl 
 
 ********** Possible Correction ********** @ ^ ® to fl *' 
 
 Replace F~1 with an ar^thmet^c_operator. 
 
 ~Press~|H^|~ f or a different suggestion. 
 
 PL/I ■■■r^cpftrF f 1 4-r,ew) SP*CE=_267_ 
 
 -v r 
 
 I: PROC; 
 
 DCL.fi.B; 
 
 GET LIST (ft, B) ; 
 
 IF ft+B . [THEM 
 
 d «ihl* Correction ********** (bacE> or I^D to fix. 
 ********* Possible Lorrecxwn 
 
 Replace \ZJ wl l h _ a relational operator ._ _ _ 
 
 "Press'l^p) f^r a different suggestion. 
 
149 
 
 FILE ■= workspace PL/I WORKSPACE (1 4 -new) SPftCE - 26 7 
 
 ^— ^^■-■- ■ r r r r ' r " r ■ r ■■ ■ f ■ p r - ■ r - ?■■■?■ ■ r •■■■ r " r t" ■' r 1 ' 'i 
 
 I: PROC; 
 
 DCL.fl.B; 
 
 GET. LIST (fl,B) ; 
 
 IF. .ft££. THEN 
 
 ********** Possible Correction ********** (Sack) or (wise) to fix. 
 Insert a relational operator in front of | | . 
 Press (Hltp) for a different suggestion. 
 
 FILE * workspace PL/I WORKSPACE (1 4- new) SPfiCE = 267 
 
 .1: PROC; 
 
 DCL.A,B; 
 
 GET. LIST (R,B) ; 
 
 IF. .g+§. .THEN 
 
 ********** Possible Correction ********** (back) or (twisE) to fix. 
 Replace | | with a conditional expression. 
 
150 
 
 FILE = workspace PL/ 1 WORKSPACE (14 -new) SFRCE = 275 
 
 J: PROC; 
 
 DCL.I; 
 
 DO. 1 = 1 .TO. 10. (UNTIL 
 
 **********ERROR********** (back) or (ebbse) to fix. 
 
 This undeclared variable is not permitted here. 
 Press (help) for more information. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SFRCE = 2 7 5 
 
 v v r v ' r m r r " r r w v v v v v ir r r ' 
 
 .J: PROC; 
 
 DCL.I; 
 
 .DO. 1=1 .TO. 10, [UNTIL 
 
 ********** Possible Correction ********** (back) or (erhse) to fix, 
 Replace | | with an arithmetic operator. 
 Press Ihelp) for a d i f f erent suggest i on . 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 275 
 
 J: PROC; 
 
 DCL.I; 
 
 .DO. 1=1 ,TO,10, pNTIL 
 
 ********** Possible Correction ********** (bock) or fSra) to fix. 
 Replace J _ j with__"BYV_ 
 
 Press Ihelp) for a different suggestion. 
 
151 
 
 FILE •= workspace PL/ 1 UJORKSPRCE ( 1 4 - new) 5PRCE = 2 75 
 
 ■ ■ ' ■ r r * < * ' ' r ' r r r r r r r r r"r 'r r ■■ /' r 
 
 J: PROC; 
 
 DCL.I; 
 
 .DO. 1=1 ,TO.10. pNTIL 
 
 ********** Possible Correction ********** (sack) or [erbse) to fix. 
 Repl_ace_[^^ w i th " UHILE " L 
 
 Press (help) for a different suggestion. 
 
 FILE = workspace PL/ 1 UORKSPRCE (1 4 -new) SPRCE = 27 5 
 
 J: PROC; 
 
 DCL.I; 
 
 DO. 1=1 .TO.10. |L)NTIL) 
 
 ********** Possible Correction ********** (back) or [errseJ to fix. 
 Insert " ; " in front o f | | . 
 
152 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4-'new) SPRCE = 283 
 
 ' y r r r ' r . r ' r r r r ' r ' r r r ' r i> r r ' " i 
 
 K: PROC; 
 
 .DCL.I; 
 
 FOR U 
 
 **** ******ERROR* ********* • (Bflck) or [ebhje] to fix. 
 
 Th^is^ declared variable is not permitted here. 
 Press |hEp| for more information. 
 
 FILE = workspace PL/I WO RKSPACE (1 4 -new) SPRCE = 283 
 
 .K: PROC; 
 
 DCL.I; 
 
 FOR .0 
 
 ********* Possible Correction ********** (££<*) or Ierme) to fi- 
 Insert u j " in front of | | . 
 
 Press (help) for a~d7f ferent~sJgies"t7on. 
 
 FILE = workspace PL/ 1 UQRKSPRCE (14- new) SPRCE = 283 
 
 1-1 v v r r ' 7 \ 'r 1 r ' 'r r r r r y ^ y r ^ r « 
 
 K: PROC; 
 
 DCL.I; 
 
 E2I-I 
 
 ********** Possible Correction ********** (back) or (erike) to fix. 
 Replace | | with a statement. 
 
 Press (help) for a d i f f erent suggest i on . 
 
 FILE * workspace PL/I UJORKSPRCE (1 4-newJ SPRCE = 2a3 
 
 111 r 7 r w v . y r v v r w v r v r w r r :== i 
 
 K: PROC; 
 
 DCL.I; 
 
 For]. i 
 
 ********** Possible Correction ********** (brck) or (eribe) to fix. 
 Replace [ J with a reserved word "IF". 
 Press (help) for a different suggestion. 
 
153 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPFtCE = 28 3 
 
 1 / — r r ' v • r " r w r ~ w ' r r " r rr r f r r == r z 
 
 K: PROC; 
 
 DCL.I; 
 
 OEM 
 
 ********** Possible Correction ********** (ggck) or [ehrse] to fix. 
 Replace | J with a reserved word "DO". 
 Press IhEiP) for a different suggestion. 
 
 FILE * workspace PL/ 1 WORKSPACE ( 1 4 -new) SPACE - 283 
 
 = f r r f f. ' r r r r r r r r r , r r r 
 
 .K: PROC; 
 
 DCL.I; 
 
 Fo^.i' 
 
 ******** ** Po ssible Correction ********** (beck) or (erSse) to fix. 
 Remove | | from the program. 
 
154 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 27 7 
 
 T 7 7 7~ 
 
 L: PROC; 
 
 DCL.I .FIXED; 
 
 ****** ****ERROR***# ****** • (back) or (erase) to fix. 
 
 This reserved word is not permitted here. 
 Press (help) for more i n format i on . 
 
 FILE = workspace P L/ 1 WORKSPACE (14- new) 5FACE = 277 
 
 * " " 7 7 7 7 V _ V 7 7 7 7 7 7 7 7~ V V 7 7 1 
 
 .L: PROC; 
 
 DCL.I FIXED; 
 
 .1 .= .1 
 
 ********** Possible Correction ********** (back) or (erbse) to fix. 
 Replace | | with an arithmetic operator. 
 Press (help) for a different suggestion. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE » 27 7 
 
 V 7 7 7 Fj 7 7 7 9 7 7 7 7 7 7 7 7 f 7~ 
 
 L: . . PROC; 
 
 DCL.I .FIXED; 
 
 ... .1=1 (TC 
 
 ********** Possible Correction ********** (back) or (erase) to fix. 
 Rep lace _ P~n with "; M . 
 
 Press (help) for a different suggestion. 
 
155 
 
 FILE * workspace PL/ 1 WORKSPACE (1 4 -new) SPACE - 27 7 
 
 r r ' r r ■ r . r r .. ■ y .. ■ r ■ r . r ■ r r r . r r r ■■ ^» » , 
 
 L: PROC; 
 
 DCL.I.FIXED; 
 
 |.«.l .TO 
 
 ********** Possible Correction ********** (pack) or [erhse] to fix. 
 Insert a reserved word "DO" in front of | | . 
 
156 
 
 FILE « workspace PL/I WORKSPACE (1 4 -new) SPACE * 281 
 
 " -■■ - r — v r r r , r r ' r r » r 1 r > r r r r = y f=ss \ 
 
 M: PROC; 
 
 DCL.I .FIXED; 
 
 DCL Q 
 
 ♦ *****#*#*ERROR** ******** • (BflciF) or (erbse) to fix. 
 
 This entry name is not permitted here. 
 Press [help) for more information. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 281 
 
 ■r v r v r ; v r r r w v " v v ] w v •• r w * r~ 
 
 .M: PROC; 
 
 DCL.I .FIXED; 
 
 DCL 
 
 ********** Poss i b 1 e Correct i on ********** (back) or (erase) to fix. 
 Replace | | with an identifier declaration. 
 
 Press [help) for a d i f f erent suggest i on . 
 
 Press umi lHELP) to exam i ne an i dent i f i er dec 1 arat i on . 
 
 FILE = workspace PL/I WORKSPACE (1 4- new) SPACE - 281 
 ' " ' v r w r' ' r\ r r ' r r w w w r r ' v w r =sx i^^^ 
 
 M: PROC; 
 
 DCL.I .FIXED; 
 
 DCL .0 
 
 ********** Possible Correction ********** (back) or (erbse) to fix. 
 Rep lace _L_J with " (". 
 
 Press [help) for a d i f f erent suggest i on . 
 
157 
 
 FILE - workspace PL/ 1 WORKSPACE (14- neui) SPACE =» 281 
 
 — •■ - r w r " ■ w " • v " r 'r ' ' y ' r r " r " • r ' • r' ■ y ' y ' r r >» 1 
 
 M: PROC; 
 
 DCL.I FIXED; 
 
 DCL Q 
 
 ********** Possible Correction ********** (back) or fggj| to fix. 
 Replace [ J with an undeclared variable. 
 Press (help) for a different suggestion. 
 
 FILE * workspace PL/ 1 WORKSPACE ( 1 4 -new) SPACE = 281 
 
 > k "r r r.r r r> p r r w y 1 r w w r f 
 
 .M: PROC; 
 
 DCL.I .FIXED; 
 
 pcg.M 
 
 ♦ ♦♦»*♦»♦» » Pos sible Correction ********** (Sack) or Pbse) to fix. 
 Replace 1 | with "END". 
 
158 
 
 FILE -workspace PL/ 1 WORKSPACE (14- new) SPACE = 279 
 
 'r r r w v . r r w w w w v r r ' r c r r r 
 
 N: PROC; 
 
 DCL.I,fl; 
 
 PUT. LIST .0 
 
 ***** *****ERROR* ********* . dSck) or (erwse) to fix. 
 
 This punctuation symbol is not permitted here. 
 Press Ihelp) for more information. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE = 2 79 
 
 1 r r r r " » . ' y ' r r r i t ■ r ' r r y f r r r r - 
 
 .N: PROC; 
 
 DCL.I,A; 
 
 PUT. LIST .0 
 
 ********** Possible Correction ********** (back) or (erase) to ffx. 
 RepUce_ r3~] with " ("\ 
 
 Press (help) for a different suggestion. 
 
 FILE = workspace 
 
 PL/ 1 WORKSPACE ( 1 4 - new) 
 
 N: PROC; 
 
 DCL.I,A; 
 
 SPACE =279 
 
 r ' r ' r r w • v w " r~ v v v r r r 
 
 PUT pST] .; 
 
 ********** Possible Correction ********** [back] or (erbse) to fix. 
 Replace P~l with "SKIP". 
 
 Press (help) for a d i f f erent suggest i on . 
 
159 
 
 FILE « workspace PL/I WORKSPACE (M-new) SPACE = 279 
 
 t ~^— ^ " r r r f ■ » ■ r >" ~ r r " » ' ' y y " ' r ' ^ ^ . r - ■ ■ r ■ ■ r ■■ r p p | 
 
 N: PROC; 
 
 DCL I,R; 
 
 PUT pST] , ; 
 
 ********** Possible Correction ********** [back] or [erhse] to fix. 
 Replace | | with H Pf=lGE". 
 
160 
 
 FILE = workspace PL/I UIORKSPRCE (1 4-new) SPACE = 27 4 
 
 7 == r 7 7 v~ m v r v 7 7 7 w r r r r r r v 
 
 .0: . . PROC; 
 
 DCL.I.J; 
 
 PUT LIST . ( . ' START OF .PROGRAM ' , .0 
 
 ***** *****ERR0R* ********* • (back) or (erhse) to fix. 
 
 This punctuation symbol is not permitted here. 
 Press [help] for more i n format i on . 
 
 FILE - workspace PL/ 1 WORKSPACE (1 4-new) SPACE * 27 4 
 
 " - '■ ■ ■ v v r f r . v ' v v w v r r v \r v v w r ' ' r 
 
 .0: PROC; 
 
 DCL.I,J; 
 
 PUT .LIST . ( . ' START .OF .PROGRAM ' , .0 
 
 ********** Possible Correction ********** [back) or (erjise) to fix. 
 Insert a declared array in front of | | . 
 Press [help) for a different suggestion. 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE « 2 74 
 
 ■ ■''■' • y~" v 7 r ' r v~ r v w 7 v 7 r v 7 v r r' v 
 
 .0: PROC; 
 
 DCL.I,J; 
 
 PUT LIST . ( . 'START OF PROGRAM' , .0 
 
 ********** Possible Correction ********** (back) or (ERJii) to fix. 
 Insert an expression in front of | | . 
 
 Press (help) for a d i f f erent suggest i on . 
 Press n:>'Ii'iHELPl to examine an expression. 
 
161 
 
 FILE = workspace PL/ 1 WORKSPACE ( 14 - -new) SPACE = 27 4 
 
 "^ — r ' r '?• — p | p ■■ p ■ y " ■ r " r " r~ F " r ' r r " r - r a= *y M= *T 
 
 .0: PROC; 
 
 DCL.I.J; 
 
 PUT. LIST. (. 'START. OF. PROGRAM' Q.) 
 
 ********** Possible Correction ********** (back] or [erhse] to fix. 
 Remove QH from the program. 
 
162 
 
 FILE = workspace PL/ 1 WORKSPACE (1 4 -new) SPACE ■ 2 72 
 
 • 7 v r v v r rr r r r r <r r <r v r w " r 
 
 P: PROC; 
 
 DCL.A,B,I; 
 
 GET. LIST. (A, .g 
 
 **********ERROR********** • (back) or (erase) to fix. 
 
 This entry name is not permitted here. 
 Press (help) for more i n format l on . 
 
 FILE = workspace PL/I WORKSPACE (M-new) SPACE = 27 2 
 
 .P: PROC; 
 
 DCL.A,B,I; 
 
 GET. LIST. (.A, g 
 
 ********** Possible Correction ********** (back) or (erase) to fix. 
 Replace | | with a declared variable. 
 Press I help) for a different suggestion. 
 
163 
 
 FILE - workspace PL/ 1 WORKSPACE (14- new) SPftCE « 272 
 
 — — ' ' r r r ' r " • r j w "r ' - y P r " r ' r ' r •■ r - w v r r i 
 
 P: PROC; 
 
 DCL.fl,B,I; 
 
 GET. LIST. (.A, | 
 
 *♦♦«♦♦*♦♦ ♦ Pqs s i b 1 e Correction ********** (back) or [eruseJ to fix. 
 Replace j 1 with a declared array. 
 
164 
 
 FILE = workspace 
 
 PL/ 1 WORKSPACE (14- new) 
 
 SPACE = 26 7 
 
 IT: . . . PPOC; 
 
 DCL.A; 
 
 GET. LIST (A) ; 
 
 PUT. LIST (A) ; 
 
 STOP IT 
 
 ********* *ERROR* ********* 
 
 (BflCy or (ERH5E) to fix. 
 
 This entry name is not permitted here. 
 Press (help) for more information. 
 
 FILE = workspace 
 
 PL/ 1 WORKSPACE (14- new) 
 
 SPACE = 267 
 
 IT: PROC; 
 
 DCL.A; 
 
 GET. LIST (A) ; 
 
 PUT. LIST (A) ; 
 
 STOP. IT 
 
 ********** Possible Correction ********** (bbck) or (erbD to fix. 
 
 Replace | | with ";". 
 
 Press (help) for a dif f er ent suggest i on . 
 
165 
 
 FILE « workspace PL/ 1 WORKSPACE (1 4 -new) SPACE » 26 7 
 
 ,_,,., — — | r r y , , r ,. w . , r . y . , , r r , F , r r r " r r t — 
 
 IT: PROC; 
 
 DCL.Aj 
 
 get. list (a) ; 
 
 put l ist (a) ; 
 
 EsTop1.it 
 
 ********** Possible Correction ********** (back) or [erhse) to fix. 
 Replace P | with "END". 
 
166 
 
 VITA 
 
 Michael Tindall was born in Seattle, Washington on 
 September 20, 1949. He graduated from Seattle Pacific College 
 with a B.S. in mathematics in June, 1971, and was elected to 
 "Who's Who in America Among Students" in that year. During four 
 years of graduate school work in computer science at the Univer- 
 sity of Illinois at Urbana-Champaign, he received the Master of 
 Science degree (1975) as well as serving as the coordinator/ 
 secretary of the Computer Science Graduate Student Organization 
 and sitting on numerous departmental committees. He was elected 
 to the Society of Scientific Researchers, SIGMA XI, in 1973, and 
 to the National Scholastic Honorary, Phi Kappa Phi, in 1974. He 
 is a member of the Association of Computing Machinery. 
 
IBLIOGRAPHIC DATA 
 <EET 
 
 1. Report No. 
 
 UIUCDCS-R -75-7^8 
 
 "Itlr and Subtitle 
 
 An Interactive Compile -time Diagnostic System 
 
 3. Recipient's Accession No. 
 
 5. Report Date 
 
 October 1975 
 
 6. 
 
 Author(s) 
 
 Michael Harry Tindall 
 
 8. Performing Organization Rept. 
 
 No- UIUCDCS-R-75-7 1 +8 
 
 Performing Organization Name and Address 
 
 University of Illinois at Urbana -Champaign 
 Department of Computer Science 
 Urbana, Illinois 61801 
 
 10. Project/Task/Work Unit No. 
 
 11. Contract /Grant No. 
 
 Sponsoring Organization Name and Address 
 
 IBM Corporation 
 
 Thomas J. Watson Research Laboratory 
 
 Yorktown Heights, New York 10598 
 
 13. Type of Report & Period 
 Covered 
 
 Doctoral Thesis 
 
 14. 
 
 Supplementary Notes 
 
 Abstracts 
 
 The following interactive diagnostic system can be devised to analyze syntax 
 errors detected in an interactive timesharing compiling environment. The system 
 is "automatic", that is, it is driven by the compiler's parser tables. The 
 error system behaves like a consultant by suggesting "possible corrections" of 
 the program to the user, and at any time the user can proceed to fix the program 
 or request further suggestions. In addition, the "possible correction" 
 diagnostic suggestions can refer to not only individual "tokens" in the user's 
 program, but also higher-level constructs, such as "expressions", "array bounds", 
 etc. 
 
 Key Words and Document Analysis. 17a. Descriptors 
 
 Compilers, error analysis, interactive compiling, parsers, transition diagrams 
 
 k. Identifiers /Open-Ended Te 
 
 c COSATI Field/Group 
 
 •Availability Statement 
 
 Release Unlimited 
 
 R M NTIS-35 (10-70) 
 
 19. Security Class (This 
 Report) 
 
 - . UNCLASSIFIED, 
 
 20. Security Class (This 
 
 Page 
 
 UNCLASSIFIED 
 
 21. No. of Pages 
 
 173 
 
 22. Price 
 
 USCOMM-DC 40329-P71 
 
INSTRUCTIONS FOR COMPLETING FORM NTIS-35 (10-70) (Bibliographic Data Sheet based on COSATI 
 Guidelines to Format Standards for Scientific and Technical Reports Prepared by or for the Federal Government, 
 PB-180 600). 
 
 1 Report Numbtr. Each report shall carry a unique alphanumeric designation. Select one of the following types: (a) alpha- 
 numeric designation provided by the sponsoring agency, e.g., F AA-RD-68-09; or, if none has been assigned, (b) alphanu- 
 meric designation established by the performing organization e.g., FASEB-NS-87; or, if none has been established, (c) 
 alphanumeric designation derived from contract or grant number, e.g., PH-43-64-932-4. 
 
 2 Leave blank. 
 
 3. Recipient's Accession Number. Reserved for use by each report recipient. 
 
 4. Title ond SubtitU. Title should indicate clearly and briefly the subject coverage of the report, and be displayed promi- 
 nently. Set subtitle, if used, in smaller type or otherwise subordinate it to main title. When a report is prepared in more 
 than one volume, repeat the primary title, add volume number and include subtitle for the specific volume. 
 
 5. Report Dote. Each report shall carry a date indicating at least month and year. Indicate the basis on which it was selected 
 
 I ., ditc of issue, date of approval, date or preparation. 
 
 6. Performing Organization Code. Leave blank. 
 
 7. Author(s). Give name(s) in conventional order (e.g., John R. Doe, or J.Robert Doe). List author's affiliation if it differs 
 from the performing organization. 
 
 8. Performing Organization Report Number. Insert if performing organization wishes to assign this number. 
 
 9. Performing Organization Name and Address. Give name, street, city, state, and zip code. List no more than two levels of 
 an organizational hierarchy. Display the name of the organization exactly as it should appear in Government indexes such 
 as USGRDR-I. 
 
 10. Project/Task/Work Unit Number. Use the pro)ect, task and work unit numbers under which the report was prepared. 
 
 11. Controct/Gront Number. Insert contract or grant number under which report was prepared. 
 
 12. Sponsoring Agency Name and Address. Include zip code. 
 
 13. Type of Report and Period Covered. Indicate interim, final, etc., and, if applicable, dates covered. 
 
 14. Sponsoring Agency Code. Leave blank. 
 
 15. Supplementary Notes. Enter information not included elsewhere but useful, such as: Prepared in cooperation with... 
 Translation of . . . Presented at conference of . . . To be published in . . . Supersedes . . . Supplements . . . 
 
 16. Abstract. Include a brief (200 words or less) factual summary of the most significant information contained in the repori 
 It the report contains a significant bibliography or literature survey, mention it here. 
 
 17. Key Words and Document Analysis, (a). Descriptors. Select from the Thesaurus of Engineering and Scientific Terms the 
 proper authorized terms that identify the major concept of the research and are sufficiently specific and precise to be u 
 as index entries for cataloging. 
 
 (b). Identifiers and Open-Ended Terms. Use identifiers for project names, code names, equipment designators, etc. 
 open-ended terms written in descriptor form for those subjects for which no descriptor exists. 
 
 (c). COSATI Field Group. Field and Group assignments are to be taken from the 1965 COSATI Subject Category 1 
 Since the majority of documents are multid isc ip linary in nature, the primary Field/Group assignments ) will be the spec 
 
 iscipline, area of human endeavor, or type of physical object. The applicat ion(s) will be cross-referenced with sect 
 Field/Group assignments that will follow the primary posting(s). 
 
 18. Distribution Statement. Denote reusability to the public or limitat ion for reasons other than security for example "R 
 lease unlimited". Cite any availability to the public, with address and price. 
 
 19 & 20. Security Classificotion. Do not submit classified reports to the National Technical Information Service. 
 
 21. Number of Poges. Insert the total number of pages, including this one and unnumbered pages, but excluding distnbut 
 1 i s r , if any. 
 
 22. Price. Insert the price set by the National Technical Information Service or the Government Printing Office, if known. 
 
 USCOMM-UC «"»' 
 
 ' 
 
 FORM NTIS-3S (10-70) 
 
^ 
 
 * 
 
 «$ 
 
•r- 
 
 ftl