AN INTERACTIVE COMPILE-TIME DIAGNOSTIC SYSTEM*

BY

MICHAEL H. TINDALL

Report No. UIUCDCS-R-75-748

October 1975

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

* This work was supported in part by an International Business Machines Corporation educational fellowship and was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, October 1975.

ABSTRACT

Let the following be a few of the characteristics of an interactive compiler system: the user's program is interactively entered via a keyboard and display-screen timesharing terminal (e.g., the PLATO IV CAI system); the program is scanned and syntactically parsed as it is input by the user; programming syntax errors are signalled as soon as detected by the parser, and the user must correct the errors before proceeding (this implies that no right context of the user program is available for examination by the error analysis system). Under these assumptions, the following interactive diagnostic system can be devised. The system is "automatic," that is, it is driven by the compiler's parser tables.
The error system behaves like a consultant by suggesting "possible corrections" of the program to the user, and at any time the user can proceed to fix the program or request further suggestions. In addition, the "possible correction" diagnostic suggestions can refer to not only individual "tokens" in the user's program, but also higher-level constructs, such as "expressions," "array bounds," etc.

Key Phrases: Compilers, error analysis, interactive compiling, parsers, transition diagrams.

ACKNOWLEDGEMENTS

The author wishes to express his gratitude to his thesis advisor, Professor Thomas Wilcox, for his many suggestions and critical guidance throughout the work on this thesis. Thanks are also in order for Professor Alan Davis and J. Mike Milner for their assistance with the implementation of the prototype diagnostic system on PLATO IV.

A sincere thank you is also given to the other members of the thesis committee: Professors J. Nievergelt, H. G. Friedman, M. D. Mickunas, and E. M. Reingold for their support. Professor Mickunas in particular, as well as Professor W. Hansen are thanked for their helpful comments about this thesis.

Finally, my wife, Elaine, must be thanked; it was only through her patience and support that this work was completed.

TABLE OF CONTENTS

Chapter
1 INTRODUCTION
    1.1 The Problem
    1.2 The Compiling Environment
    1.3 Compile-time Error System Constraints and Goals
    1.4 Organization of the Thesis
2 SURVEY OF ERROR CORRECTION AND RECOVERY TECHNIQUES
3 THE TRANSITION DIAGRAM PARSING MODEL
    3.1 Introduction
    3.2 General Characteristics of the Transition Diagram Model
    3.3 Formal Description of the Transition Diagram Model
    3.4 Description of the Implemented Parser
    3.5 Compiler Environment when Parser Signals Error
4 A MODEL FOR DIAGNOSTIC INTERACTION WITH THE PROGRAMMER
    4.1 Goals for Diagnostic Interaction
    4.2 Suggesting "Possible Corrections" for the Programmer
    4.3 Algorithmic Notation
    4.4 Algorithm for Diagnostic Interaction
5 DETERMINING "POSSIBLE CORRECTION" SUGGESTIONS
    5.1 Introduction
    5.2 A First Model for Determining "Possible Correction" Suggestions
    5.3 Extended Suggestion Model
    5.4 Overall Control of the Extended Model
6 CONCLUSIONS AND FUTURE RESEARCH
    6.1 Summary
    6.2 Evaluation
    6.3 Future Research
    6.4 Conclusion
LIST OF REFERENCES
APPENDIX
    I GENERAL OPTION TREE EXAMPLES
    II EXAMPLES OF "POSSIBLE CORRECTION" SUGGESTIONS FOR PL/I
VITA

LIST OF TABLES

Table
5.1 Token Options for States of Transition Diagram E1
5.2 Error Interpretations for Transition Diagram E1
5.3 Token Options for States of Transition Diagrams E2 and E3
5.4 Error Interpretations for Transition Diagrams E2 and E3
5.5 General Options for States of Transition Diagrams E2 and E3
5.6 Extended Error Interpretations for Transition Diagrams E2 and E3
5.7 General Options for Diagrams E6 - E9
5.8 Some Extended Error Interpretations for Diagrams E6 - E9

LIST OF FIGURES

Figure
3.1 - Simple transition diagram example
3.2 - PL/I "GOTO" statement transition diagram
3.3 - Invoking transition diagram example
3.4 - PL/I "IF" statement transition diagram
3.5 - Multiple return transition diagram example
3.6 - PL/I conditional expression example
3.7 - PL/I transition diagram
4.1 (a) - Example of suggestion messages
4.1 (b) - Example
4.1 (c) - Example
4.1 (d) - Example
4.2 (a) - "Detail" suggestion messages example
4.2 (b) - "Detail" example
4.2 (c) - "Detail" example
4.2 (d) - "Detail" example
4.2 (e) - "Detail" example
4.3 - Algorithm 'USER_INTERACTION'
5.1 - The "transit" function
5.2 - "Token" MODIFY routine
5.3 - REPARSE routine
5.4 - MODIFY routine, version 2
5.5 - Final USER INTERACTION routine
5.6 - Final REPARSE routine
5.7 - Final MODIFY routine
5.8 - Final MOD_TREE routine
5.9 - Transition diagrams E6 - E9
5.10 - ERROR SYSTEM CONTROL routine

Chapter 1

INTRODUCTION

1.1 The Problem

Novice computer programmers, such as students taking an elementary computer programming course, spend a substantial fraction of the total time required to develop a computer program removing compiler-detected syntactic errors. The compilers used by these elementary programmers must provide informative and understandable diagnostic messages when compile-time errors are discovered.

At the present time, there are a few compilers available that operate in a batch-mode environment and provide reasonably good diagnostics (for example, PL/C (Conway and Wilcox [Con., '73]) and WATFIV (Cress, et al. [Cre., '70])). However, for a compiler that operates in an interactive timesharing environment, that is, an incremental compiler, the diagnostic assistance that can be provided to an elementary programmer can be much more informative than is possible in a batch-oriented compiler. It is possible for an interactive diagnostic system to provide various "levels" of error messages about a particular error situation; some messages may suggest fairly general ways of fixing the program, while others provide very explicit detail about possible corrections of the error. Using the fact that the programmer is preparing the program via an interactive computer terminal, the diagnostic system can present error messages to the programmer and, based on the programmer's response, tailor subsequent, more detailed or refined error messages to the comprehension level of the particular programmer.
This thesis describes a syntactic error diagnostic system that is intended for use in a highly interactive, syntax table-driven compiler system. The primary goal of this diagnostic system is to provide good diagnostic messages in a form and terminology that an elementary programmer can understand. It is assumed that the novice programmer is compiling the program on an interactive timesharing computer and compiler; therefore, the programmer can, to some extent, control the diagnostic messages that are prepared and displayed. In addition, the error analysis algorithm makes complete use of the syntax tables, and thus any error analysis is performed "automatically" by the compiler system with very little additional work required of the language designer beyond specifying the syntax table itself.

1.2 The Compiling Environment

Recently, a project at the University of Illinois at Urbana-Champaign has been under way to automate the teaching of the basic computer science courses by using the PLATO IV computer-based education system [Nie., '74]. The curriculum that has been implemented teaches elementary computer science programming language concepts and constructs to classes of students with highly-varied backgrounds. Specific detail on a variety of languages (e.g., PL/I, FORTRAN IV, COBOL, BASIC) is available; students progress at their own speeds through a fairly flexible course structure [Ela., '75]. An important part of the system is an on-line interactive compiler in which a student can easily and conveniently try out new programming language constructs immediately after learning about them in an instructional lesson [Wil., '73].

A number of design criteria for a compiler system emerge from examining this interactive environment. First, the compiler should be as interactive as possible to utilize the PLATO IV system effectively and to maintain a desirable computer-aided instructional environment for the student.
To accomplish this, the compiler compiles character-by-character; that is, each single key pressed by the student using the compiler is examined immediately as the student types it in. Thus, the compiler keeps up completely with the student's preparation of a program. Another result of this immediate, incremental compiling technique is that syntax errors are detected and signalled as soon as possible by the system. The student must correct the error before being allowed to enter the remainder of the program.

The student is able to edit the program by moving a cursor through the program on the PLATO graphics screen. The compiler moves with the cursor, compiling when the cursor moves forward in the program, and backing-up ("uncompiling," i.e., resetting the lexical and syntactical analyzers to previous states) when the cursor moves backward in the program. To facilitate the "uncompiling" capability, a parser-action history is created and maintained as the program is compiled; this history records any changes that are made in the parser's environment for each token as it is parsed. Then, to "uncompile" a token, it is necessary to undo only changes recorded in the history for that token. This parser history is also very useful for the error diagnostic system, since it contains a complete description of what actions the parser system has taken.

A second design criterion for the compiler is that it be multilingual. To accomplish this, the compiler is completely table-driven; to allow another language to be recognized by the compiler system and used by students, a language designer must merely fill in a new set of tables and provide an execution supervisor system for the actual interpretive execution of compiled programs.

A third design criterion for the compiler is that it provide a high and sophisticated level of error diagnostics for the student.
Since the intended users of the compiler are beginning student programmers, the error messages must be direct and to-the-point. A system has been designed and implemented to handle errors that are detected during the interpretive execution of a student's program [Dav., '75]. The present thesis is concerned with diagnosing errors that are detected during the compiling phase of program preparation.

1.3 Compile-time Error System Constraints and Goals

One very severe constraint for the diagnostic system is that any messages and algorithmic analysis must reference only those tokens in the program up to and including the token at which an error was detected. In an interactive timesharing environment, unlike any batch-oriented compiler environment, right context for an error is not available for examination most of the time. This is because the interactive compiler detects and flags syntactic errors as soon as the programmer types in an offending token. The only time when any right context for an error is available is the situation where the programmer enters a part of the program into the system and then moves the cursor backward in the program ("uncompiling" the tokens that are backed over); if the programmer then attempts to make a syntactically illegal modification somewhere in the middle of the program, all of the backed-over tokens are available as right context for the error. However, since the programmer was in the middle of making some change to the program when the error was detected, it is unlikely that the right context has anything to do with the newly-formed left context of the error; the input string will usually be in a transitory, incomplete form. Therefore, the general, more restrictive case that must be considered is where no right context is available for analysis.

Another constraint is that the error system must operate "automatically" and be as language independent as possible.
To achieve this independence, the error analysis system must be able to determine any messages by examining only the input token string, any compiler symbol tables, and the parser table. To keep the compiler system multilingual, no language-specific, ad hoc error analysis techniques can be used.

There are a number of goals for this compile-time error diagnostic system. One is that all diagnostic messages be understandable to elementary student programmers. In particular, all messages must be stated in the terminology that is associated with the particular programming language being taught to and used by the student. This means that if things like "statements," "arithmetic expressions," and "array bounds" are important concepts in a programming language, then diagnostic messages should refer to these higher-level constructs when appropriate; however, the actual specific detail pertaining to the higher-level constructs should always be available to the programmer if it is needed or requested.

Another goal for the diagnostic system is that it must be highly interactive and responsive to the student. As previously mentioned, a high level of interaction will enable the diagnostic system to be more effective for the student than would be possible with a noninteractive system. In addition, all of the instructional lessons that a student takes on the PLATO IV system are highly interactive and require student input frequently to be completed successfully. The overall compiler system that has been described is very interactive and is constantly monitoring the progress of the student as a program is being prepared. Therefore, the diagnostic system should also be interactive in order to conform with the rest of the student's environment.

A further goal for the diagnostic system is that it must be "feasible" for implementation within the PLATO IV system.
The PLATO IV system (Alpert and Bitzer [Alp., '70]) was designed as a large-scale computer-aided instruction system to support as many as 1000 simultaneous users. The student terminal contains a plasma display panel as its main output component and a keyboard as its input component. The panel permits quick display of both textual information and arbitrarily complex graphic output. (This capability can be used very effectively in conveying useful diagnostic information about an error to a student.) However, because of the large number of users allowed by the PLATO IV system, the amount of central processor computing time allowed per user is very restricted. Therefore the error analysis algorithms used to determine diagnostic messages for the student must be fairly efficient and not require too much central processing time.

The overwhelming goal and emphasis for this diagnostic system is that it behave like a "consultant," that is, be able to diagnose programming errors and help the student understand the errors. As much of this "consultant" as is possible should be built into the compiler system and be provided automatically for different programming languages. The activities and operations of this "consultant" with respect to syntactic and context-sensitive semantic errors are the topic of this thesis.

1.4 Organization of the Thesis

Chapter 1 has presented an overview of the problem of providing good syntactic error diagnostics in an interactive compiling environment. The characteristics of a particular student-oriented interactive compiler system were discussed and a few of the goals and constraints imposed on a diagnostic system by such a compiler system were mentioned.
Chapter 2 will present a survey of some of the previous work that has been done on error correction and recovery techniques. Chapter 3 gives a description of the transition diagram parser model that is used by the compiler system and on which the proposed diagnostic consultant system is based. Chapter 4 presents a model of an interactive diagnostic system that is appropriate for use in an interactive compiler. Chapter 5 then describes the algorithms and operations of the proposed error analysis system, and some final concluding remarks are given in Chapter 6.

Chapter 2

SURVEY OF ERROR CORRECTION AND RECOVERY TECHNIQUES

This chapter will present a brief survey of a number of papers in the area of syntactic error detection, correction and recovery. Other good surveys appear in the papers by Rhodes [Rho., '73, pp. 9-25] and LaFrance [LaF., '71, pp. 2-7], and LaFrance's paper contains an unusually complete bibliography of the relevant literature prior to 1971.

It is important to note that each of the following systems was intended for batch-oriented analysis, as opposed to an interactive timesharing analysis. As a result, the goal of each system is to properly diagnose the syntactic error, or at least make a guess as to the cause of the error, and then attempt to somehow "correct" the user's program by modifying the input string or perhaps adjusting the parser's stack appropriately, and thus allow the compiler to "recover" from the error and continue parsing the remainder of the program.

As pointed out by Rhodes [Rho., '73, pp. 10-11], one of the more common error analysis techniques that is used, particularly in earlier systems, implements the so-called "panic mode" of error recovery. With varying degrees of sophistication, the "panic mode" recovery operates by popping the stack and advancing the input until a "safe" environment is achieved. This method is simple
This method is simple 11 to use, but the errors contained in that portion of the text which is skipped are not detected. The emphasis of the "panic mode" method is simply to provide a technique which allows the parser to recover from the error and continue parsing more of the input string. One of the earliest papers on an automatic error system came from E. T. Irons [Iro., '63]. Irons' parsing algorithm con- structed all possible legal parses for the input string in parallel. An error was signalled when none of the parses could be extended one more symbol. In this event, Irons' error system built a list of all the possible successor symbols from each parse directly be- fore the error point. Then the input was discarded until a symbol was found that appeared on the list. Finally, a string of symbols was created that, when inserted at the error point, would allow the parse to continue. This achieved the effect of local insertion, deletion or replacement of symbols to repair the error. A more sophisticated version of the "panic mode" method was proposed by Leinius [Lei., '70]. This work was primarily based on a simple precedence parser [Wir., '66]. The recovery system attempts to find the smallest substring containing the error which can be reduced, and then forces the appropriate stack reduction and input scanning to occur. Leinius also described the application of his technique to other parsing systems, and a master's thesis by L. R. James 12 [JaL., '72] discusses a detailed system using a bottom-up LALR(k) (lookahead LR(k)) parser [DeR. , '71]. 
As the parse progresses, "configuration sets" are produced; these configuration sets consist of the right-hand sides (RHS) of productions in the grammar (see [Aho, '72] for a description of grammars, productions, terminal and nonterminal symbols), with the following transformation: each RHS contains a distinguished symbol not appearing in the language (e.g., ".") which indicates how far in the RHS the parse has progressed as of this configuration set. As input terminals are read, the configuration set is updated by moving the "." one symbol to the right in the elements which apply. When the "." is moved so that it is directly to the left of a nonterminal in a production, then this configuration set is pushed onto a state stack and the configuration set for that new nonterminal is used. When the "." reaches the far right of a RHS in the configuration set, reductions on the stack can take place and the elements in the configuration set which used the reduced symbol are updated.

An error occurs when no RHS in the configuration set applies to the input symbol. The error algorithm attempts to recover as follows: the parser state stack is popped and the input string is skipped until a configuration set is found which is associated with a nonterminal that can immediately precede the next terminal symbol in the input. Finally the configuration set is updated to indicate that the nonterminal has now been parsed, and the normal parse continues. This algorithm will always terminate since the entire program can be considered as one large string which reduces to the start symbol of the grammar.

In the last few years, a number of error systems have been proposed that emphasize the "correction" of the error that is made, as well as the recovery of the parser. The systems generally attempt to examine the input tokens that are located near to where the error was detected and use this "context" as the basis of the correction that is made to the program.
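Before turning to these correction-oriented systems, the dotted configuration sets used in James' system can be illustrated with a small sketch. It is only an illustration under stated assumptions: the two toy productions, the tuple representation of an item, and the function names `show`, `advance`, and `expected` are hypothetical and are not taken from [JaL., '72]. The sketch covers only the movement of the "." and the detection of an error; the state stack and the reductions are omitted.

```python
from typing import FrozenSet, Set, Tuple

# An item is (left-hand side, right-hand side, dot position); the dot position
# plays the role of the distinguished "." symbol. (Hypothetical representation.)
Item = Tuple[str, Tuple[str, ...], int]


def show(item: Item) -> str:
    """Render an item with the "." marking how far the parse has progressed."""
    lhs, rhs, dot = item
    symbols = list(rhs[:dot]) + ["."] + list(rhs[dot:])
    return f"{lhs} -> {' '.join(symbols)}"


def advance(config: FrozenSet[Item], symbol: str) -> FrozenSet[Item]:
    """Move the "." one symbol to the right in every item that applies.

    An empty result means that no RHS applies to the input symbol,
    which is the error condition.
    """
    return frozenset(
        (lhs, rhs, dot + 1)
        for lhs, rhs, dot in config
        if dot < len(rhs) and rhs[dot] == symbol
    )


def expected(config: FrozenSet[Item]) -> Set[str]:
    """Symbols that may legally come next in this configuration set."""
    return {rhs[dot] for _, rhs, dot in config if dot < len(rhs)}


# Toy productions (assumed for illustration only).
start: FrozenSet[Item] = frozenset({
    ("stmt", ("GOTO", "label", ";"), 0),
    ("stmt", ("GO", "TO", "label", ";"), 0),
})

after_go = advance(start, "GO")
print([show(item) for item in after_go])   # the single surviving item
print(expected(after_go))                  # what the parser would accept next
print(advance(after_go, "label"))          # empty set: an error is signalled
```

Representing the "." as an integer position keeps every item immutable, so a configuration set can be stored on a stack and restored unchanged during recovery.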
An interesting technique based on a simple precedence parser is proposed by Graham and Rhodes ([Gra., '73] and [Rho., '73]). As input symbols are read, they are either immediately stacked, or they cause the stack to be reduced and then they are stacked, depending upon the precedence relation between the input symbol and the symbol on top of the stack.

Two types of errors can be detected in this system: either no precedence relation holds between the input symbol and the symbol on top of the stack (called a "character pair" error), or else a potential RHS is found, but no matching RHS exists in the grammar (called a "reduction" error). When an error is detected, the error system begins phase one (the "condensation" phase) of two phases (the other is the "correction" phase).

The "condensation" phase marks the location of the error in the stack, for later reference, as 1?. Then a "backward" move is initiated by assuming that a > relation holds between the symbol to the left of 1? and the symbol to the right of 1?; this causes the stack to reduce if possible. Then the assumption is made that a < relation exists between the two symbols, and a "forward" move begins by passing control back to the parser. The "forward" move will end when another error is detected. (This may be a new error, or the reappearance of the old error in the form of a reduction error.) This new stack location is designated as 2?, and a "backward" move stack reduction is again performed. This ends the "condensation" phase.

The "correction" phase then attempts to change the parsing stack to correct the error situation. To do this, the "correction" phase will either replace or delete one of three sets of symbols on the stack: the set from the first < relation to the left of 1? up to 1?, the set from that same < relation up to 2?, or the set from 1? up to 2?. If it replaces one of these sets, it will replace it with a complete RHS from the grammar.
The actual choice of what action to take is determined by a probabilistic pattern match, along with two "cost" vectors, which give the "cost" of inserting or deleting a given symbol from the input string. The total cost of deleting/replacing each symbol set is calculated and the cheapest is selected. (Note that the deletion/replacement must in turn produce a correct RHS on top of the stack.) The actual choice algorithm can be experimented with to yield the best results, as well as the entries in the "cost" vectors.

A system implemented by Lyon [Lyo., '74] operates in the following way. As the parser progresses, it constructs all possible parses: this includes all legal parses as well as all possible error parses. For example, not only is a possibly legal parse maintained, but also for each input token that is read, the parser carries along a parse that assumes that the token is an extra inserted symbol, as well as many parses that assume that some other symbol was deleted from in front of that token. In each parse, a count is kept of the number of errors that have occurred for that parse. When the input is exhausted, the parse which has the smallest error count is selected as the final parse; this algorithm is known as a least-errors recognizer. The results from Lyon's analysis indicate that his algorithm uses excessive amounts of execution time and storage space, thus relegating it, in its present form, to a theoretical rather than practical category.

Lyon's implementation is actually based on a system proposed by J. P. Levy [Lev., '71]. However, Levy's proposal is somewhat different from the system that Lyon developed. Levy describes a technique for moving back in the input a certain distance when an error is discovered, and then reparsing forward towards the error point as described in Lyon's algorithm; some heuristic routines then select one of the error reparsings to correct the program.
It seems likely that this technique will eliminate the combinatorial problems encountered by Lyon's system.

An error system was proposed by J. E. LaFrance in 1970 ([LaF., '70] and [LaF., '71]). His system is based on a Floyd Production System parser [Flo., '63]. Briefly, this parser contains groups of productions. As the parse progresses, the parser moves from production group to production group; each group contains references to a set of productions that should be attempted at this point. If none of the productions apply, the error condition is raised.

LaFrance's error system then generates a list of all possible two or three symbol "expected" strings from the production groups, beginning at the error point. Then each of these "possible" strings is compared (matched) to the next four symbols actually in the input string. The matching process consists of about twenty different comparisons such as "assume one symbol is missing" or "assume two symbols are interchanged." If a match is found, the source input string is repaired appropriately and the Floyd parser restarted. If no match is found, then the input string is discarded until a symbol is found which can follow a symbol in the stack; the stack is then reduced and the Floyd parser starts up at the place where it should be when the stack has just been reduced to that symbol.

LaFrance's system is very good in a number of ways. It is a relatively efficient technique, since suitable encodings of the "possible" and the actual input strings allow the matching algorithm to be implemented efficiently. Also, the results of LaFrance's work indicate that corrections that are made to the program are frequently the same corrections that the programmer would make in a similar situation, particularly for those error system corrections that were determined by a successful match having been found.
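The matching step can be made concrete with a minimal sketch. It implements only the two comparisons that are named above ("one symbol is missing" and "two symbols are interchanged"); the remaining comparisons, and the string encodings that LaFrance used for efficiency, are not reproduced here. The function names `try_match` and `first_repair`, the list-of-tokens representation, and the sample tokens are all assumptions made for illustration.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

# (name of the comparison that succeeded, repaired input symbols)
Repair = Tuple[str, List[str]]


def try_match(expected: Sequence[str], actual: Sequence[str]) -> Optional[Repair]:
    """Compare one "expected" string with the next symbols of the input."""
    exp, act = list(expected), list(actual)
    n = len(exp)
    if act[:n] == exp:
        return ("no repair needed", act)
    # "Assume one symbol is missing": insert the expected symbol at position i.
    for i in range(n):
        candidate = act[:i] + [exp[i]] + act[i:]
        if candidate[:n] == exp:
            return ("one symbol missing", candidate)
    # "Assume two symbols are interchanged": swap two adjacent input symbols.
    for i in range(len(act) - 1):
        candidate = act[:i] + [act[i + 1], act[i]] + act[i + 2:]
        if candidate[:n] == exp:
            return ("two symbols interchanged", candidate)
    return None


def first_repair(expected_strings: Iterable[Sequence[str]],
                 actual: Sequence[str]) -> Optional[Repair]:
    """Return the first successful match, or None if no comparison succeeds."""
    for expected in expected_strings:
        repair = try_match(expected, actual)
        if repair is not None:
            return repair
    return None


# Hypothetical "expected" strings and input symbols.
candidates = [("THEN", "id", "="), ("=", "id", ";")]
print(first_repair(candidates, ["id", "=", "id", ";"]))
print(try_match(("=", "id", ";"), ["id", "=", ";", "END"]))
print(first_repair(candidates, ["+", "+", ";", "END"]))
```

A result of None corresponds to the case in which no match is found and the input must be discarded instead.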
However, if no match is found, LaFrance's system reverts directly back to the "panic mode" form of error recovery, with a corresponding degradation in the appropriateness of the corrections performed. Another criticism of LaFrance's system is that it examines only a few of the input symbols near the location of the error symbol, and ignores all the other context farther away from the error symbol. Examples can be given (for example, Rhodes [Rho., '73, pp. 25-26]) of error situations that occur in programming languages that cannot be properly corrected by LaFrance's technique.

One final system that should be mentioned is that of James and Partridge [JaE., '73]. They produced a system that uses many of the analysis techniques suggested by LaFrance, but with a strong adaptive slant: their system is designed to be able to modify itself somewhat to become more efficient or more effective. Their parser utilizes a binary tree structure, where each node in the tree contains some information plus a success and a fail link. The parser moves through the tree as the input is read, comparing the input symbol with the current node information and either taking the success or the fail link to another node. The system can restructure its own binary parse trees so that the most used path occurs first in the tree. An error is detected if a fail is indicated, but the current node has a null fail link. This invokes the use of a separate "strategy" tree for error analysis. The strategy tree contains information nodes that suggest different matching strategies (very similar to those suggested by LaFrance). The system can also restructure the strategy tree so that more effective match strategies come first.

In summary, all of the techniques mentioned in this chapter were designed to operate in a batch-mode environment, where tokens both to the left and to the right of an error token are available for examination and analysis.
Because of this, none of the systems can be directly applied in the interactive compiling environment described in section 1.3 of this thesis in which no right context is available for analysis. The "panic mode" systems are particularly ill-suited, since they operate primarily by skipping forward in the input until a token is found that can occur someplace after the actual error token. LaFrance's system (and also James and Partridge's system, since it is based on the LaFrance technique) is likewise poorly suited for this restrictive interactive environment since it analyzes the next few tokens following the error token. The works of Rhodes and Levy, however, are somewhat better suited to this environment, since both techniques begin by examining a substantial portion of the left context before the corresponding right context is analyzed. The work of Levy, which had a large influence on the current thesis, is further described in section 5.2.1.

This concludes a look at some of the previous work in the field of error correction and recovery systems.

Chapter 3

THE TRANSITION DIAGRAM PARSING MODEL

3.1 Introduction

The syntactic analyzer used in this interactive compiler is based on the transition diagram systems first introduced by Conway [Con., '63], and later formalized by Lomet [Lom., '73]. The second section of this chapter will describe the most important characteristics and operations of a transition diagram parsing model; the third section will present a brief, more formal description of the parsing model defined by Lomet; the fourth section will discuss the actual parser table implementation that is used in the compiler system and which is referred to by the automatic error diagnostic system; the fifth and final section will summarize the parsing environment that exists when a syntactic error is detected. For more detailed information, the reader is referred to the above papers by Conway and Lomet, as well as a paper by Tindall [Tin., '75].
3.2 General Characteristics of the Transition Diagram Model

A transition diagram parser can be thought of as an expanded (stack-oriented) version of a simple finite state machine (fsm). A fsm maintains a "node" or "state" pointer, that is, at any time, the fsm is said to be in some "state" waiting for the next input symbol to come in. When the next input symbol has been read, based on the attributes of the new symbol, the fsm then makes a "state transition" to a new state and waits for the next input symbol to be read. An entire input string is accepted if the input is exhausted and at the same time the fsm is in one of a set of special "final" states.

Similarly, the action of a transition diagram parser is to examine the "possible" or "acceptable" parsing options (determined by the current STATE and the transition diagrams) along with the current input token. Based on this information, the parser will accept the token by updating the STATE information and asking for a new input token, or reject the current input token and signal a syntactic error if the token does not satisfy any of the available parsing options.

Note that for the transition diagram model being described, all of the parsing options are defined in terms of "tokens" only, that is, the tokens that are accumulated and output from a lexical analyzer. This convention will remain the same for all of the work that is reported in this thesis unless otherwise stated. The parsing of input strings described in the rest of this chapter, as well as all of the error analysis manipulations discussed later in the thesis, are all based on a "tokenized" input string.

The actions associated with this transition diagram model can be conveniently shown graphically as follows: let a circled label such as (S1) denote STATE "S1"; each branch out of a STATE corresponds to a possible syntax parse option for that STATE; these branches will always be labeled with their particular syntactic option requirements.
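The token-driven behaviour just described (accept the token and update the STATE, or reject it and signal an error) can be sketched in a few lines. The following is only a minimal illustration in Python, not the table format of the implemented parser; the state names, the branch table, and the function name `parse` are all hypothetical.

```python
# Hypothetical branch table: (state, token) -> next state.
# The states and branches are illustrative assumptions only.
TRANSITIONS = {
    ("S1", "a"): "S2",
    ("S1", "b"): "S7",
    ("S2", "b"): "S3",
    ("S2", "d"): "S5",
    ("S3", "c"): "S4",
    ("S4", "d"): "S5",
    ("S7", "f"): "S5",
    ("S5", "e"): "S6",
}
FINAL_STATES = {"S6"}


def parse(tokens, start="S1"):
    """Return (accepted, error_index).

    error_index is the position of the token the parser could not
    accept, or None if every token was accepted.
    """
    state = start
    for index, token in enumerate(tokens):
        next_state = TRANSITIONS.get((state, token))
        if next_state is None:
            # No labeled branch matches: a syntax error is signalled here.
            return False, index
        state = next_state
    # Input exhausted: accept only if the parser rests in a final state.
    return state in FINAL_STATES, None
```

Because the error is reported at the first token that matches no branch, nothing to the right of that token is ever consulted, which is the property the interactive environment relies on.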
Figure 3.1 is an example of a simple transition diagram E1. This is interpreted as: if the parser is in STATE S1 and the current input token is "a," make the state transition to STATE S2. Otherwise, if the input token is "b," make the transition to STATE S7. Otherwise, an error has been detected, and the parsing error condition should be signalled. STATE S6 is a "final" state for diagram E1, as indicated by the null branch leading out of the S6 node: if the parser is in STATE S6 and the input is exhausted, then the entire input string is acceptable to the parser. The acceptable input strings for this transition diagram are "abcde," "ade," and "bfe."

Figure 3.1 - Simple transition diagram example

A simple practical example is a form of a PL/I "GOTO" statement, which is shown in transition diagram form in Figure 3.2. Note that an input token can be referred to either literally or as some other combination of attributes such as [label-name], where [label-name] has an appropriate meaning in the parser environment.

Figure 3.2 - PL/I "GOTO" statement transition diagram

So far in this discussion, all of the examples have been completely parsable within the framework of a simple finite state machine. However, the actual transition diagram model allows the label on a branch out of a node to refer to not just a single input token, but also a group of tokens that are specified by some other transition diagram within the parser system.

Figure 3.3 is an example of this more complex form of a transition diagram. In this case, if the parser is in STATE S2 in transition diagram E2, then the parser will save STATE S2 on a stack in the parser's environment and refer to diagram E3 to determine the attributes of the next acceptable input token.
When the "final" state for diagram E3 is reached, the parser will then have completely accepted the input string specified by E3, and therefore, upon popping STATE S2 off of the parser's stack, will be able to legitimately make the state transition to STATE S3 and continue to parse as normal.

Figure 3.3 - Invoking transition diagram example

Although not explicitly illustrated by Figure 3.3, the "called" transition diagram (E3 in Figure 3.3) could refer to other transition diagrams in the system, or could in fact refer to itself. (However, E3 cannot be referred to from STATE S6, that is, "left recursion" must be avoided.)

A practical example that requires the invoking of other transition diagrams is the PL/I "IF" statement, shown in Figure 3.4. Note also that this transition diagram system can accept an entire "IF" statement either with or without the "ELSE" clause.

Figure 3.4 - PL/I "IF" statement transition diagram

One more example is needed to illustrate another important feature of transition diagrams: a transition diagram system which has been invoked by another transition diagram has the capability of returning to one of a number of possible STATES in the invoking transition diagram. Figure 3.5 is an example of this situation. If the parser is in STATE S1, then diagram E5 is invoked normally using the parser's stack, and either input "a" or "c" is accepted, moving to STATE S7 or S8 respectively. Note that the parser is now in a "final" state for diagram E5; however, the small number given with the final state indicator specifies that it is possible to return to two different STATES in the invoking diagram E4. These two return points are specified by the R1 and R2 labels following the invoking of diagram E5. When the parser returns from diagram E5, it will either return to STATE S2 or STATE S4, depending on which return is specified.
Thus, transition diagram E4 accepts both "ab" and "cd" as valid input strings, but does not accept either "ad" or "cb."

Figure 3.5 - Multiple return transition diagram example

A good example of where this multiple return feature is necessary arises in trying to parse a (simplified) PL/I conditional expression as shown in Figure 3.6.

    ( ( ( I + 100 ) * J ) = K )
    [labels in figure: "simple expression parentheses," "conditional expression parentheses"]

Figure 3.6 - PL/I conditional expression example

When the initial left parentheses "(" are examined, it is not known whether they are part of the overall conditional expression or are part of the inside simple expression on the left side of the relational operator. A transition diagram for a <conditional-expression> is shown in Figure 3.7. Note that the way in which this has been drawn corresponds to assuming that the initial "(" belongs to the overall conditional expression; if it turns out they actually belonged to the first simple expression, then the parse is resumed at point 6, which accepts the ")" and continues parsing the simple expression.

Figure 3.7 - PL/I <conditional-expression> transition diagram

3.3 Formal Description of the Transition Diagram Model

As a more formal description, a transition diagram system consists of a set of nested push-down automata (NPDA) that have the capability of invoking one another. Each NPDA is capable of reading a portion of the input string and accepting or rejecting it. Lomet [Lom., '73] calls the NPDA that are capable of being invoked "submachines"; the initial STATE of a submachine is known as its "entry" STATE and a submachine is invoked by the use of its entry STATE number by another NPDA. This results in the invoking STATE being saved at the top of a parser stack and the parse resumed at the new entry STATE.
Each submachine also contains one or more "exit" STATEs; when an exit STATE is reached, the top of the stack together with the particular exit STATE determine the STATE in the original invoking NPDA with which to continue the parse. An error in the parse is detected if a NPDA reads a token from the input string for which there is no corresponding STATE transition that the NPDA can make. The reader is referred to the discussion by Lomet [Lom., '73] for further technical details about transition diagrams and the class of languages accepted by transition diagrams.

3.4 Description of the Implemented Parser

A table-driven parser system has been designed that incorporates the actions described in the preceding sections of this chapter. The syntax table is actually an encoding of the transition diagrams into a small, interpretable instruction set. The syntax parser of the compiler consists of a routine that interprets this instruction set in an appropriate way. (This routine can be called the "table interpreter" or "table-driven" routine.)

The syntax parser has control over and maintains certain tables and data structures within the compiler. All of these tables and structures are located in a region of memory that is referred to as the parser storage area.

One particular variable that is maintained is the parser STATE variable. This STATE variable corresponds directly to the nodes or states that have been described for a transition diagram. As each token is accepted by the parser, it is associated with the particular STATE that the parser was in. If a token must be "uncompiled," then the parser is reset to the associated STATE for that token, that is, as if the token has not yet been parsed. The value of this STATE variable always points to some instruction in the syntax table. As the input string is parsed, this STATE variable is updated (i.e., moved to point to a different instruction in the table) to reflect the current state of the parse.
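The submachine invocation and multiple-return behaviour described in sections 3.2 and 3.3 can be illustrated with a small sketch modelled on diagrams E4 and E5 of Figure 3.5. This is a simplified Python illustration under stated assumptions, not the instruction set of the implemented syntax table: the entry STATE name "S6" for E5, the three table names, and the function `parse` are hypothetical, and each STATE is assumed to carry either token branches or a single invoking branch.

```python
# Token branches: (state, token) -> next state.
TOKEN_BRANCHES = {
    ("S6", "a"): "S7",   # inside E5 ("S6" is an assumed entry STATE name)
    ("S6", "c"): "S8",
    ("S2", "b"): "S3",   # inside E4, after return R1
    ("S4", "d"): "S5",   # inside E4, after return R2
}
# Invoking branches: state -> (entry state, {return number: resume state}).
CALLS = {"S1": ("S6", {1: "S2", 2: "S4"})}
# Exit states of the submachine: state -> return number.
EXITS = {"S7": 1, "S8": 2}
# Final states of the outermost diagram E4.
ACCEPT = {"S3", "S5"}


def parse(tokens, start="S1"):
    """Return True if the token list is accepted by the diagram system."""
    state, stack, pos = start, [], 0
    while True:
        if state in CALLS:
            # Save the invoking STATE and resume at the entry STATE.
            stack.append(state)
            state = CALLS[state][0]
        elif state in EXITS and stack:
            # The top of the stack plus the exit STATE pick the resume STATE.
            caller = stack.pop()
            state = CALLS[caller][1][EXITS[state]]
        elif pos < len(tokens):
            next_state = TOKEN_BRANCHES.get((state, tokens[pos]))
            if next_state is None:
                return False          # no matching branch: syntax error
            state, pos = next_state, pos + 1
        else:
            return state in ACCEPT and not stack
```

With these tables the sketch accepts "ab" and "cd" and rejects "ad" and "cb", because the return number recorded for the exit STATE selects which STATE of the invoking diagram is resumed.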
The parser maintains a regular parsing return-state stack which is used when one transition diagram invokes another. Entries on this stack all point to the location in the parser table where the nested transition diagram was invoked. Any changes in the parser stack are recorded in the parser-action history for the token that is currently being parsed, allowing the stack to be properly reestablished when a token is "uncompiled."

The preceding discussion in this chapter describes a parser model that is sufficiently powerful to recognize all deterministic context-free languages in time proportional to the length of the input string [Lom., '73]. A few extensions to the model have actually been included in the compiler's parser system to enable the handling of certain context-sensitive semantic requirements such as proper and consistent declaration of attributes for an identifier, consistent references to declared array identifiers, etc. These extensions include allowing auxiliary memory variables to be utilized by the parser (for operations such as counting the number of subscripts in an array reference), allowing the labels on any branch in a transition diagram to refer to any of these auxiliary variables or any symbol table field, and allowing special, language-dependent semantic-subroutines to be invoked at appropriate times during the operations of the parser system. Any changes that are made to any of these auxiliary variables or symbol table fields are recorded in the parser-action history for the token that is being parsed.

A complete description of the design and operations of the actual implemented parser can be found in [Tin., '75].

3.5 Compiler Environment when Parser Signals Error

As stated previously, a syntax error is detected by the parser if the attributes of the current input token do not match any of the labeled branches for a STATE.
When this happens, the following environment is assumed by the error diagnostic system that is described in the remainder of this thesis.

The intermediate text generated by the compiler is very simply a tokenized version of the source text. The advantages of this are that only one copy of the program need be stored, "reverse compilation" or "uncompiling" can be implemented easily without complicated reversal of code generation, and the original program can be displayed at any time simply by examining the intermediate text. As each token is output from the lexical analyzer, it is entered into the rightmost end of the intermediate text. Thus, if a syntax error is detected, the last token in the intermediate text is always the illegal one that the parser could not accept. In addition, no right context is available for examination; only the intermediate text entries up to and including the error token are available.

The parser-action history is a vector that is associated with the intermediate text. Each entry in the history records some change that occurred in the parser environment. A correlation is maintained between entries in the history and tokens in the intermediate text.

When a syntax error is detected, the current parser STATE is available, as well as the history of all STATE changes that have occurred for the previous tokens in the intermediate text. Similarly, the current parser return-state stack is available, as well as the history of all stack changes that have occurred. Finally, the current symbol table and the auxiliary parser storage variables that are used by the parser for the particular language are available; all changes to these data areas are recorded in the history.

This concludes the summary of the parser environment that the error analysis routines can work with.
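The relationship between the intermediate text, the parser-action history, and "uncompiling" can be made concrete with a short sketch. The Python below is an illustration only, under the assumption that each history entry stores the token position it belongs to together with enough information to undo one change; the class `ParserEnv` and all of its method names are hypothetical and are not taken from the implemented compiler.

```python
_MISSING = object()  # marks an auxiliary variable that did not exist before


class ParserEnv:
    """Hypothetical parser environment with an undoable action history."""

    def __init__(self, start_state):
        self.state = start_state
        self.stack = []      # return-state stack
        self.aux = {}        # auxiliary parser variables
        self.text = []       # intermediate text (tokenized source)
        self.history = []    # entries: (token index, kind, payload)

    def begin_token(self, token):
        # Each new token is entered at the rightmost end of the text.
        self.text.append(token)

    def _log(self, kind, payload=None):
        self.history.append((len(self.text) - 1, kind, payload))

    def set_state(self, new_state):
        self._log("state", self.state)
        self.state = new_state

    def push(self, return_state):
        self._log("push")
        self.stack.append(return_state)

    def pop(self):
        value = self.stack.pop()
        self._log("pop", value)
        return value

    def set_aux(self, key, value):
        self._log("aux", (key, self.aux.get(key, _MISSING)))
        self.aux[key] = value

    def uncompile(self):
        """Remove the last token and undo every change recorded for it."""
        index = len(self.text) - 1
        while self.history and self.history[-1][0] == index:
            _, kind, payload = self.history.pop()
            if kind == "state":
                self.state = payload
            elif kind == "push":
                self.stack.pop()
            elif kind == "pop":
                self.stack.append(payload)
            elif kind == "aux":
                key, old = payload
                if old is _MISSING:
                    self.aux.pop(key, None)
                else:
                    self.aux[key] = old
        return self.text.pop()
```

Undoing the entries in reverse order restores the STATE, the return-state stack, and the auxiliary variables to the values they held before the token was parsed, which is the effect the text attributes to "uncompiling" a token.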
Chapter 4

A MODEL FOR DIAGNOSTIC INTERACTION WITH THE PROGRAMMER

4.1 Goals for Diagnostic Interaction

This chapter describes a useful model of the interaction that can occur between a programmer who has attempted to enter a syntactically illegal token into the compiler and the error diagnostic system that is trying to describe the error to the programmer.

Because the programmer is assumed to be relatively inexperienced at writing and debugging programs, the diagnostic system must be very simple to use. All of the actions taken by the diagnostic system must be transparent to the programmer. In particular, any messages that are displayed for the programmer must be simple to understand and must be stated using the terminology that the programmer is familiar with. Concepts such as "statements," "subscripts," "expressions," etc., which are important in the programming language and which are presumably recognized by the programmer, should be referred to when appropriate by the diagnostic system.

In addition, the diagnostic system should make use of the fact that the programmer is at a timesharing terminal and thus can be used to control some aspects of the error analysis and recovery operations. In particular, the programmer should have some control over the level of detail that is given in diagnostic messages. If a "numeric expression" is referred to in a message, for example, the programmer may or may not be interested in more detail about the form of a "numeric expression"; while the more detailed information should always be available, it should not always necessarily be presented.

Likewise, as opposed to most batch-oriented compilers, the interactive compiler's diagnostic system needs to make no attempt to actually "correct" the error or to modify the program and "recover" by itself.
Since the programmer is directly available, the system should attempt to explain the error to him, requiring however that he be the one to actually back up, edit, and repair the program.

4.2 Suggesting "Possible Corrections" for the Programmer

To satisfy the goals for diagnostic interaction just mentioned, we propose a diagnostic system that interacts by presenting various suggestions of "possible corrections" of the program to the programmer. After a suggestion has been presented, the programmer then has control to either request that a different or more detailed suggestion be made by the system or else to return from the diagnostic system back to the editor system and proceed to modify the input and attempt to correct the error. This technique is very simple for the programmer to use since the only interaction that is required from him is that he examine the suggestions that are made and either request other suggestions or request the return to the editor system.

The example shown in Figures 4.1(a) - (d) illustrates the types of suggestions that can be made by the diagnostic system. It is an example of a short PL/I program segment in which the illegal token "THEN" has been entered at the end of an assignment statement by the programmer. Each successive display (a) - (d) was requested by the programmer via the "HELP" key on the PLATO keyboard. In all cases, upon detecting the "HELP" request, the diagnostic system erased the current suggestion message that was being displayed and presented the next message to the programmer; the program itself was always in view on the screen, with only the diagnostic messages being changed.

    [Screen: PL/I workspace showing program TEST10; the illegal token "THEN" is highlighted.]
    ********** ERROR **********
    (BACK) or (ERASE) to fix.
    This reserved word is not permitted here.
    Press (HELP) for more information.
Figure 4.1(a) - Example of suggestion messages

    [Screen: same program; the illegal token "THEN" is highlighted. In the messages, [] stands for the highlighted token.]
    ********** Possible Correction **********
    (BACK) or (ERASE) to fix.
    Replace [] with an arithmetic operator.
    Press (HELP) for a different suggestion.

Figure 4.1(b) - Example (continued)

    [Screen: same program; the illegal token "THEN" is highlighted.]
    ********** Possible Correction **********
    (BACK) or (ERASE) to fix.
    Replace [] with "[illegible]".
    Press (HELP) for a different suggestion.

Figure 4.1(c) - Example (continued)

    [Screen: same program; the highlight has moved to the start of the assignment statement.]
    ********** Possible Correction **********
    (BACK) or (ERASE) to fix.
    Insert a reserved word "IF" in front of [].

Figure 4.1(d) - Example (continued)

Note also that all of the available options for the programmer are displayed with each message: in this example, the "HELP" key is active for each message as well as the regular compiler editing "BACK" and "ERASE" keys which allow the programmer to modify and correct the program.

A further example of the control that the programmer has over the suggestions is given in Figures 4.2(a) - (e). In this PL/I example, the programmer has attempted to declare identifier "J" with conflicting attributes. (The "VARYING" attribute is associated with character strings in PL/I, not decimal variables.)
The suggestion given in Figure 4.2(b) includes a different option for the programmer: if another suggestion is needed, the programmer can either request to see more detail about "attributes" that are consistent with "DEC" (the detailed suggestions are shown in Figures 4.2(c) and (d)), or else can request that some other different suggestions be made (Figure 4.2(e)). Note also that if the programmer elects to see more detail, then following the suggestion in Figure 4.2(d) (or by requesting a different suggestion in Figure 4.2(c)), the next suggestion is the one shown in Figure 4.2(e); therefore the programmer will not miss seeing any suggestions because of electing to see more detail on an earlier suggestion. Further examples are given in Appendix II.

    [Screen: PL/I workspace showing program TEST; the illegal token "VARYING" following "DEC" is highlighted.]
    ********** ERROR **********
    (BACK) or (ERASE) to fix.
    This attribute is not permitted here.
    Press (HELP) for more information.

Figure 4.2(a) - "Detail" suggestion messages example

    [Screen: same program; the illegal token "VARYING" is highlighted. In the messages, [] stands for the highlighted token.]
    ********** Possible Correction **********
    (BACK) or (ERASE) to fix.
    Replace [] with an attribute.
    Press (HELP) for a different suggestion.
    Press ([illegible] HELP) to see a legal attribute.

Figure 4.2(b) - "Detail" example (continued)

    [Screen: same program; the illegal token "VARYING" is highlighted.]
    ********** Possible Correction **********
    (BACK) or (ERASE) to fix.
    Replace [] with an attribute "FLOAT".
    Press (HELP) for a different suggestion.
    Press ([illegible]) to see another legal attribute.

Figure 4.2(c) - "Detail" example (continued)
    [Screen: same program; the illegal token "VARYING" is highlighted.]
    ********** Possible Correction **********
    (BACK) or (ERASE) to fix.
    Replace [] with an attribute "FIXED".
    Press (HELP) for a different suggestion.

Figure 4.2(d) - "Detail" example (continued)

    [Screen: same program; the illegal token "VARYING" is highlighted.]
    ********** Possible Correction **********
    (BACK) or (ERASE) to fix.
    Replace [] with "[illegible]".
    Press (HELP) for a different suggestion.

Figure 4.2(e) - "Detail" example (continued)

4.3 Algorithmic Notation

Short algorithms will be presented throughout this thesis. The notation that will be used in stating these algorithms will be a readable, Algol-like language that contains whatever language constructs are necessary to most clearly illustrate the particular algorithm being presented. The left arrow "←" is the assignment operator. Typical control constructs include an "if - then - else" statement (which is terminated with a "fi" in all cases to avoid ambiguity), a looping construct (generally "loop...while...repeat"), and a procedural "begin - end" construct. Also, in many cases, certain operations in the algorithm will not be refined explicitly when the algorithm is given; these undetailed operations include both references to other procedures in the error analysis system (denoted with double slashes "//" around a description of the procedure) and also local operations that must be performed (denoted with single slashes "/" around a description of the effect of the local actions).

The notation used in these algorithms is not intended to be rigorously defined; the goal is for the operation and effect of the algorithm to be stated as clearly as possible.

4.4 Algorithm for Diagnostic Interaction

An algorithm that provides a suitable model for the diagnostic interaction described in this chapter is given as Algorithm 'USER_INTERACTION' in Figure 4.3.
This algorithm is purposely simple and nondetailed; Chapter 5 will more fully describe a detailed implementation of this algorithm.

    begin USER_INTERACTION
       /initialize error system environment/
       //display: signal error to user//
       loop:
          //prepare new suggestion message//
          comment "Read-edit-key" waits until an edit key or the
                  "HELP" key is pressed; the key-type is returned
                  in "Read-key."
          /Read-edit-key/
       while Read-key = "HELP":
          //erase: current message on screen//
          //display: new message on screen//
       repeat:
       /reestablish program editing environment/
       /return to program editing mode/
    end USER_INTERACTION

Figure 4.3 - Algorithm 'USER_INTERACTION'

Chapter 5

DETERMINING "POSSIBLE CORRECTION" SUGGESTIONS

5.1 Introduction

The largest problem to be solved in algorithm "USER_INTERACTION" (shown in Figure 4.3) is that of determining what "possible correction" suggestions can be made to the programmer as diagnostic help is requested. This chapter first describes a preliminary model of an algorithm for determining suggestions that is similar to the algorithm suggested by Levy [Lev., '71]. Then a more sophisticated algorithmic model is given and shown to produce more comprehensive suggestions than the preliminary model.

5.2 A First Model for Determining "Possible Correction" Suggestions

5.2.1 J. P. Levy

A thesis by J. P. Levy [Lev., '71] defined and studied a number of problems associated with automatic error correction techniques. If the "error token" is defined to be the input token at which a syntactic error is detected, then the "actual syntactic error" that has occurred is not necessarily located at the error token, but may in fact be located someplace to the left of the error token. Levy proved (Theorem, section III.3) that the actual error may not even be located "close" to the error token, but could have occurred an unbounded distance to the left of that token.
However, for any particular error situation, Levy showed it is possible to find a location in the input string such that the actual syntactic error could not have occurred before that location: this location is defined as the "left context" of the error. (Locating this place in the input string, which is called a "beacon," is a difficult problem and Levy devotes much of his thesis to its solution.) The fact that such a "left context" location exists is trivially true, since the extreme limit for its location is the beginning of the input string; however, Levy shows that it is possible for the left context location to be located much closer to the error token than the beginning of the input string in many situations.

Then Levy defines "error interpretations" of the input string to be syntactically correct strings that are similar to the original input string, but contain a few changes or "corrections"; if the number of changes made to the original string to arrive at an interpretation is K, then the original string is said to have K errors. He shows that all interpretations with K changes will become "equivalent" at some location to the right of the error point if it is possible to interpret the string correctly with only K changes; this equivalence means that the parser has settled into a common state for all of the interpretations at that point, and thus the parse for the rest of the string is similar for all interpretations.

Levy defines three operations that are used in building the error interpretations: "R," "A," and "S." The "R" operation specifies that, at each point in the input string, alternate parses (interpretations) should be created by replacing the actual input token with each of the tokens that would be syntactically acceptable at that point. The "A" operation specifies that, at each point, alternate parses should be created by adding to the input string each of the tokens that would be syntactically acceptable.
Finally, the "S" operation specifies that, at each point, an alternate parse should be created by suppressing from the input string the actual input token that was there. These three operations can be conveniently expressed as follows: if the input string at some point is expressed as $x\alpha$, where $x$ is the first token not read by the parser and $\alpha$ is a string containing the remainder of the input tokens, then if $t$ is an acceptable token at this point, the "R" operation changes the input to

    $R_t(x\alpha) = t\alpha$,

the "A" operation changes the input to

    $A_t(x\alpha) = tx\alpha$,

and the "S" operation changes the input to

    $S_t(x\alpha) = \alpha$.

Levy proves that all syntax errors can be corrected by some combination of these three operations.

Finally, Levy describes an error correction algorithm that operates as follows: following the detection of an error, back up in the input string until the left context location is reached; then reparse the input string from this point forward creating all possible modified interpretations of the input string with N or fewer changes (for some fixed N) until all of the interpretations are equivalent, that is, until the right context for the error is found; then select one of the interpretations with the smallest number of changes as the "correction" for the program.

5.2.2 Modifications to Levy's Analysis

The algorithm described by Levy can be modified to make a good preliminary model for determining "correction" suggestions when used with a transition diagram parsing system. However, a number of important changes must be made to apply Levy's basic ideas in an interactive environment. For one thing, in Levy's system, all of the interpretations had to be initially computed for the error situation, and then a "best" interpretation selected and used to correct the program.
However, in an interactive environment, it is not necessary for the error system to guess which error interpretation is the "best" interpretation; the interpretations can be given successively to the programmer, and the programmer can then select the appropriate interpretation and use it to correct the program. In this way, the error system is acting not so much like an automatic error correction and recovery system as like an automatic error "consultant." When an error is detected by the compiler, the error system then provides the programmer with the different possibilities for correcting the error.

Since the different error interpretations are going to be given successively to the programmer, it is not necessary for all of the interpretations to be computed immediately upon the detection of an error by the compiler. In an interactive environment, it is important to try to spread the computing load out over a period of time if possible, and to avoid heavy bursts of calculations in short periods of time. Also, since the programmer will frequently examine more than one error interpretation for an error situation, it is desirable that the suggested corrections to the program be given to the programmer in an organized order.

It is proposed that instead of the error system initially moving back in the input string to the left context location and then starting the computations of the error interpretations, the error system should compute the interpretations in the reverse order. Initially, the error system will compute all error interpretations assuming that the entire input string up to the error token is correct and the error token itself is in error. As different interpretations are found, they are presented to the programmer as in algorithm "USER_INTERACTION" (Figure 4.3). When no more interpretations can be found from this input location, then all interpretations (if any) will be computed from one token farther back in the input string, and so on.
At each of these new locations the algorithm makes the assumption that the error token itself is correct and that the "actual syntax error" occurred someplace back in the input string.

The backup process should terminate when a token is reached in the input string where it is certain that the error actually occurred after that token. Levy defines this left context location as a "beacon," and he thoroughly discusses the problems associated with locating the beacon for a given error situation. The reader is urged to consult this work for the details; this current thesis simply assumes that such a "beacon" location can be found.

This algorithm is intuitively good from the programmer's point of view, since the more "local" corrections are suggested first, followed later by more "global" suggestions. From a computational point of view, this algorithm is also by far the best approach. It is very easy to compute the initial error interpretations, but much more time-consuming to calculate interpretations with modifications that occur far back in the input string. The hope is that usually the initial interpretations and suggestions will satisfy the programmer and, while always available, only occasionally will more complex interpretations need to be determined.

Finally, note one more major departure from Levy's system. In the interactive compiling environment that was described in the previous chapters, there is no right context available when an error is detected and analyzed. Therefore, although the input context to the left of the error token is still available and important, there is no context to the right of the error token in the input string. Because of this, the error interpretations as described by Levy are "equivalent" in the interactive environment when all of the input string up to and including the error token has been considered.
Another way of describing this is to say that the error system can compute error interpretations of the input string that just consider the left context and the actual error token; it is then the responsibility of the programmer to (mentally) supply the intended right context (that has not yet been entered into the program) to decide if a particular error interpretation constitutes the appropriate correction to the program. Thus, the overall diagnostic system in an interactive environment requires the joint participation of both the analysis system's interpretation routine and the analysis of the programmer to satisfactorily diagnose the syntax error.

5.2.3 Token Option Lists

Before the preliminary model of the "suggestion builder" routine can be given, one more important routine must be mentioned. It is possible, for each state in a transition diagram, to construct a list containing all of the tokens that are acceptable to the parser if it is in that state. This list corresponds to what Griffiths [Gri., '74] has called a "director symbol set" in LL(k) terminology; it will be called the "token" option list in this thesis, and the routine that builds the option list for a state will be called the OPT routine.

It is very important to realize that the acceptable token list for a state depends not only on just the transition diagrams themselves, but also on a particular complete parser environment; the token list for a state will vary depending on other elements of the parser's environment. If all of the branches in the transition diagram that leave state S are labelled with just token specifications, then those tokens comprise the entire list for state S. However, if one of the branches invokes another transition diagram (with entry state Q), then any simple tokens that are specified for the original state, S, must be added to the list, and then the entry state Q for the invoked transition diagram must be examined as above.
Likewise, if state S is a final state for some transition diagram, then any simple tokens that are specified for state S must be added to the list, and then the current transition diagram exited according to the parser's stack at that time. Eventually, all of the possible tokens will be resolved and added to the list, since the parser accepts only actual tokens in the input string.

    begin transit (T, S)
        comment T is an input token, and S is a parser state.
        comment transit is a function that returns a value. The value
                is a new state number if a parser state transition from
                STATE S with token T can be made. Otherwise the value
                is "null."
        comment The function is not detailed here since the explicit
                actions depend directly upon the compiler's
                implementation.
    end transit

Figure 5.1 - The "transit" function

5.2.4 The Preliminary Model

Given that this token option list can be determined for any state, the preliminary algorithm consists of three routines: "MODIFY," "REPARSE," and "USER_INTERACTION." MODIFY will make a modification to the input string and invoke REPARSE to determine whether the corrected string is then acceptable. If so, REPARSE will invoke USER_INTERACTION; otherwise it will recursively invoke MODIFY to make further modifications if the current interpretation is still not acceptable.

Both MODIFY and REPARSE use a function "transit." A description of the effect of the transit function is given in Figure 5.1. The function accepts two parameters: T, an input token, and S, a state of the parser. The function checks to see if a legal parser state transition can be made from STATE S with the token T. If so, then transit returns the value of the new state number; otherwise transit returns a "null" value.
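The OPT routine and the transit function described above can be sketched as follows. This is only an illustrative sketch, not the implemented PLATO IV code: the `State` record, the `STATES` table, the state names P1-P4 and Q1-Q3, and the choice to pass the parser stack explicitly as a tuple of return states are all assumptions made for the example. Unlike the two-parameter transit of Figure 5.1, the sketch returns the new stack along with the new state, and it assumes the diagrams are not left-recursive.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    tokens: dict = field(default_factory=dict)  # token -> next state
    calls: list = field(default_factory=list)   # (entry state of invoked diagram, return state)
    final: bool = False                         # final state of its diagram?


# Hypothetical diagrams: MAIN accepts "a" SUB "d"; SUB accepts ("b" | "f") "c".
STATES = {
    "P1": State(tokens={"a": "P2"}),
    "P2": State(calls=[("Q1", "P3")]),
    "P3": State(tokens={"d": "P4"}),
    "P4": State(final=True),
    "Q1": State(tokens={"b": "Q2", "f": "Q2"}),
    "Q2": State(tokens={"c": "Q3"}),
    "Q3": State(final=True),
}


def token_options(states, state, stack=()):
    """Collect every token acceptable in `state`, given the parser stack."""
    options, seen = [], set()

    def visit(s, stk):
        if (s, stk) in seen:
            return
        seen.add((s, stk))
        node = states[s]
        for tok in node.tokens:                 # simple token branches
            if tok not in options:
                options.append(tok)
        for entry, ret in node.calls:           # branch invoking another diagram
            visit(entry, stk + (ret,))
        if node.final and stk:                  # exit diagram according to the stack
            visit(stk[-1], stk[:-1])

    visit(state, tuple(stack))
    return options


def transit(states, token, state, stack=()):
    """Return (new_state, new_stack) if `token` is acceptable, else None."""
    stack = tuple(stack)
    node = states[state]
    if token in node.tokens:
        return node.tokens[token], stack
    for entry, ret in node.calls:
        result = transit(states, token, entry, stack + (ret,))
        if result is not None:
            return result
    if node.final and stack:
        return transit(states, token, stack[-1], stack[:-1])
    return None
```

In this sketch the options for the hypothetical state P2 come entirely from the entry state of the invoked diagram, and the options for the final state Q3 depend on the return state held on the stack, which is the dependence on the parser environment noted in Section 5.2.3.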
The MODIFY routine has three input parameters: L, a pointer to some location in the input string (indexed from the beginning of the string), Q, the corresponding parser state that is expecting the token at location L, and K, a counter indicating the number of changes that have currently been made to the input string. When the MODIFY routine is activated, first the token option list for state Q is constructed. Then, for each token option in the list, two modifications can be made: Add the token option to the input string in front of the token at location L, or Replace the token at location L with the token option. After the above modifications have been tried for each token option, a final modification consisting of Suppressing the actual token at location L from the input string can be tried.

Further describing these operations, if the MODIFY routine tries an Add operation, it will try to add the token option to the input string in front of the token at location L; if the original input string is [...]

        [...] > error_token_location then
            "success";
        else
            comment Extend reparse attempt 1 token if next token is
                    acceptable.
            if transit (token (LR), QR) ≠ "null" then
                //REPARSE (LR + 1, transit (token (LR), QR), KR)//
            fi
            comment Make another modification at location LR, if
                    possible, to build all legal interpretations.
            if KR < N then
                //MODIFY (LR, QR, KR)//
            fi
        fi
    end REPARSE

Figure 5.3 - REPARSE routine

5.2.5 Examples using the Preliminary Model

As an example of the analysis performed by the preliminary model that has been presented, consider the transition diagram E1 shown in Figure 3.1. For transition diagram E1, Table 5.1 shows the token options that can be computed for each of the different states. A few specific unacceptable input strings, along with some possible interpretations for these strings, are shown in Table 5.2.
This table gives the original token input string and then a number of different "corrected" strings; for each "corrected" string interpretation, the number of modifications required to obtain the error interpretation is given, as well as a sample "suggestion" message that could be used to describe the modifications that were made.

    STATES:         S1, S2, S3, S4, S5, S6, S7
    TOKEN OPTIONS:  "b"; "b," "d"; "c"; "d"; "e"; none; [one entry illegible]
    [The alignment of the options with the states is not recoverable from the scanned source.]

Table 5.1 - Token Options for States of Transition Diagram E1

[Table 5.2: original input strings for transition diagram E1, with corrected strings, number of modifications, and sample suggestion messages; contents illegible in the scanned source.]

    STATE    TOKEN OPTIONS
    S1       "a," "b"
    S2       "f," "b"
    S3       "d"
    S4       "e"
    S5       "c"
    S6       "f," "b"
    S7       "g"
    S8       "c"
    S9       "d"
    S10      none

Table 5.3 - Token Options for States of Transition Diagrams E2 and E3

As a second example of the analysis performed by the diagnostic model, consider the more complex transition diagrams E2 and E3 shown in Figure 3.3. Table 5.3 shows the different token options that can be computed for each state in the diagrams; notice that to obtain the options for state S2 requires examining the token options for the entry state for diagram E3, state S6, and similarly to obtain the options for state S9 requires returning from diagram E3 back to state S3, and then examining the token options for state S3.
A few unacceptable input strings for the diagrams E2 and E3, along with some possible error interpretations for these strings, are shown in Table 5.4.

[Table 5.4: unacceptable input strings for transition diagrams E2 and E3, with possible error interpretations and suggestion messages; contents illegible in the scanned source.]

5.3 Extended Suggestion Model

5.3.1 Using Nonterminal Information

One of the objectives of an interactive diagnostic system is to provide for the programmer as much information about the state of the parse and the internal state of the compiler as is possible. The preliminary suggestion model presented earlier in this chapter (Figures 5.1 and 5.2) is able to provide much information about modifying individual tokens in the input string to correct the error. However, no information is available in the preliminary diagnostic model about higher-level language features such as "expressions," "subscripts," "statements," etc., even though the transition diagram parser system that is being used in the compiler provides and uses this information.

In Chapter 3 of this thesis, the action of a transition diagram parser is described as making a state transition from a current state to some next state, depending on the input token and the labeled branches for the current state. A regular parser stack is used to allow the invoking of "sub" transition diagrams when specified on a branch; this operation corresponds to attempting to parse a "nonterminal" in the programming language, and these nonterminals typically represent higher-level language constructs in the language.
Therefore, for any state in the parser, it is possible to determine not only what actual tokens are acceptable parsing options, but also what nonterminals, if any, are being sought. It is possible, for any state, to build a list containing all token options and all intermediate nonterminals that are acceptable to the parser. The algorithm for this is almost the same as the algorithm for the token option builder that was described in section 5.2.3; the only difference is that when a branch for a state refers to a nonterminal, then first that actual nonterminal is added to the list, and then the nonterminal's entry state is examined. This new list will be called the general option list.

It is also possible to view the input string either as the string of actual tokens that have been input, or else as a string of tokens and nonterminals that have been parsed. Using the example E3 given in Figure 3.3, the input string "abcd" can be viewed either as "a b c d" or as "a E3 d."

To use these new ways of viewing the input string and the requirements of the parser for any state, it is necessary to change the activities of the MODIFY routine. In the preliminary version of MODIFY the idea was, for each token option in the token option list for a state, to create alternate interpretations of the input by applying the "A" and the "R" operations, and subsequently the "S" operation, with respect to the next token in the input string. In the new version, we want to create alternate interpretations of the input by applying, for each general option in the general option list, the "A" and the "R" operations, and subsequently the "S" operation, with respect to both the next token and the next nonterminal (if any) in the input string.

More formally, if the input string is x β a and the nonterminal B is B = x β, then the input can be viewed as x β a or B a, for state Q.
The "A" operation, for some general option, g, changes the input to

    A_g ( x β a ) = g x β a
          (Q)       (Q)

[...]

        [...] error_token_location then
            if KR = N then
                comment Reparsing was successful. Invoke routine
                        USER_INTERACTION, which will either return
                        with user_request set indicating the amount of
                        detail the programmer wants, or else will cause
                        the error system process to be terminated
                        without returning normally.
                //USER_INTERACTION (user_request)//
            fi
        else
            comment Extend reparse attempt 1 token if next token is
                    acceptable.
            if transit (token (LR), QR) ≠ "null" then
                //REPARSE (LR + 1, transit (token (LR), QR), KR, user_request)//
            fi
            comment Make another modification at location LR, if
                    possible, to build all legal interpretations.
            if KR < N then
                //MODIFY (LR, QR, KR)//
            fi
        fi
    end REPARSE

Figure 5.6 - Final REPARSE routine

    begin MODIFY (L, Q, K)
        comment L is the input location pointer (indexed from the
                beginning of the input string), Q is the parser state
                number for token at location L, K is the number of
                modifications that have been made prior to current
                activation.
        comment OPT routine constructs the general_option_tree. It is a
                tree structure, with descendants of a node representing
                further refinements or detail for the node.
        //OPT: build general_option_tree for state Q with root 'root'//
        /Lnt ← location in input following the nonterminal that begins
         with token at location L; if no nonterminal begins there,
         set Lnt ← 0/
        comment Try Add, Replace operations for sons of root of tree.
        //MOD_TREE (L, Q, K, Lnt, root)//
        comment Prune special case for Suppress operation.
        if L ≠ error_token_location then
            comment Token Suppress operation.
            //REPARSE (L + 1, Q, K + 1, last_try)//
            comment Does nonterminal exist?
            if Lnt ≠ 0 then
                comment Nonterminal Suppress operation.
                //REPARSE (Lnt, Q, K + 1, last_try)//
            fi
        fi
    end MODIFY

Figure 5.7 - Final MODIFY routine

    begin MOD_TREE (LD, QD, KD, LntD, Droot)
        comment LD is the input location pointer, QD is the parser
                state number for token at location LD, KD is number of
                previous modifications, LntD is location in input
                following nonterminal or 0 if no nonterminal exists,
                Droot is the root of the (sub) tree in
                general_option_tree.
        comment MOD_TREE tries Add and Replace (token and nonterminal)
                operations for each son (option) of Droot.
        comment REPARSE routine's 4th parameter is set to
                1) 'unsuccessful': if previous modification did not
                   completely "correct" input string, or
                2) 'successful-provide detail' or
                   'successful-restrict detail': if previous
                   modification "corrected" input string, but
                   programmer requested another suggestion.
        for all g ∈ sons (Droot) loop:
            comment Add operation.
            //REPARSE (LD, transit (g, QD), KD + 1, last_try)//
            comment Is more detail requested?
            if last_try = 'successful-provide detail' then
                comment Give more detail: move depth-first through tree.
                //MOD_TREE (LD, QD, KD, LntD, g)//
            fi
            comment Prune special case for Replace operation.
            if (LD ≠ error_token_location or last_try = 'unsuccessful')
               and last_try ≠ 'successful-restrict detail' then
                comment Token Replace operation.
                //REPARSE (LD + 1, transit (g, QD), KD + 1, last_try)//
                comment Is more detail requested?
                if last_try = 'successful-provide detail' then
                    comment Give more detail: move depth-first through
                            tree.
                    //MOD_TREE (LD, QD, KD, LntD, g)//
                fi
                comment Does nonterminal exist?
                if LntD ≠ 0 and last_try ≠ 'successful-restrict detail' then
                    comment Nonterminal Replace operation.
                    //REPARSE (LntD, transit (g, QD), KD + 1, last_try)//
                    comment Is more detail requested?
                    if last_try = 'successful-provide detail' then
                        comment Give more detail: move depth-first
                                through tree.
                        //MOD_TREE (LD, QD, KD, LntD, g)//
                    fi
                fi
            fi
        repeat
    end MOD_TREE

Figure 5.8 - Final MOD_TREE routine

5.3.3 Example using the Extended Model

As a comparative example of the error interpretations that can be determined using the Extended Diagnostic Model that has been described rather than the preliminary Token Model, consider again the transition diagrams E2 and E3 given in Figure 3.3. Table 5.5 shows the different general option trees that can be computed for each state in the diagrams. Notice the nonterminal reference for state S2, followed by the more detailed token options for that state.

    STATE    GENERAL OPTION TREES
    S1       [tree illegible in the scanned source]
    S2       [tree illegible in the scanned source]
    S3       "d"
    S4       "e"
    S5       "c"
    S6       "f," "b"
    S7       "g"
    S8       "c"
    S9       "d"
    S10      none

Table 5.5 - General Options for States of Transition Diagrams E2 and E3

Table 5.6 shows a few unacceptable input strings for the diagrams E2 and E3, along with some possible error interpretations for those strings and a comparison indicator for the preliminary Token Model.

[Table 5.6: unacceptable input strings for transition diagrams E2 and E3, with error interpretations, suggestion messages, and a comparison with the preliminary Token Model; contents illegible in the scanned source.]

As another example of some of the suggestions that can be made using the extended suggestion model, consider transition diagrams E6 through E9, shown in Figure 5.9.
The acceptable input strings for these diagrams are:

    "a c b a e"
    "a c d e"
    "a d e"
    "b a c e b a"
    "b a c e d"
    "d c e b a"
    "d c e d"

The general option trees that can be computed for each state in the diagrams are shown in Table 5.7. Note that for some of the states in this table, particularly states S16 and S17, the general option tree depends on not only the transition diagram itself, but also on the parser's stack at the time when the tree is constructed. Each of these states is a final state for one of the transition diagrams, and therefore it is necessary to know from where that diagram was invoked in order to build the tree properly. Table 5.8 shows a few unacceptable input strings for the diagrams E6 - E9, along with some possible error interpretations for those strings and some sample suggestion messages for these interpretations.

[Figure 5.9: transition diagrams E6 through E9; illegible in the scanned source.]

[Table contents for states S1 through S20 illegible in the scanned source; the trees for final states are qualified by the invoking state, e.g. "called from state S1."]

Table 5.7 - General Options for Diagrams E6 - E9

[Table 5.8: unacceptable input strings for transition diagrams E6 - E9, with error interpretations and sample suggestion messages; contents illegible in the scanned source.]

[...]

        [...] beacon_location :
            comment Initiate reparse attempt from new location
            /back up 1 token in the input string/
            /loc ← new input location/
            /state ← corresponding state/
        repeat
        repeat
        comment No more help is available.
        /Report to user that no more help is available/
        /reestablish the compiler's editing environment/
    end ERROR_SYSTEM_CONTROL

Figure 5.10 - ERROR_SYSTEM_CONTROL routine

Chapter 6

CONCLUSIONS AND FUTURE RESEARCH

6.1 Summary

This thesis describes an interactive compiler diagnostic system. It is assumed that the compiler will be used by elementary programmers, and the responsibility of the diagnostic system is to behave like a "consultant" to these programmers with respect to syntactic errors. It is important that this consulting system be very easy to use and interact with so as to avoid confusing or frustrating the programmer. In addition, it is important that any diagnostic messages that are given to the programmer be very clearly stated using terminology with which the programmer is familiar.

The diagnostic consulting system communicates with a programmer about an error by suggesting "possible corrections" that can be made to the program to yield a valid syntactic prefix in the programming language. These suggestions are determined by examining the program input string, the parser's internal state information, and the parser's table; this implies a programming-language-independent diagnostic system. The only information required by the diagnostic system in addition to this normal compiler information is an "English phrase" table which provides readable English character strings that can be used to describe various pieces of parser information to the programmer.
The system operates by successively moving farther and farther to the left of the position in the input string where the original error was detected; at each successive position, a modification to the input string is made (replacement, addition, or suppression) and the remainder of the input up to and including the original error token is reparsed. If no new errors are detected during this reparsing, then the modified input string is syntactically correct and a "possible correction" suggestion message can be given to the programmer that simply describes what modifications were made. If a new error is detected during the reparsing, then additional or different modifications must be made to the input string and the reparsing attempted again.

Unlike previous systems, the modifications made to the input string are concerned not only with simple tokens, but also with any nonterminals that are acceptable at that position in the parse. These nonterminals generally represent higher-level constructs in the programming language, and thus suggestion messages can be given using this higher-level terminology. A further advantage of these higher-level suggestions is that successive suggestions will frequently provide more and more detail about the particular nonterminal until the detail reaches the actual "token" level; thus, the suggestions tend to at first provide higher-level information, followed by successively more refined and detailed information for the programmer. It is very easy, in addition, to provide the programmer with some control over the amount of detail that is presented; the programmer can, at any time, request that the successively refined suggestions not be presented, but that other, different suggestions be given instead (if possible).
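The leftward search summarized above can be sketched in a few lines. This is a simplified illustration under several assumptions that are not part of the thesis: the parser is reduced to a single table of (state, token) transitions with no nonterminals, only one modification is made per interpretation, and the names `parse`, `options`, `suggestions`, and the table `T` are hypothetical. The two pruning rules at the error token location follow the special cases noted in the MODIFY and MOD_TREE routines.

```python
def parse(transitions, tokens, state=0):
    """Return the state reached after `tokens`, or None on a syntax error."""
    for tok in tokens:
        state = transitions.get((state, tok))
        if state is None:
            return None
    return state


def options(transitions, state):
    """Token option list: every token with a transition out of `state`."""
    return [tok for (s, tok) in transitions if s == state]


def suggestions(transitions, tokens, beacon=0):
    """Lazily yield single-modification corrections, most local first.

    `tokens` is the input up to and including the error token.
    Each result is (operation, location, option, corrected_tokens).
    """
    error_loc = len(tokens) - 1
    for loc in range(error_loc, beacon - 1, -1):      # move leftward
        state = parse(transitions, tokens[:loc])
        if state is None:
            continue
        for g in options(transitions, state):
            added = tokens[:loc] + [g] + tokens[loc:]
            add_ok = parse(transitions, added) is not None
            if add_ok:
                yield ("add", loc, g, added)
            # Prune: at the error token, Replace is tried only if Add failed.
            if loc != error_loc or not add_ok:
                replaced = tokens[:loc] + [g] + tokens[loc + 1:]
                if parse(transitions, replaced) is not None:
                    yield ("replace", loc, g, replaced)
        # Prune: suppressing the error token itself is never suggested.
        if loc != error_loc:
            suppressed = tokens[:loc] + tokens[loc + 1:]
            if parse(transitions, suppressed) is not None:
                yield ("suppress", loc, None, suppressed)


# Hypothetical table accepting "a b c d" and "a f g d".
T = {(0, "a"): 1, (1, "b"): 2, (2, "c"): 3, (3, "d"): 4,
     (1, "f"): 5, (5, "g"): 3}
```

Writing the search as a generator means each suggestion is computed only when the next one is requested, so the cheap local interpretations are available first and the more expensive ones farther back in the input are computed only on demand.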
6.2 Evaluation

6.2.1 Implementation Performance

As mentioned in Chapter 1, one of the key objectives of this research was to create the first interactive diagnostic consulting system by actually implementing on the PLATO IV system the algorithms that have been described. The entire diagnostic system was written in TUTOR [She., '74], a Fortran-level language supported by PLATO. TUTOR is a very good language for input/output: it contains very powerful facilities for accepting and examining keyboard input from the interactive terminal and for preparing arbitrarily complex graphics displays on the terminal's screen. However, it is fairly primitive with respect to program control constructs and memory management; in particular, it does not support dynamic storage allocation, and therefore recursion is not available. Also the subroutine return-address stack is only ten levels high, a limit that is easy to exceed in a complex TUTOR program. The implementation was completed, however, in approximately 6 months, of which about 3 months were full-time (summer) and the remainder part-time.

Because of the nonrecursive limitation of TUTOR, the actual prototype system was restricted somewhat from the algorithms described in Chapter 5. The most important restriction is that N, the number of modifications allowed in any interpretation, is limited to the value of N = 1. This means that after the first modification has been made in a possible interpretation, the remainder of the input string must be acceptable without any changes for the reparse attempt to succeed. This removes the recursion requirement for the MODIFY routine immediately. It also allows the REPARSE routine to be written without recursion, since then its only responsibility is to simply check whether any new errors are signalled if the parser is started on the remainder of the input string.
If no error is signalled, the reparsing succeeds; otherwise the single modification that was made did not correct the string and REPARSE must return to MODIFY so that a different modification can be tried.

The only disappointment has been the fairly slow response time of the diagnostic system when another suggestion is requested. Since determining suggestions requires examining the parser's table and repeatedly reparsing the modified input string, the diagnostic system is generally compute-bound. Unfortunately, the PLATO IV system is designed for input/output bound programs with time for relatively few calculations.

The response time problem is not limited to the diagnostic system, however; the entire compiler system exceeds the currently suggested cpu usage limits of 2 - 4 ms/sec. The most recent timing estimates indicate that the overall compiler runs very well at 7 - 8 ms/sec, and marginally well at 5 - 6 ms/sec. In comparison, the diagnostic system runs very well at 10 - 12 ms/sec, and acceptably at about 8 - 10 ms/sec; the performance rapidly deteriorates below these levels, however, and subjective tests indicate that the diagnostic system is too frustrating and slow to be useful with a cpu limit below 5 - 6 ms/sec.

An important observation about the operation of the diagnostic system is that the initial suggestion messages that are given by the system are generally quite easy to determine and compute since very little context is examined (i.e., only the actual token that originally caused the error to be signalled). Therefore, the first few "possible correction" messages can be computed for the programmer with a small cpu limit (around 4 - 5 ms/sec), so that at least some information is available even when the PLATO IV system is fairly heavily loaded. It is mostly when suggestions concerning modifications farther back in the input string must be computed and PLATO IV is heavily loaded that intolerable response delays may occur.
6.2.2 Effectiveness

At the present time, because of the relatively slow response times mentioned above, the diagnostic system has not been used by elementary programmers; therefore it is impossible to report on the actual demonstrated effectiveness of the system. It is possible, however, to compare a few of the general characteristics of the consulting system with other diagnostic systems.

The diagnostic systems for the PLATO IV compilers were originally designed as a set of ad hoc, hand-coded, one-line error messages, determined by a specific error number that was obtained from the parser's table. One specific error message was displayed upon detection of an error in the user's program, and this message usually consisted of describing the most likely token (in the language implementor's opinion) that would be acceptable at that point in the parse.

The diagnostic system proposed in this thesis is language independent (it does not use any special error numbers, but determines messages by examining the productions in the parser's table), and the first few messages to the programmer always mention the tokens or nonterminals that could be used immediately at the location of the error in the program. These initial messages therefore will usually provide not only the same information that was available using the hand-coded technique, but actually all of the correction possibilities. Since these first few messages are easy to compute, as mentioned previously, the consulting diagnostic system seems to provide at least as effective (frequently more effective) assistance as did the ad hoc systems.

To determine how many immediate correction possibilities there are for a particular error situation, it is necessary to examine the general option tree for the state in which the parser detected the error. A study of the general option trees for various parser states was made to discover some of the characteristics of these trees.
The small subset of PL/I that has been implemented (assignment, DECLARE, GOTO, IF, DO, list I/O, CALL, and RETURN statements) was used for the study. The detailed results are given in Appendix I. It was found that about 65 percent of the trees for the possible parser states contained only one or two option nodes. About 25 percent of the trees contained nonterminal option nodes, and the maximum height of the trees was four (indicating up to three nonterminal option nodes in a tree). Roughly 65 percent of the trees had only one or two token option nodes; however, there were also a number of states with eight to ten token options (primarily states accepting numeric expressions) and eighteen to twenty token options (for the beginning of a new statement).

In interpreting this information, it is important to remember that the statistics are a measure of the trees for most of the possible parser states that can occur in the PL/I subset. However, in any actual PL/I program, some of the possible states occur much more frequently than others. In particular, states corresponding to the start of a new statement or states accepting numeric expressions occur frequently. Therefore, the trees for these frequently-encountered states will be used most often. Since the trees for statements and expressions are among the largest and highest for PL/I, the consulting system also has the most information available for these states.

If the consulting system is compared to the system of Levy [Lev., '71], it is interesting to note that near the original error token the consulting system can provide outstanding suggestions. Not only are simple "token" modifications suggested (as is done by Levy), but also appropriate nonterminal modifications are given. Furthermore, these nonterminal messages are then successively refined down to the actual token level, with the programmer having some control over this refinement process.
These suggestions seem to provide much better assistance to a programmer than just the simple token messages provided by Levy.

As suggestions are determined about modifications farther back in the input string, however, the two systems appear to produce quite similar suggestions. These suggestions concern only simple token modifications; the consulting system is seldom able to make suggestions involving nonterminals farther back in the input. This can be explained by realizing that acceptable strings in a programming language are usually syntactically "far apart" or dissimilar. If a nonterminal is recognized in the parsing of a string, there is a high probability that that nonterminal was intended by the programmer and that it is formed and located correctly. Therefore, if an error is encountered to the right of a recognized nonterminal, it will seldom be possible to modify that nonterminal and thereby obtain a legal interpretation. However, modifying a single token before the nonterminal may cause the tokens comprising the original nonterminal to be reparsed differently, yielding a legal interpretation.

6.3 Future Research

There are a number of areas that can be suggested for future research. One project is to continue to work on and optimize the TUTOR-coded implementation on PLATO IV. The implemented system would be much more useful if it were a factor of two or three times more efficient. Work is already underway by the PLATO systems staff to modify some of the TUTOR commands to allow a much more efficient implementation. There is no doubt, also, that the current program code could be improved substantially.

Another area of research that definitely could be approached is to actually test the usefulness and effectiveness of the diagnostic system with programmers. The system could be statistically compared with other existing systems to determine the relative effectiveness of the systems for elementary programmers.
It would also be yery interesting to determine the usefulness of the consulting system for advanced programmers. A final research suggestion is to look at the problem of explaining the effect of a "possible correction" suggestion on the program. The proposed system is very good at describing different ways to correct the program, but it relies on the programmer to examine the suggestions and determine what the ramifications of making a particular correction will be on the program, Further work on the characterization and description of context-sensitive semantic requirements would also be very useful in attempting to explain the effect of a particular correction on a program. 6.4 Conclusion The use of interactive computing systems is steadily increasing in the computing world. It is anticipated that inter- active, incremental compilers will become standard features of future systems. This thesis has described one possible approach to the handling of diagnostic assistance in this environment. It is hoped that this work will help stimulate research in this area. 103 LIST OF REFERENCES [Aho., '72] : Aho, A. V. and Ullman, J. D. , The Theory of Parsing Translation, and Compiling , Volume 1, Prentice-Hall, Inc., 542 pp., 1972. [Alp., '70] : Alpert, D. and Bitzer, D., "Advances in Computer Based Education," Science , Vol. 167 (20 March, 1970), pp. 1582-1590. [Con., '63] : Conway, M. , "Design of a Separable Transition-Diagram Compiler," CACM, Vol. 6 (July 1963), pp. 396-408. [Con., '73] : Conway, M. , and Wilcox, T., "Design and Implementa- tion of a Diagnostic Compiler for PL/I," CACM, Vol. 16, No. 3 (March 1973), pp. 169-179. [Cre., '70] : Cress, P., Dirksen, P., and Graham, J., Fortran IV with Watfor and Watfiv , Prentice-Hall, Inc., 1970. [Dav., '75] : Davis, A., An Interactive Analysis System For Execution-time Errors , Ph.D thesis, Department of Computer Science, University of Illinois, January 1975, Report # UIUCDCS-R-75-695. [DeR., '71] : DeRemer, F. 
L., "Simple LR(k) Grammars," CACM, Vol. 14, No. 7 (July 1971), pp. 453-460.

[Ela., '75]: Eland, D., Forthcoming Ph.D. thesis on the GUIDE information-retrieval system, to be published summer 1975.

[Flo., '63]: Floyd, R. W., "Syntactic Analysis and Operator Precedence," JACM, Vol. 10 (July 1963), pp. 316-333.

[Gra., '73]: Graham, S. and Rhodes, S., "Practical Syntactic Error Recovery in Compilers," Conference Record of the ACM Symposium on the Principles of Programming Languages, Boston, Mass., October 1973, pp. 52-58.

[Gri., '74]: Griffiths, M., "LL(1) Grammars and Analyzers," Compiler Construction, Bauer, F. and Eickel, J. (eds), Springer-Verlag, New York, 1974, pp. 57-83.

[Iro., '63]: Irons, E., "An Error Correcting Parse Algorithm," CACM, Vol. 6 (Nov. 1963), pp. 669-673.

[JaE., '73]: James, E., and Partridge, D., "Adaptive Correction of Program Statements," CACM, Vol. 16, No. 1 (Jan. 1973), pp. 27-37.

[JaL., '72]: James, L., A Syntax-directed Error Recovery Method, Masters thesis, Computer Systems Research Group, University of Toronto, May 1972, # CSRG-13.

[LaF., '70]: LaFrance, J., "Optimization of Error Recovery in Syntax-directed Parsing Algorithms," Sigplan Notices, Vol. 5 (Dec. 1970), pp. 2-17.

[LaF., '71]: LaFrance, J., Syntax-directed Error Recovery for Compilers, Ph.D. thesis, Department of Computer Science, University of Illinois, 1971, # 459.

[Lei., '70]: Leinius, R., Error Detection and Recovery for Syntax Directed Compiler Systems, Ph.D. thesis, Computer Science Department, University of Wisconsin, 1970.

[Lev., '71]: Levy, J., Automatic Correction of Syntax Errors in Programming Languages, Ph.D. thesis, Computer Science Department, Cornell University, Dec. 1971, Technical Report # TR 71-116.

[Lom., '73]: Lomet, D., "A Formalization of Transition-diagram Systems," JACM, Vol. 20, No. 2 (April 1973), pp. 235-257.

[Lyo., '74]: Lyon, G., "Syntax-directed Least-errors Analysis for Context-free Languages: A Practical Approach," CACM, Vol.
17, No. 1 (January 1974), pp. 3-14.

[Nie., '74]: Nievergelt, J., Reingold, E., and Wilcox, T., "The Automation of Introductory Computer Science Courses," A. Gunther, et al. (eds), International Computing Symposium 1973, North-Holland Publishing Co., 1974.

[Rho., '73]: Rhodes, S., Practical Syntactic Error Recovery for Programming Languages, Ph.D. thesis, Department of Computer Science, University of California at Berkeley, June 1973, Technical Report 15.

[She., '74]: Sherwood, B., The TUTOR Language, Computer-based Education Research Laboratory and Department of Physics, University of Illinois, Urbana, Illinois, 1974.

[Tin., '75]: Tindall, M., An Interactive Table-driven Parser System, Masters thesis, Department of Computer Science, University of Illinois, August 1975.

[Wil., '73]: Wilcox, T., "The Interactive Compiler as a Consultant in the Computer Aided Instruction of Programming," Proceedings of the Seventh Annual Princeton Conference in Information Sciences and Systems, March 1973.

[Wir., '66]: Wirth, N., and Weber, H., "EULER-A Generalization of ALGOL and its Formal Definition," CACM, Vol. 9, No. 1 (January 1966), pp. 13-25, and Vol. 9, No. 2 (February 1966), pp. 89-99.

APPENDIX I

GENERAL OPTION TREE EXAMPLES

This appendix contains the results of a study of the size and shape of the general option tree for many of the possible parsing states. The analysis was performed on the small subset of the PL/I language that has been implemented on the PLATO IV system.

In each case, the information is presented as a particular input string and the corresponding option tree that is acceptable at that point in the parse of that string. Obviously, only some of the possible input strings could be examined, but a representative sample was selected and analyzed. In all, a total of 61 possible input strings were examined.
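The "size and shape" of an option tree can be made concrete with a small sketch. The Python below is an illustration only: the `OptionNode` class and the `leaf_count` and `depth` helpers are hypothetical names, not part of the PLATO IV implementation, and the sample tree assumes that the options acceptable after the input `T : PROC RETURNS (` are the four attribute keywords listed for that input in this appendix.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OptionNode:
    """One node of an option tree (hypothetical structure, for illustration)."""
    label: str
    children: List["OptionNode"] = field(default_factory=list)


def leaf_count(node: OptionNode) -> int:
    """Number of leaves at or below this node (one measure of tree size)."""
    if not node.children:
        return 1
    return sum(leaf_count(child) for child in node.children)


def depth(node: OptionNode) -> int:
    """Number of edges on the longest path from this node down to a leaf."""
    if not node.children:
        return 0
    return 1 + max(depth(child) for child in node.children)


# Assumed option tree for the input "T : PROC RETURNS (".
tree = OptionNode("options", [
    OptionNode("FIXED"),
    OptionNode("FLOAT"),
    OptionNode("CHARACTER"),
    OptionNode("CHAR"),
])

print(leaf_count(tree), depth(tree))  # prints: 4 1
```

A nonterminal such as a numeric expression would appear in this representation as an interior node whose children are its own options, which is why the appendix abbreviates the two largest subtrees.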
To simplify the form of the option trees, two special subtrees will be used: one for the start of a numeric expression and the other for the start of a new statement. Whenever an expression or a statement is specified as a nonterminal in an option tree, the respective special subtree will be indicated.

Special Subtrees

[Option tree diagram: Numeric Expression = expr. Legible node labels: ID (simple, declared, numeric); Numeric Constant; Array (declared, numeric); +; SIN ... LOG; LENGTH ... INDEX.]

[Option tree diagram: Statement = Stmt. Legible node labels: ID (numeric or character); DECLARE; DCL; GO; GOTO; IF; STOP; DO; PUT; GET; CALL; ENTRY; RETURN.]

INPUT / OPTION TREE

[Option tree diagrams. Legible inputs: TESTIO; TESTIO:; T : PROC; T : PROC (; T : PROC (A,; T : PROC (A); T : PROC OPTIONS; T : PROC OPTIONS (; T : PROC OPTIONS (MAIN; T : PROC OPTIONS (MAIN); T : PROC RETURNS; T : PROC RETURNS (; T : PROC RETURNS (CHAR); T : PROC ;. Legible options: ID undeclared; PROCEDURE; PROC; OPTIONS; RETURNS; MAIN; FIXED; FLOAT; CHARACTER; CHAR; END; Stmt.]

ASSIGNMENT STATEMENTS

[Option tree diagrams. Legible inputs: DCL A(10); A; DCL A(10); A (5); A = 10. Legible options: ID numeric or character; ID numeric; ID character; String Array; "SUBSTR" (; char Constant; numeric binary operator.]

LABELS

[Option tree diagram. Legible options: ID (label reference or undeclared).]

DECLARE STATEMENTS

[Option tree diagrams. Legible inputs: DCL; DCL (; DCL (A,; DCL (A(; DCL (A). Legible options: ID undeclared; FIXED; FLOAT; CHARACTER; CHAR; array bound; expr.]

GO Statements

[Option tree diagrams. Legible inputs: GO; GOTO; GOTO LI. Legible options: TO; ID undeclared; ID label; ID label-reference.]

IF Statements

[Option tree diagrams. Legible inputs: IF; IF I = 1; IF I = 1 THEN; IF I = 1 THEN STOP;. Legible options: binary operator; conditional operator; THEN; ELSE; END.]

STOP Statement

[Option tree diagram. Legible input: STOP.]

RETURN Statement

[Option tree diagrams. Legible inputs: RETURN; RETURN (.]

DO Statement

[Option tree diagrams. Legible inputs: DO; DO I; DO I =; DO I = 1; DO WHILE; DO WHILE (. Legible options: ID numeric; Array numeric; WHILE; binary operator; TO; BY.]

PUT Statement

[Option tree diagrams. Legible inputs: PUT; PUT SKIP; PUT PAGE; PUT LIST; PUT LIST (; PUT LIST (I; PUT LIST (I). Legible options: SKIP; PAGE; LIST; ID, Array, or Constant (numeric or character); Numeric Built-in Function SIN ... LOG; INDEX ... SUBSTR; binary operator.]

GET Statement

[Option tree diagrams. Legible inputs: GET; GET LIST; GET LIST (; GET LIST (I. Legible options: LIST; ID; Array.]

CALL Statement

[Option tree diagrams. Legible inputs: CALL; CALL B; CALL B (; CALL B (I; CALL B (I). Legible options: ID undeclared; ID external; expr; binary operator.]

APPENDIX II

EXAMPLES OF "POSSIBLE CORRECTION" SUGGESTIONS FOR PL/I

This appendix contains a number of examples of the "possible correction" suggestions that are generated by the diagnostic system that has been implemented on PLATO IV. All of the examples are taken from the subset of the PL/I language that is in use by the interactive computer. Each example is presented as a number of consecutive diagrams or "frames." The person using the diagnostic system would actually see only one frame at a time; each successive frame would be displayed as the user pressed the appropriate "HELP" or "Shifted-HELP" key. The different frames for a particular error situation are presented on the same page in this appendix only as a convenience to the reader.
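The frame-by-frame interaction described above can be sketched as a small state machine. The Python below is a minimal illustration, not the TUTOR implementation: the `Suggestion` and `FrameBrowser` names are hypothetical, the messages are modeled on the first example of this appendix, and the behavior when no further suggestion exists (the frame simply stays put) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Suggestion:
    """A "possible correction" plus any concrete instances of it."""
    text: str
    examples: List[str] = field(default_factory=list)


class FrameBrowser:
    """Steps through diagnostic frames one at a time (hypothetical sketch)."""

    def __init__(self, error: str, suggestions: List[Suggestion]) -> None:
        self.error = error
        self.suggestions = suggestions
        self.s = -1  # index of current suggestion; -1 means the ERROR frame
        self.e = -1  # index of current example; -1 means the general form

    def frame(self) -> str:
        """Text of the frame currently on display."""
        if self.s < 0:
            return self.error
        sug = self.suggestions[self.s]
        if self.e < 0:
            return sug.text
        return f'{sug.text} "{sug.examples[self.e]}"'

    def help(self) -> str:
        """HELP key: move to a different suggestion, if one remains."""
        if self.s + 1 < len(self.suggestions):
            self.s += 1
            self.e = -1
        return self.frame()

    def shift_help(self) -> str:
        """Shifted-HELP key: show the next concrete instance, if any."""
        if self.s >= 0 and self.e + 1 < len(self.suggestions[self.s].examples):
            self.e += 1
        return self.frame()


browser = FrameBrowser(
    "This undeclared variable is not permitted here.",
    [
        Suggestion("Replace [ ] with an attribute", ["DECIMAL", "DEC"]),
        Suggestion('Insert "," in front of [ ]'),
        Suggestion('Insert ";" in front of [ ]'),
    ],
)
print(browser.frame())
print(browser.help())
print(browser.shift_help())
```

Keeping the two indices separate mirrors the two keys: HELP moves across alternative corrections, while Shifted-HELP moves down into instances of the current one.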
FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 284
A: PROC; DCL I FIXED [ ]

********** ERROR **********   (BACK) or (ERASE) to fix.
This undeclared variable is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an attribute.
Press (HELP) for a different suggestion.
Press (shift-HELP) to see a legal attribute.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an attribute "DECIMAL".
Press (HELP) for a different suggestion.
Press (shift-HELP) to see another legal attribute.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an attribute "DEC".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert "," in front of [ ].
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert ";" in front of [ ].
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with [illegible].
Press (HELP) for a different suggestion.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 283
B: PROC; DCL I, A( [ ]

********** ERROR **********   (BACK) or (ERASE) to fix.
This punctuation symbol is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert an array bound in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine an array bound.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a numeric expression in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine a numeric expression.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a numeric operand in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine a numeric operand.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a declared variable (numeric) in front of [ ].
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a numeric constant in front of [ ].
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a declared array (numeric).
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator.
Press (HELP) for a different suggestion.
Press (shift-HELP) to see a legal arithmetic operator.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator "+".
Press (HELP) for a different suggestion.
Press (shift-HELP) to see another legal arithmetic operator.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator "-".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "(".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a numeric built-in function.
Press (HELP) for a different suggestion.
Press (shift-HELP) to see a legal numeric built-in function.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a numeric built-in function "ABS".
Press (HELP) for a different suggestion.
Press (shift-HELP) to see another legal numeric built-in function.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a string built-in function.
Press (HELP) for a different suggestion.
Press (shift-HELP) to see a legal string built-in function.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a string built-in function "LENGTH".
Press (HELP) for a different suggestion.
Press (shift-HELP) to see another legal string built-in function.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 262
C: PROC; DCL I, A(10) DEC; GET LIST (A); I = A [;]

********** ERROR **********   (BACK) or (ERASE) to fix.
This punctuation symbol is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a subscript list in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine a subscript list.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "(".
Press (HELP) for a different suggestion.

C: PROC; DCL I, A(10) DEC; GET LIST (A); I = [A] ;

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a numeric expression.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a numeric operand.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a declared variable (numeric).
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a numeric constant.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 264
D: PROC; DCL I, A(10); GET LIST (A); I = A( [ ]

********** ERROR **********   (BACK) or (ERASE) to fix.
This punctuation symbol is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a numeric expression in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine a numeric expression.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a numeric operand in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine a numeric operand.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 262
E: PROC; DCL I, A(10); GET LIST (A); I = A(1 [ ]

********** ERROR **********   (BACK) or (ERASE) to fix.
This punctuation symbol is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert ")" in front of [ ].


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 258
F: PROC; DCL I, A(5,10); GET LIST (A); I = A (1 [)]

********** ERROR **********   (BACK) or (ERASE) to fix.
This punctuation symbol is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with ",".
Press (HELP) for a different suggestion.

F: PROC; DCL I, A(5,10); GET LIST (A); I = A ([1])

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert "(" in front of [ ].


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 254
G: PROC; DCL A(10), D(10,10), C; DO C=1 TO 10; A(C) = D [*]

********** ERROR **********   (BACK) or (ERASE) to fix.
This arithmetic operator is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a subscript list in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine a subscript list.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "(".
Press (HELP) for a different suggestion.

G: PROC; DCL A(10), D(10,10), C; DO C=1 TO 10; A(C) = [D] *

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a numeric operand.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a declared variable (numeric).
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a numeric constant.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 278
H: PROC; DCL J CHAR(1); DO [ ]

********** ERROR **********   (BACK) or (ERASE) to fix.
This declared variable is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a declared variable (numeric).
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a declared array (numeric).
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "WHILE".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert ";" in front of [ ].
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a statement.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 267
I: PROC; DCL A,B; GET LIST (A,B); IF A+B [THEN]

********** ERROR **********   (BACK) or (ERASE) to fix.
This reserved word is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a relational operator.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a relational operator in front of [ ].
Press (HELP) for a different suggestion.

I: PROC; DCL A,B; GET LIST (A,B); IF [A+B] THEN

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a conditional expression.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 275
J: PROC; DCL I; DO I=1 TO 10 [UNTIL]

********** ERROR **********   (BACK) or (ERASE) to fix.
This undeclared variable is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "BY".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "WHILE".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert ";" in front of [ ].


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 283
K: PROC; DCL I; FOR [I]

********** ERROR **********   (BACK) or (ERASE) to fix.
This declared variable is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert [illegible] in front of [ ].
Press (HELP) for a different suggestion.

K: PROC; DCL I; [FOR] I

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a statement.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a reserved word "IF".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with a reserved word "DO".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Remove [ ] from the program.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 277
L: PROC; DCL I FIXED; I = 1 [TO]

********** ERROR **********   (BACK) or (ERASE) to fix.
This reserved word is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an arithmetic operator.
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with ";".
Press (HELP) for a different suggestion.

L: PROC; DCL I FIXED; [I] = 1 TO

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a reserved word "DO" in front of [ ].


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 281
M: PROC; DCL I FIXED; DCL [M]

********** ERROR **********   (BACK) or (ERASE) to fix.
This entry name is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an identifier declaration.
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine an identifier declaration.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "(".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with an undeclared variable.
Press (HELP) for a different suggestion.

M: PROC; DCL I FIXED; [DCL] M

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "END".


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 279
N: PROC; DCL I,A; PUT LIST [;]

********** ERROR **********   (BACK) or (ERASE) to fix.
This punctuation symbol is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "(".
Press (HELP) for a different suggestion.

N: PROC; DCL I,A; PUT [LIST] ;

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "SKIP".
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Replace [ ] with "PAGE".


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 274
O: PROC; DCL I,J; PUT LIST ('START OF PROGRAM', [)]

********** ERROR **********   (BACK) or (ERASE) to fix.
This punctuation symbol is not permitted here.
Press (HELP) for more information.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert a declared array in front of [ ].
Press (HELP) for a different suggestion.

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Insert an expression in front of [ ].
Press (HELP) for a different suggestion.
Press (shift-HELP) to examine an expression.

O: PROC; DCL I,J; PUT LIST ('START OF PROGRAM' [,] )

********** Possible Correction **********   (BACK) or (ERASE) to fix.
Remove [ ] from the program.


FILE = workspace   PL/I WORKSPACE (14-new)   SPACE = 272