 UIUCDCS-R-75-753 
 
 PATTIE: AN AUTOMATED TUTOR 
 FOR TOP-DOWN PROGRAMMING 
 
 by 
 
 Ronald Lee Danielson 
 
 October, 1975 
 
 Department of Computer Science 
 
 University of Illinois at Urbana-Champaign 
 
 Urbana, Illinois 61801 
 
 This work was supported in part by the National Science Foundation 
 under grant EC-41511 and was submitted in partial fulfillment of the 
 requirements for the degree of Doctor of Philosophy in Computer Science 
 at the University of Illinois at Urbana-Champaign, 1975. 
 
 
 ACKNOWLEDGMENT 
 
 The author would like to express his deep appreciation to his 
 advisor, Professor Jurg Nievergelt, for his advice, interest, and insight, 
 which were not restricted to the scope of this thesis. 
 
 He would also like to thank the members of his thesis committee 
 for their interest and comments: Professors George Friedman, Rich Montanelli, 
 Sylvian Ray, Dave Waltz, and Dan Watanabe. Also, Professor Fred Hansen 
 initially suggested several of the future research ideas appearing in 
 Chapter 5. 
 
 Fellow students Al Davis, Dave Eland, Dave Embley, Janie Irwin, 
 Prabhaker Mateti, and Mike Tindall provided valuable comments and functioned 
 as pseudo-willing guinea pigs. 
 
 Thanks are due to Connie Slovak for an outstanding job of typing, 
 and to the Computer Science Department and the National Science Foundation 
 for support. 
 
 Finally, thanks to Joani for encouragement and enthusiasm for a 
 long time. 
 
 
 TABLE OF CONTENTS 
 
Chapter

1. INTRODUCTION
1.1 Motivation for a Program Development Tutor
1.2 Teaching about Structured Programming
1.3 A Sample Dialog
1.4 PATTIE's Environment - PLATO IV
1.5 Research Problems Posed by PATTIE

2. RELEVANT PREVIOUS RESEARCH
2.1 CAI Tutorial Systems
2.2 Natural Language Processing Methods
2.2.1 Pattern Matching
2.2.2 Linguistic Analysis
2.2.3 Hybrid Approaches

3. GETTING TO KNOW PATTIE
3.1 Introduction
3.2 Relation to Other Tutors
3.3 What PATTIE Knows - The Data Bases
3.3.1 The Solution Graph
3.3.1.1 AND-OR Graphs
3.3.1.2 Special Features
3.3.1.3 Advantages
3.3.2 Natural Language Understanding Features
3.3.3 The Student Model
3.4 How PATTIE Behaves - The Interaction
3.4.1 What the Student Sees - The Screen Display
3.4.2 Interaction - The Ideal Case
3.4.3 Handling Errors
3.4.3.1 Lookahead
3.4.3.2 Hint Structure

4. SENTENCE AND DIALOG PROCESSING
4.1 Characterizing Natural Language Communication
4.2 Dialog Processing Methods
4.3 Sentence Processing
4.4 Understanding Problem Solving Dialogs

5. SUMMARY AND CONCLUSION
5.1 Summary
5.2 Recommendations for Further Work
5.3 Conclusion

LIST OF REFERENCES

APPENDIX

VITA
 
 
 LIST OF FIGURES 
 
Figure

2.1 Sample problem and partial flowchart from MALT
2.2 PARRY2's pattern matching
2.3 A simple transition network grammar
2.4 Flow diagrams for a simple program grammar (from Winograd (1974))
2.5 BNF description of part of a semantic grammar
3.1 Sample AND-OR graph
3.2 Refinement as an AND-OR graph
3.3 Partial solution graph
3.4 Programming/problem solving concepts
3.5 Refinement graph special features
3.6 Sample vocabulary and meaning lists
3.7 Design trees (from Ells and Freeman (1973))
3.8 The student's screen display
3.9 Flow diagram of interaction control routine
3.10 Screen dynamics at an OR node
3.11 Screen dynamics at an AND node
3.12 Ordering clauses in an IF statement
3.13 Need for additional branches
3.14 Correct matches during lookahead
3.15 Inserting additional OR nodes
3.16 Sample node with error facilities
4.1 Anticipation of next unit of communication from preceding dialog
4.2 Anticipation of inputs
4.3 Problem solving protocol (from Chapanis (1975))
A.1 High-level control flow
A.2 Data usage
A.3 Node accessing scheme
A.4 Node header format
A.5 Branch header format
 
1. INTRODUCTION 
 
 1.1 Motivation for a Program Development Tutor 
 
 Computers are becoming more and more common items in society, and 
 more and more people are required to interact with them on a daily basis. 
 Consequently, many educational institutions are concerned with instilling 
 a certain degree of "computer literacy" in their students, as an essential 
 part of the educational experience. This concern is reflected in rapidly 
 increasing enrollments in introductory programming courses, which in turn 
 places increased emphasis on techniques for teaching programming skills. 
 
 Programming is essentially a problem solving activity. Yet the 
 typical introductory programming course concentrates on the syntax of a 
 particular language and a few more or less relevant examples, and says 
 nothing about how to develop a program which solves a given problem (Gries 
 (1974)). This aspect of programming has been left to the student's trial 
and error experimentation, with predictably poor results. Students' programs
 are often wrong when looked at as a whole, yet each small segment is correct 
 when considered independently. Students simply don't know how to put together 
 complete programs; they've learned what they've been taught, but have been 
 taught the wrong things (Snark (1972)). 
 
 One reason that beginning students have seldom been taught how to 
 develop a program is that, until recently, there has not been a generally 
 accepted method of program development. However, beginning with the work 
of Dijkstra (1970), an approach to program development has emerged which
eases the problems of constructing correct programs. This approach has
 come to be known as "structured programming." 
 
 The increasing numbers of students who want to learn programming 
 skills can be accommodated via the medium of computer-assisted instruction 
 (CAI), as demonstrated by the Automated Computer Science Education System 
 (ACSES, Nievergelt and Reingold (1973)), implemented by the Department of 
 Computer Science at the University of Illinois on the PLATO IV CAI system 
 (Alpert and Bitzer (1970)). However, as the preceding comments indicate, 
 if the products of such a system are to be effective programmers, some 
 provision must be made to teach them about program development techniques 
 (i.e., about structured programming). What means should be used to explain 
 the program development process? 
 
 1.2 Teaching about Structured Programming 
 
 The two basic tenets of structured programming are the use of 
restricted control structures (e.g., IF..THEN..ELSE, DO WHILE) and the use
 of the top-down programming method, or successive refinement (Wirth (1971)). 
 The restricted control structures impart a logical, easily readable 
 structure to the final program; top-down programming provides a means for 
 the programmer to restrict the scope of the problem he must solve to a 
 manageable level. 
 
 The principal aid in this restriction of scope is the use of 
 levels of abstraction. Successive refinement begins with an abstract 
 description of the task to be accomplished. This task is then refined, 
 that is, described as a series of slightly more specific tasks which, when 
 combined, solve the problem. Each of these tasks at this second level is 
refined in turn, producing a third level of task descriptions, and the
process continues until tasks have been described in sufficient detail
to be easily translated into programming language statements. Task
descriptions commonly employ a mixture of natural language and programming
language statements, which allows much of the complexity of the programming
language to be ignored until needed. The successive levels of task
descriptions allow the programmer to concentrate most of his attention
on the task he is currently refining, and yet be sure of the proper
integration of that task with the whole solution.

The approach which has commonly been used to explain structured
programming, both in texts and in the classroom, is to display two
completed programs which solve the same problem, and assert that one was
developed using structured programming and the other was not. Comparison
of the two programs can point up the difference in program appearance
imparted through the use of restricted control structures, but it can
relate nothing about the other aspect of structured programming, the process
of stepwise refinement used in program development. As Denning (1974) says:
 
 It is not sufficient to present the end product of 
 the program development process and expect the 
 beholder to perceive its structure by inspection, 
 or even by deep meditation. Instead, the beholder 
 must also be able to see at least part of the 
 programmer's thought processes, starting from the 
 original (very abstract) version, and proceeding to 
 the end product through a clearly presented sequence 
 of clear transformations and refinements. 
 
Unfortunately, it is very difficult to adequately explain such a
detailed example to a large class of beginning programmers. A near-ideal
situation would be to assign each student an individual tutor who could
explain stepwise refinement by actually helping the student develop programs
in a top-down manner. Few beginning programming courses are blessed with
enough instructional personnel to make the ideal a reality.
 
In the environment of a highly interactive CAI system, however, 
 providing each student with a knowledgeable, personal tutor to help him 
 solve problems in a top-down manner seems to be a reasonable goal. The 
 remainder of this thesis discusses the design of PATTIE (Programmed Aid 
 for Teaching Top-down programming by Interactive Example), an automatic 
 tutor of top-down programming implemented as part of the ACSES system. 
 PATTIE mimics the action of a human tutor, in that she presents the student 
 with a programming problem to solve, and then monitors his attempts at 
 solving that problem via a top-down approach. During the course of 
 developing a solution, PATTIE and the student engage in an interactive 
 dialog, in which the student suggests refinements that are needed, and 
 PATTIE judges the correctness of those refinements and provides hints and 
 comments where necessary. Let's look at a portion of such a dialog which 
demonstrates some of PATTIE's capabilities.
 
 1.3 A Sample Dialog 
 
 The problem being worked on in this dialog is to design a PL/1 
 subroutine which will return the symbolic derivative of a single term of 
 a polynomial of two variables. The term is passed to the subroutine as a 
 character string of the form 
 
coefficient*variable**exponent*variable**exponent
 
 and the variable to differentiate by is passed as a second parameter. The 
 state of the solution program at the beginning of the dialog is 
 
 DIFTERM: PROC(T,Z); 
 
 DCL (T,Z) CHAR; 
 
 differentiate T with respect to Z 
 END DIFTERM; 
 
In solving the problem, the student tries to break the initial task into 
 a series of increasingly more detailed actions. Comments on the dialog 
 are enclosed in square brackets []. 
 
 PATTIE: Tell me how you intend to proceed. 
 
 [The student is allowed three chances to 
 correctly describe the refinements of any 
 given task. PATTIE has a series of three 
 standard phrases which she shows the student 
 to prompt entry of a refinement. This is the 
 first and most general phrase. The succeeding 
 prompts provide very general hints about what 
 inputs PATTIE expects. The student may respond 
 to a prompt by suggesting a refinement, or ask 
 for help by typing something like] 
 
 Student: I don't know 
 
 P: It might help to look for distinct subcases. 
 
 [This is one of 16 general hints which may be 
associated with a task in PATTIE's data base.
 The hints are based on problem solving concepts 
 (such as this one) or concepts inherent in the 
 PL/1 language (loops, recursion, subroutines, 
 etc.) and provide more explicit information to 
 the student than the standard prompt sequence] 
 
 S: Maybe I should look for special cases. 
 
 P: OK 
 
 [This indeed was one of the refinements 
 PATTIE was looking for] 
 
 
 Do you want to give this task an identifying letter? 
 
 [Each task description is assigned a "task 
 name". The initial task in each procedure 
 is task 0; subsequent refinements are 
 assigned a name which indicates the relation 
 of that task to other tasks described during 
 the refinement process. The student can add 
 an additional letter as a prefix to this name 
 if he desires] 
 
 S: Yes 
 
 P: What letter? 
 
 S: s 
 
 P: OK 
 
This is task s1.
 
 What else must be done to refine task 0? 
 
[If more than one subtask is required to
refine the task being worked on, the student
must describe all of them before he may refine
one of them further. This ensures he is always
aware of all actions needed to achieve the
solution.]
 
 S: Perform the differentiation in general. 
 
 P: OK 
 
[To avoid repetition, we will assume the
student simply allows PATTIE to assign task
names, and denote the naming sequence
interchange by]

(task name sequence - 2)

Now let's consider task s1.
 
 Tell me how you intend to proceed. 
 
 [The standard prompting sequence begins 
 again as each task becomes the focus of the 
 refinement process] 
 S: See if the term equals zero. 
 P: You can assume T will never equal zero. 
 
[In addition to the 16 general hints, it is
 possible to associate explicit hints with any 
 task in PATTIE's data base. Such explicit 
 hints may be keyed to erroneous refinements 
 which could be suggested for that task, as in 
 this example] 
 S: See if Z is contained in the character string T 
 P: OK 
 
(task name sequence - s1.1)

[Notice task names are formed by adding a
suffix to the name of the task being refined]
What else must be done to refine task s1?
S: If not, return an answer of 0.
P: OK

(task name sequence - s1.2)
Now let's consider task s1.1.
 Tell me how you intend to proceed. 
 S: I want to declare a character variable 
 
 
 P: OK 
 
 DCL NDXZ CHAR; 
 
 [PATTIE displays all PL/1 statements for 
 the student once the corresponding task 
 descriptions are sufficiently exact. As a 
 result, the student must ask for variables 
 to be declared. If PATTIE agrees that the 
 description of the task being refined 
 indicates a need for a variable, a PL/1 
 declaration statement is inserted in the 
 developing program, with PATTIE assigning 
 the variable name. If not, the student's 
 suggestion is rejected] 
 
What else must be done to refine task s1.1?
S: Save the index of Z in NDXZ

[The student may use the names of declared
variables in subsequent refinements]
 P: OK 
 
 NDXZ=INDEX(T,Z); 
 
Task s1.1 is completed.

[Once a task has been completed (converted
to PL/1), the refinement process focuses in turn
on other tasks at the same level]

Task s1.2 is ready to be written in PL/1.

IF NDXZ=0

THEN RETURN('0');

Task s1 is now completed.
 
 [A display sequence such as the above 
 requires no refinement descriptions from 
 the student. However, the student is able to 
 control the pace at which these changes are 
 made to his solution, since after each statement 
 PATTIE pauses and asks the student to press a 
 control key when he is ready to continue] 
 
 Let's consider task 2. 
 
 [When all tasks at a given level have been 
 completed, refinement continues with the next 
 task on the next higher level] 
 
 Tell me how you intend to proceed. 
 S: Assign the location of the * after the exponent to a variable 
 
 [At times, the student may suggest a refinement 
 which will be valid later in the solution, but 
 is not applicable to the task he's currently 
 refining. Here, for example, he hasn't indicated 
 how to find the exponent] 
 P: We are refining task 2. 
 
 You have described task 2.3.1.2. 
 
 What might task 2.3 be? 
 
[PATTIE uses task names to describe the
 relation between where the refinement suggested 
 should appear and the task currently being 
 refined, and asks the student to try again] 
 
 
 S: I don't know 
 
 P: Try to break the problem into simpler subproblems. 
 
 [The second prompt in the standard sequence] 
 S: Separate the coefficient from T 
 P: OK 
 
 (task name sequence - 2.1) 
 What else must be done to refine task 2? 
 S: Declare a variable 
 
[This is a necessary part of separating the
 coefficient, but is not one of the refinements 
 still needed for task 2, which is what PATTIE 
 is looking for] 
 P: NO 
 
 My solution breaks this task into 4 subproblems. 
 
[The third standard prompt]
 S: Help 
 
[The student has now used up his three chances
 to describe the refinements needed for the 
 current task. PATTIE now displays the 
 remaining refinements, but one at a time, 
 allowing the student to suggest others if they 
 should occur to him] 
 P: There are still 3 tasks to describe. 
 
 One is save the string between the coefficient and exponent. 
 
 I will call this task 2.2. 
 
 What else must be done to refine task 2? 
 
S: Nothing

[Any wrong response results in the display
of the next subtask]

P: Another is separate the exponent from the tail of the string.
 
 I will call this task 2.3. 
 
[This process continues until either the
student or PATTIE has described all the
 necessary subtasks. Then refinement proceeds 
 with the next task in order; in this case, 
 task 2.1] 
 
At the close of this dialog, the student's partial solution looks like:

        DIFTERM: PROC(T,Z);
        DCL (T,Z) CHAR;
s1.1    DCL NDXZ FIXED;
        NDXZ=INDEX(T,Z);
s1.2    IF NDXZ=0
           THEN RETURN('0');
2.1     separate the coefficient from T
2.2     save the string between the coefficient and exponent
2.3     separate the exponent from the tail of the string
2.4     simplify the pieces and return the answer
        END DIFTERM;
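
To make the goal of the refinement concrete, the following Python sketch carries the same decomposition through to a finished routine. It is only an illustration and not PATTIE's PL/1 solution: the name `dif_term`, the restriction to integer coefficients and exponents, and the way tasks 2.1 to 2.4 are realized are all assumptions made here, and apart from a zero result no simplification is attempted.

```python
def dif_term(t: str, z: str) -> str:
    """Differentiate one term 'coef*var**exp*var**exp' with respect to z.

    Hypothetical sketch: integer coefficient and exponents are assumed.
    """
    # Tasks s1.1 and s1.2: if z does not occur in t, the answer is 0.
    if t.find(z) == -1:
        return "0"

    # Tasks 2.1 to 2.3 (assumed realization): separate the coefficient
    # from the variable**exponent factors.
    coef_text, *factor_texts = t.replace("**", "^").split("*")
    coef = int(coef_text)

    pieces = []
    for factor in factor_texts:
        var, exp_text = factor.split("^")
        exp = int(exp_text)
        if var == z:
            # Power rule: d/dz (c * z**n) = (c * n) * z**(n - 1)
            coef *= exp
            exp -= 1
        pieces.append(f"{var}**{exp}")

    # Task 2.4: reassemble the pieces (only the zero case is simplified).
    if coef == 0:
        return "0"
    return "*".join([str(coef)] + pieces)


print(dif_term("3*X**2*Y**4", "X"))  # 6*X**1*Y**4
print(dif_term("3*X**2*Y**4", "W"))  # 0
```

The early return mirrors tasks s1.1 and s1.2 of the dialog; everything after it corresponds to the still unrefined tasks 2.1 through 2.4.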
 
 
 This dialog has demonstrated various aspects of the inter- 
 action between PATTIE and the student, particularly (1) the existence 
 of multiple levels of hints, (2) use of task names to describe the 
 relationship between refinement tasks, (3) the requirement that all 
 necessary refinements of a given task be described before further refining 
any subtask, and (4) PATTIE's translation for the student of task descriptions
 into PL/1, once those descriptions are sufficiently precise. The reader 
 may find it helpful to refer back to this dialog during the discussion in 
 Section 3.4 of interaction between PATTIE and the student. 
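
The task naming convention seen in the dialog can be sketched in a few lines of Python. This is a reconstruction from the sample dialog only: the function names are hypothetical, and the rule that a student's letter is simply prefixed to the generated name is an assumption consistent with the names s1, s1.1, and 2.3 that appear above.

```python
def child_name(parent: str, index: int, letter: str = "") -> str:
    """Name the index-th refinement of `parent` (the initial task is "0")."""
    base = str(index) if parent == "0" else f"{parent}.{index}"
    return letter + base


def _parts(name: str) -> list[str]:
    # The initial task "0" contributes no components to its descendants' names.
    return [] if name == "0" else name.split(".")


def missing_intermediate(current: str, described: str):
    """If `described` lies more than one level below `current`, return the
    name of the refinement of `current` that must be described first."""
    cur, desc = _parts(current), _parts(described)
    if desc[:len(cur)] != cur or len(desc) <= len(cur) + 1:
        return None
    return ".".join(desc[:len(cur) + 1])


print(child_name("0", 1, letter="s"))        # s1
print(child_name("s1", 2))                   # s1.2
print(missing_intermediate("2", "2.3.1.2"))  # 2.3
```

The last call reproduces the exchange in which the student, while refining task 2, describes task 2.3.1.2 and is asked what task 2.3 might be.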
 
1.4 PATTIE's Environment - PLATO IV
 
 As we briefly mentioned, PATTIE is implemented on the PLATO IV 
 system at the University of Illinois. PLATO IV is a timesharing system, 
 running on a CDC Cyber 73 computer, which is dedicated to CAI use and designed 
 to support up to 1000 simultaneous users. To provide adequate service for 
 such a large number of users, PLATO imposes rigid constraints on instructional 
programs ("lessons"), many of which have influenced PATTIE's design. Briefly,
 these restrictions are: 
 
(1) no lesson may be larger than 10000 60-bit words;

(2) the data bases associated with a lesson may occupy
at most 8000 additional words;

(3) each lesson is restricted to an average of 2 to 5
milliseconds of CPU usage per second; and

(4) programs must be reentrant and data bases may not be
modified, as only one copy of program and data is
maintained for all students using that lesson.
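
To put these limits in more familiar units, the short Python sketch below converts the word counts to 8-bit bytes and the CPU allowance to a fraction of one processor. The conversion to bytes is purely a modern convenience assumed here; the constant and function names are hypothetical.

```python
WORD_BITS = 60          # word size stated in restriction (1)
LESSON_WORDS = 10_000   # restriction (1)
DATA_WORDS = 8_000      # restriction (2)


def words_to_bytes(words: int) -> int:
    """Size of `words` 60-bit words expressed in 8-bit bytes."""
    return words * WORD_BITS // 8


def cpu_share(ms_per_second: float) -> float:
    """Fraction of one CPU represented by a per-second allowance in ms."""
    return ms_per_second / 1000.0


print(words_to_bytes(LESSON_WORDS))  # 75000
print(words_to_bytes(DATA_WORDS))    # 60000
print(f"{cpu_share(2):.1%} to {cpu_share(5):.1%}")  # 0.2% to 0.5%
```

In other words, a lesson and its data bases together fit in roughly 135,000 bytes, and each lesson may use well under one percent of the processor on average.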
 
 
 
 The Appendix contains a short discussion of PATTIE's 
 implementation, including program and data memory requirements, and 
 a description of the data structures used in the program. 
 
 1.5 Research Problems Posed by PATTIE 
 
 The primary motivation behind the development of PATTIE was 
 to provide a practical tool, within the context of an automated 
 instructional system for computer science, which would expose beginning 
 programming students to the concepts of top-down program development. 
 The problems encountered in developing such a tool divide basically into 
 two areas: representation of knowledge and teaching strategy for an 
 automatic tutor. 
 
 Because PATTIE must monitor the process of developing a program, 
 it was necessary to devise a compact representation for acceptable methods 
 of solving a problem, as well as acceptable completed solution programs. 
 The representation developed allows knowledge of both types to be uniformly 
 represented in a single data base, and also permits inclusion of specific 
 responses to anticipated student errors. In addition, a student model was 
 developed to allow PATTIE to adapt to each student's progress. The model 
 provides knowledge of the individual student's performance in relation to 
 the single data base of solutions and solution methods. 
 
 PATTIE's instructional strategy is embodied in the routine which 
 controls the interaction with the student, using the solution knowledge in 
 the data base mentioned above. The current strategy reflects our experience 
 with some 30 beginning students who volunteered to use various versions of 
PATTIE. Some important components of the strategy are multiple levels of
hints, translation of refinement descriptions into PL/1 for the student,
immediate correction of errors, and explicit description and display
of the level structure of stepwise refinement in the developing solution.
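
The escalation from general prompts to outright display of the answer can be sketched as follows. This is a deliberately simplified model and not PATTIE's control routine: exact string comparison stands in for natural language matching, the general and explicit hints of the sample dialog are left out, and the function name `tutor_task` is hypothetical. The three prompt texts are taken from the sample dialog in Section 1.3.

```python
STANDARD_PROMPTS = [
    "Tell me how you intend to proceed.",
    "Try to break the problem into simpler subproblems.",
    "My solution breaks this task into {n} subproblems.",
]


def tutor_task(expected, responses):
    """Simulate the refinement of one task.

    `expected` lists the refinements still needed; `responses` are the
    student's inputs. Returns the tutor's lines and any refinements that
    were never described.
    """
    remaining = list(expected)
    n = len(expected)
    transcript = [STANDARD_PROMPTS[0].format(n=n)]
    misses = 0
    for response in responses:
        if not remaining:
            break
        if response in remaining:
            remaining.remove(response)
            transcript.append("OK")
        else:
            misses += 1
            if misses < len(STANDARD_PROMPTS):
                # Still within the three chances: show the next prompt.
                transcript.append(STANDARD_PROMPTS[misses].format(n=n))
            else:
                # Chances used up: display one remaining refinement.
                transcript.append(f"One is {remaining.pop(0)}.")
    return transcript, remaining


lines, left = tutor_task(["a", "b", "c"], ["x", "a", "y", "z"])
print(lines[-1])  # One is b.
print(left)       # ['c']
```

Even in this reduced form, the student keeps the initiative: a correct suggestion is accepted at any point, including after PATTIE has begun displaying the remaining refinements.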
 
 In addition to the above, the importance of natural language 
 to the successive refinement process required that PATTIE be given at 
 least a minimal natural language capability, which had to relate to the 
 knowledge of acceptable solutions. 
 
 
 2. RELEVANT PREVIOUS RESEARCH 
 
 2.1 CAI Tutorial Systems 
 
 The idea of an individual tutor for every student has been a 
 central concept in CAI programs from the beginning of the field. The role 
 of the computer in tutoring has been successively refined, but the ideal 
 of equaling the performance of a human tutor has yet to be realized. In 
 general, such a level of performance should not be expected; however, 
 several programs have shown such abilities are feasible in specific, narrow 
 domains of discourse. 
 
The current generation of tutorial programs employs various
 techniques pioneered in artificial intelligence research to improve their 
 performance. The first such tutors were two systems written by Carbonell 
 (1970) and Wexler (1970) which introduced the idea of "generative CAI." 
 A generative system is one in which material is taught by generating facts 
 and questions from a data base, rather than by following a detailed prepared 
 script. 
 
Wexler's system used a structured knowledge representation
 consisting of "classes" and collections of "objects" within classes. 
 Question formats and skeleton answer patterns had to be supplied to the 
 system by an instructor. Questions were generated by filling in the blanks 
 in the format with class names or objects. The skeleton patterns also 
 contained blanks and indicated restrictions on objects which had to be 
satisfied for the filled-in pattern to be correct. The patterns were used
to present information and judge student answers. Student and system
 engaged in a limited dialog (the system assumed all non-trivial words 
 in a student's input were class or object names). Correction of wrong 
 student answers consisted of describing the relationship between the 
 objects substituted in the question format. 
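
A rough idea of this format-filling style of generative CAI is given by the Python sketch below. It is not Wexler's system: the subject matter, the class and object names, the `ORBITS` relation used as the restriction, and the function names are all hypothetical choices made for illustration.

```python
# Hypothetical knowledge base: classes and the objects within them.
CLASSES = {
    "planet": ["Mars", "Venus"],
    "moon": ["Phobos", "Deimos"],
}
# Hypothetical relation serving as the restriction on the skeleton pattern.
ORBITS = {("Phobos", "Mars"), ("Deimos", "Mars")}

QUESTION_FORMAT = "Name a {answer_class} that orbits the {given_class} {given}."
SKELETON = "{answer} orbits {given}"


def generate_question(given: str) -> str:
    """Fill the blanks of the question format with class and object names."""
    return QUESTION_FORMAT.format(
        answer_class="moon", given_class="planet", given=given
    )


def judge(given: str, answer: str) -> tuple[bool, str]:
    """Fill the skeleton with the student's answer and test the restriction."""
    filled = SKELETON.format(answer=answer, given=given)
    if answer not in CLASSES["moon"]:
        return False, f"{answer} is not a moon."
    if (answer, given) in ORBITS:
        return True, filled + "."
    # Correction describes the relationship between the substituted objects.
    return False, f"It is not true that {filled}."


print(generate_question("Mars"))  # Name a moon that orbits the planet Mars.
print(judge("Mars", "Phobos"))    # (True, 'Phobos orbits Mars.')
```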
 
 Carbonell's SCHOLAR system used a data base organized as a 
 Quillian-like semantic net (Quillian (1968)). In such a net, objects 
 are represented by nodes, and links between nodes represent relations 
 between objects. Thus SCHOLAR could present information or ask questions 
 by simply following a path through the semantic net. To aid in judging 
 answers, students were restricted to single word, multiple-choice, or 
 true-false responses. The only form of error correction was to display 
 the correct answer, with no explanation. 
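
The way a semantic net supports both presentation and questioning can be illustrated with a small Python sketch. The net contents, the relation names, and the function names below are assumptions chosen for illustration; they are not taken from SCHOLAR's actual data base.

```python
# Hypothetical semantic net: each node maps relation names to linked nodes.
NET = {
    "Peru": {"capital": "Lima", "continent": "South America"},
    "Chile": {"capital": "Santiago", "continent": "South America"},
}


def present(node: str) -> list[str]:
    """Present information by following every link leaving `node`."""
    return [f"The {rel} of {node} is {target}." for rel, target in NET[node].items()]


def ask(node: str, relation: str) -> str:
    """Turn one link into a question."""
    return f"What is the {relation} of {node}?"


def judge(node: str, relation: str, answer: str) -> str:
    """Compare a single-word answer with the node at the end of the link."""
    correct = NET[node][relation]
    if answer.strip().lower() == correct.lower():
        return "Correct."
    # As in the text, the only correction is to display the right answer.
    return f"The correct answer is {correct}."


print(ask("Peru", "capital"))             # What is the capital of Peru?
print(judge("Peru", "capital", "Quito"))  # The correct answer is Lima.
```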
 
 In contrast to these restricted student answer formats, SCHOLAR 
 possessed an interesting and effective question answering capability. 
 This capability allowed what Carbonell referred to as a "mixed-initiative 
 dialog." The student could interrupt the flow of information and questions 
 from SCHOLAR to ask questions of his own, in a relatively unrestricted 
 subset of English. This ability allowed the student to pursue interesting 
 topics to a greater depth, and exercise some control over the tutorial 
 session. 
 
 From PATTIE's point of view, these two tutorial systems are 
 important for two reasons: 
 
(1) their pioneering use of artificial intelligence
techniques, particularly use of natural language
in a tutorial dialog and the emphasis on effective
representation of knowledge; and

(2) demonstration of the power and versatility
of maintaining a sharp separation between
subject matter knowledge and interaction
control routines (the basic SCHOLAR control
routine has been improved based on studies of
human tutors (Collins, et al. (1973)), and also
adapted to two other subject areas).

The success of these two systems led to the development of several tutorial
programs more closely related to PATTIE.
 
 Koffman and Blount (1973) applied the concept of generative CAI 
 to a system to tutor machine language programming. In their MALT system 
 they have isolated 35 simple concepts (such as terminating a loop or 
 initializing a counter) which they feel are essential to programming in 
 machine language. A programming problem is viewed as consisting of three 
phases: input, processing, and output. The system contains a set of
"problem primitives" (fill-in-the-blank problem outlines) for each phase;
 generating a problem for a student involves simply selecting a primitive 
 from each phase. Associated with each primitive is an ordered list of 
 those concepts which must be employed to solve the primitive task. This 
 list essentially represents a flowchart for solving that problem primitive, 
 and when the generated problem is presented to the student, the flowchart 
 for each primitive involved is also presented (see Figure 2.1 for a sample 
 problem and partial flowchart). Judging of the student's program is 
 accomplished by calling concept routines (one routine for each simple 
 concept) in the order specified in the flowchart. Each concept routine 
monitors the machine language instructions typed by the student to ensure
that these instructions will perform the action represented by that concept.
Then that concept routine returns and the next routine in sequence is
called, until the student has programmed the complete problem.

Your problem is to write a program which will:

Read in 20 (octal) ASCII characters and store them in registers 232 through 252.
Form the sum of registers 232 thru 252 in the accumulator.
If this results in a non-zero LINK, stop with the (ACC) = 7777, otherwise stop
with (ACC) = 0000.

Is this problem OK?
_yes

Here are the sub-tasks for the 1st line.

(1) initialize a ptr to register 232
(2) initialize a ctr with the value of -20 (octal).
(3) Read in a character.
(4) Store it away using the ptr.
(5) Update the ptr.
(6) Update the ctr and if it is not zero, jump back to start of loop.

Figure 2.1 Sample problem and partial flowchart from MALT
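
The judging scheme described for MALT, in which one routine per concept is called in flowchart order, might be sketched in Python as follows. This is an assumption-laden simplification: each concept routine is reduced to a test on a single instruction, and the instruction strings such as `SET PTR 232` are invented placeholders, not real machine language accepted by MALT.

```python
from typing import Callable, List, Tuple

# A concept routine decides whether an instruction performs its concept.
ConceptRoutine = Callable[[str], bool]


def judge_program(
    flowchart: List[Tuple[str, ConceptRoutine]], instructions: List[str]
) -> List[str]:
    """Call the concept routines in flowchart order on the student's input."""
    feedback = []
    remaining = iter(instructions)
    for description, routine in flowchart:
        instruction = next(remaining, None)
        if instruction is None:
            feedback.append(f"missing: {description}")
            break
        verdict = "ok" if routine(instruction) else "wrong"
        feedback.append(f"{verdict}: {description}")
    return feedback


# Hypothetical flowchart built from the first sub-tasks of Figure 2.1.
FLOWCHART = [
    ("initialize a ptr to register 232", lambda i: i == "SET PTR 232"),
    ("initialize a ctr with the value of -20 (octal)", lambda i: i == "SET CTR -20"),
    ("read in a character", lambda i: i == "READ"),
]

for line in judge_program(FLOWCHART, ["SET PTR 232", "SET CTR 20"]):
    print(line)
```

Because the flowchart fixes both the concepts and their order, the student's only remaining freedom lies in how each individual step is coded.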
 
 Notice that by explicitly displaying a detailed flowchart 
 which describes exactly how the program should be written, Koffman and 
 Blount have eliminated the problem solving aspect of the programming 
 process, retaining only the detailed coding aspects. This approach may 
 be justifiable for tutoring students with only the most elementary 
 knowledge of programming, but it is not acceptable for more advanced 
 students. 
 
 About the same time MALT was developed, two systems concerned 
 with improving students' problem solving methods were developed at Stanford. 
 One system by Kimball (1973) is a tutor for methods of integration; the 
 other is a tutor for elementary logic (Goldberg (1973)). 
 
 Kimball's system has two major sources of knowledge. One is a 
 sizeable archive of integration problems and their solutions. The other is 
 a series of heuristic routines for applying specific integration techniques 
 to problems, routines based on previous research in automated integration. 
 The system is capable of providing tutorial help for problems suggested by 
 the student as well as for prestored problems. The tutor's main source of 
 hints is its problem archive, which is divided into classes based on the 
 type of function to be integrated. The problem the student is solving is 
 classified in the same manner. When a hint is needed, the solution technique 
 to recommend is determined by ranking the frequency of the various techniques 
 used in the tutor's stored solutions for that problem type. The student's 
 first request for help yields the highest ranking technique; if this fails, 
 subsequent requests yield successively lower ranking techniques. Once an 
integration technique has been selected, a separate heuristic routine for
that technique is invoked, which applies the method for the student.
Each heuristic routine has checks to ensure that the desired technique
 is applicable to the specific problem, and is capable of providing 
 further assistance to the student about using that technique, such as 
 recommending substitutions for parameters. 
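
The frequency-based selection of hints can be illustrated with a short Python sketch. The archive contents, the problem class names, and the function names are hypothetical; only the ranking idea, in which the most frequently used technique for a problem class is offered first and lower ranking ones on later requests, comes from the description above.

```python
from collections import Counter

# Hypothetical archive: (problem class, technique used in the stored solution).
ARCHIVE = [
    ("product", "integration by parts"),
    ("product", "substitution"),
    ("product", "integration by parts"),
    ("rational", "partial fractions"),
]


def ranked_techniques(archive, problem_class):
    """Techniques used for this problem class, most frequent first."""
    counts = Counter(tech for cls, tech in archive if cls == problem_class)
    return [tech for tech, _ in counts.most_common()]


def hint(archive, problem_class, request_number):
    """Technique to recommend on the student's n-th help request (from 0)."""
    ranked = ranked_techniques(archive, problem_class)
    if request_number < len(ranked):
        return ranked[request_number]
    return None  # no further techniques on record for this class


print(hint(ARCHIVE, "product", 0))  # integration by parts
print(hint(ARCHIVE, "product", 1))  # substitution
```

The sketch makes the limitation plain: the recommendation depends only on the problem class, never on the particular integrand the student is working on.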
 
 By using this general hint facility and its built-in problem 
 solving routines, the tutor can always try to extend the student's partial 
 solution, and doesn't need to restrict the student to one particular 
 solution approach, even for prestored problems. In return for this, 
 however, the hints given by the tutor are not necessarily the best for a 
 given problem, only the most likely for that problem class. And the tutor 
 is unable to indicate why a particular approach wasn't successful. 
 
 To partially compensate for this uncertainty, Kimball's tutor 
 forces the student into a "slave mode" whenever his performance falls below 
 a certain level, and displays a step-by-step solution to one of the 
 prestored problems, to illustrate the desired techniques. 
 
 Goldberg's logic tutor also has a two-part knowledge base, one 
 part being a proof-checking routine, the other a heuristic theorem prover. 
 The two forms of knowledge interact as follows. The student suggests, or 
 is presented with, an expression to prove. He types in, one at a time, 
 the steps to be used in the proof, using a special command language. The 
 proof-checker determines the validity of each step, and if it's acceptable, 
 generates the resulting expression as a new line in the proof. Thus the 
 proof-checker alone is capable of teaching the applicability of inference 
 rules and the proper use of axioms and lemmas. If at some point in the 
 solution process the student finds he can't proceed, the theorem prover 
 tries to complete his partial solution. The inference rules available to 
 
 the theorem prover were empirically determined to be those used by 
 students, so the proof steps discovered are similar to those which might 
 be discovered by a student. When an extended proof has been found, it 
 is returned to the proof-checker/control routine which generates a hint 
 to help the student find the next step in the extended proof. 
 
 One of the most recent tutorial systems is SOPHIE (Brown and 
 Burton (1974)), a system for providing tutorial assistance in electronic 
 troubleshooting. In operation, SOPHIE inserts a fault in a circuit and 
 provides the student with the output values of the circuit. The student 
 tries to find the error, and can ask SOPHIE to make various measurements 
 for him, suggest possible faults, or explain why his suggested faults aren't 
 correct. 
 
 Unlike the tutors of Goldberg and Kimball, SOPHIE has detailed 
 knowledge of only a single problem domain (the circuit for a particular 
 power supply). This is necessary because of the complex interactions 
 between the elements of an electrical circuit, as compared, for example, 
 to the relative simplicity of applying an axiom in proving a theorem. Also, 
 SOPHIE has a well-developed dialog capability, echoing Carbonell's original 
 emphasis on the importance of a mixed initiative dialog during the 
 tutorial process. 
 
 However, like the preceding two systems, much of SOPHIE's 
 knowledge is in the form of internal subject matter "experts." The heart 
 of the system is a pair of simulation programs which determine the effects 
 of changes to the circuit and provide measurement values to the student. 
 In addition, there are hypothesis generation routines which provide possible 
 explanations for observed circuit behavior (to be used as hints for the 
 
 student), and hypothesis evaluation routines which check student-suggested 
 faults for agreement with observed circuit behavior. 
 
 In summary, the main points stressed by these tutorial systems 
 are: 
 
 (1) the usefulness of separating the control function 
 from subject matter knowledge; 
 
 (2) the value of a natural language tutorial dialog; 
 
 (3) the importance of trying to extend the student's 
 partial solution when he needs help; 
 
 and (4) the value of applying techniques for the student, 
 leaving him more opportunity to concentrate on 
 the method used to obtain a solution. 
 
 2.2 Natural Language Processing Methods 
 
 This section is intended to give a short overview of several 
 approaches to the problem of understanding natural language input, and 
 thus provide a background for the discussion in Chapter 4 regarding which 
 language understanding methods are best suited to a system such as PATTIE. 
 We have made no attempt to be exhaustive in this review, and several 
 interesting variants on the basic approaches described here have been 
 omitted. A more comprehensive survey may be found in Winograd (1974). 
 
 2.2.1 Pattern Matching 
 
 One of the earliest forms of language understanding was the 
 pattern matching approach, which required analyzing the domain of discourse 
 to determine probable inputs, and establishing patterns to match those 
 inputs. Patterns consist of a few keywords which must be present in the 
 input. All the understanding routine need do is extract the keywords from 
 the input and scan the list of patterns for a match. 
 
 Perhaps the best-known early pattern matching program was 
 ELIZA (Weizenbaum (1966)), whose DOCTOR script mimicked the behavior of 
 a Rogerian psychiatrist. ELIZA used patterns to extract portions of 
 the input for use in responses, and although the technique was quite 
 simple, the results could be quite impressive. However, because ELIZA 
 made no attempt to relate inputs to some store of world knowledge, it 
 could be easily fooled. For example (after Winograd), if a pattern 
 matched an input "I am ___" and responded "How long have you been 
 ___," the exchange 
 
 PERSON: I am very lonely. 
 
 ELIZA: How long have you been very lonely? 
 seems quite natural. But the same pattern produces the following exchange 
 
 PERSON: I am therefore I think. 
 
 ELIZA: How long have you been therefore you think? 
 which seems far from natural. 
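The two exchanges above can be reproduced with a single keyword pattern. The following Python sketch is not Weizenbaum's program; the regular expression, the small pronoun table `REFLECT`, the fallback reply, and the function `respond` are assumptions made for illustration.

```python
import re

# Minimal pronoun reflection, enough for this example only (assumed table).
REFLECT = {"i": "you", "my": "your", "me": "you"}

# One pattern: "I am <anything>", ignoring trailing punctuation.
PATTERN = re.compile(r"^i am (.+?)[.!?]*$", re.IGNORECASE)


def respond(text):
    """Echo the matched portion of the input back inside a fixed template."""
    match = PATTERN.match(text.strip())
    if match is None:
        return "Please go on."  # hypothetical fallback reply
    words = [REFLECT.get(word.lower(), word) for word in match.group(1).split()]
    return "How long have you been " + " ".join(words) + "?"
```

Because nothing in the sketch consults any world knowledge, `respond("I am therefore I think.")` is handled exactly like `respond("I am very lonely.")`, which is the weakness the example in the text points out.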
 
 A more recent approach used by Colby, et al. (1974) in their 
 paranoia simulation program called PARRY2 tries to overcome these 
 difficulties by combining a more sophisticated pattern matching capability 
 with routines which manipulate a world model and employ the world model 
 information in generating a response. The basic idea of the pattern matching 
 is illustrated in Figure 2.2. 
 
 Each word in the input is translated into a standard internal 
 synonym using a "dictionary." Unrecognized words are simply ignored. The 
 translated input is then bracketed into segments; bracketing points are 
 indicated by prepositions, wh-forms (what, who, which), or certain verbs, 
 points which tend to separate embedded clauses and prepositional phrases 
 from the main clause. Each segment is then matched against a list of simple 
 
 HOW DO YOU COME TO BE IN THE HOSPITAL? 
        | 
        |  Bracket into segments 
        v 
 (HOW DO YOU COME) (TO BE) (IN THE HOSPITAL) 
        | 
        |  Match against simple pattern lists 
        v 
 (IN THE HOSPITAL)  ->  (IN HOSPITAL) 
 (TO BE)            ->  (TO BE) 
 (HOW DO YOU COME)  ->  (HOW YOU COME) 
 
 (HOW YOU COME) (TO BE) (IN HOSPITAL) 
        | 
        |  Match against complex pattern list 
        v 
 ((HOW YOU COME) (IN HOSPITAL)) 
        | 
        v 
 response and inferencing routines  <-  appropriate semantic information 
 
 Figure 2.2 PARRY2's pattern matching 
 
 patterns. If no match is found, single words are dropped from the 
 segment one at a time, and a match against the simple pattern list is 
 tried again. This heuristic provides a "fuzzy" matching capability 
 which allows familiar words to be ignored in unfamiliar contexts. 
 
 Matched patterns have pointers to semantic information which 
 is used by routines which manipulate PARRY2's model of the world and 
 generate a response. However, if more than one segment of the input 
 matched a simple pattern, a second match is tried against a smaller list 
 of complex patterns. If no exact match is found, fuzzy matching is tried 
 again, but this time segments are dropped instead of single words. This 
 complex matching helps PARRY2 avoid some of the problems ELIZA had. "I 
 am therefore I think" would be segmented as ((I am) (therefore I think)) 
 and would not match the simple "I am ___" pattern. 
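The bracketing and simple-pattern steps can be sketched as follows, using the input sentence of Figure 2.2. This is a simplified illustration under stated assumptions, not PARRY2's code: the break-word list and the pattern list are invented for the example, the dictionary translation step is omitted, and the fuzzy match drops at most one word per segment.

```python
# Words that open a new segment (prepositions and wh-forms); an assumed, tiny list.
BREAK_WORDS = {"TO", "IN", "HOW", "WHAT", "WHO", "WHICH"}

# Hypothetical simple pattern list.
SIMPLE_PATTERNS = {("HOW", "YOU", "COME"), ("TO", "BE"), ("IN", "HOSPITAL")}


def bracket(words):
    """Split a word list into segments, starting a new one at each break word."""
    segments = []
    for word in words:
        if not segments or word in BREAK_WORDS:
            segments.append([])
        segments[-1].append(word)
    return [tuple(segment) for segment in segments]


def fuzzy_match(segment):
    """Return the matching simple pattern, dropping at most one word, else None."""
    if segment in SIMPLE_PATTERNS:
        return segment
    for i in range(len(segment)):
        shorter = segment[:i] + segment[i + 1:]
        if shorter in SIMPLE_PATTERNS:
            return shorter
    return None


def match_input(text):
    """Bracket the input and fuzzy-match every segment."""
    words = text.upper().replace("?", "").split()
    return [fuzzy_match(segment) for segment in bracket(words)]
```

Applied to "How do you come to be in the hospital?", the sketch drops DO from the first segment and THE from the last, giving the three simple patterns shown in Figure 2.2. The complex-pattern stage, which drops whole segments instead of words, would follow the same scheme one level up.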
 
 While pattern matching routines have the advantage of being 
 relatively easy to program and fast to execute, they have the disadvantage 
 of requiring a detailed analysis of the domain of discourse (PARRY2's 
 patterns are based on over 4000 interviews conducted with an earlier 
 PARRY) and thus being more difficult to adapt to other domains. 
 
 2.2.2 Linguistic Analysis 
 
 At the other extreme from the pattern matching approach are those 
 systems which perform a detailed linguistic analysis of the inputs to 
 determine meaning. Such systems typically contain detailed grammars of 
 English and use knowledge from some world model to determine referents of 
 pronouns and various dependent clauses and phrases. There are two systems 
 which have had a great deal of influence on the direction other linguistic 
 based analyzers have taken, namely the systems of Woods (1970) and Winograd 
 (1971). 
 
 Woods' parser relies mainly on syntax to control the recognizer; 
 semantic routines are used only when absolutely necessary. Its knowledge 
 of English syntax is contained in an augmented transition network 
 grammar (Figure 2.3). Basically, this is a transition diagram similar to 
 a finite-state automaton. Nodes represent states in the parsing of an 
 input, and arcs between nodes are associated with conditions which must 
 hold true in the input before the arc may be traversed. The parsing 
 routine interpretively traverses this graph to understand the input. One 
 node corresponds to the initial state of the parser. The first word is 
 selected from the input and one of the arcs whose condition is satisfied 
 is traversed to a new node, where the next input word is examined, and so 
 on. If at any time the parser is stopped (no arcs from the current node 
 are tagged with a true condition) before the input is accepted, it backs 
 up to the previous node and tries a different arc, etc. 
 
 Two extensions to this simple transition graph provide the power 
 to deal with unrestricted natural language. The first is allowing 
 networks to recursively call other networks. The condition on an arc is 
 not limited to a condition on a single input symbol, but may be something 
 like "NP," where NP is the initial state of another network (Figure 2.3). 
 Trying to traverse this arc initiates traversal of the NP network, which 
 tries to accept some string from the input. If it succeeds, it returns to 
 the original network at the end of the NP arc. If not, it returns to the 
 original node, and another arc must be attempted. 
 
 The second extension is to provide a set of registers, which may 
 be referred to in an arc condition, and to provide as well a series of 
 register changing actions which may be attached to arcs. Traversing an arc 
 causes these actions to be performed, changing the registers. This allows 
 information to be passed from one level of the network to another. 
 
 (S) --Call NP--> (S/NP) --Call VP--> (S/VP) 
 
 Figure 2.3 A simple transition network grammar 
 
 As an example, consider how the grammar of Figure 2.3 would 
 parse "The dogs ate a steak." Parsing begins in state S, and the network 
 NP is called. NP accepts "The dogs," then returns to state S/NP. Network 
 VP is then called, which accepts "ate," calls NP to accept "a steak," and 
 returns to state S/VP, which accepts the input. 
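The traversal just described, including recursive network calls and backtracking, can be sketched in Python. This is a minimal recursive transition network, not Woods' system: it has no registers or arc actions, and the lexicon, the state names inside NP and VP, and the choice of final states are assumptions, since only the top-level S network is described in the text.

```python
# Assumed lexicon: word -> syntactic category.
LEXICON = {"the": "DET", "a": "DET", "dogs": "NOUN", "steak": "NOUN", "ate": "VERB"}

# state -> list of arcs (arc type, label, destination state).
# "cat" arcs consume one word of the given category; "call" arcs run a sub-network.
NETWORKS = {
    "S": [("call", "NP", "S/NP")],
    "S/NP": [("call", "VP", "S/VP")],
    "NP": [("cat", "DET", "NP/DET")],
    "NP/DET": [("cat", "NOUN", "NP/N")],
    "VP": [("cat", "VERB", "VP/V")],
    "VP/V": [("call", "NP", "VP/NP")],
}

# States at which a network may return; VP/V is final to allow an intransitive verb.
FINAL = {"S/VP", "NP/N", "VP/V", "VP/NP"}


def traverse(state, words, pos):
    """Yield every input position at which the network can finish from `state`."""
    if state in FINAL:
        yield pos
    for kind, label, dest in NETWORKS.get(state, []):
        if kind == "cat":
            if pos < len(words) and LEXICON.get(words[pos]) == label:
                yield from traverse(dest, words, pos + 1)
        else:
            # Run the called network, then continue from the end of the call arc.
            for after in traverse(label, words, pos):
                yield from traverse(dest, words, after)


def accepts(sentence):
    """True if some path through network S consumes the whole sentence."""
    words = sentence.lower().rstrip(".").split()
    return any(end == len(words) for end in traverse("S", words, 0))
```

The generators give backtracking for free: when one arc leads nowhere, the loop simply moves on to the next arc from the same node.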
 
 Winograd's system uses a grammar consisting of programs written 
 in a special language (PROGRAMMAR). Figure 2.4 shows the flowcharts for 
 the programs in a grammar which accepts the same simple language as the 
 network grammar of Figure 2.3. PROGRAMMAR programs may recursively call 
 other programs, and the input is used to control the execution of such 
 programs just as it was used to control the traversal of a network grammar. 
 
 Obviously, PROGRAMMAR grammars have the same capabilities as 
 network grammars as far as recognizing a language. The main difference 
 between the two systems is how the grammars are used in the recognition 
 process. Where Woods' system uses semantics only when absolutely necessary, 
 Winograd's system intimately merges syntactic and semantic processing. As 
 soon as the grammar programs develop a syntactic structure, a semantic 
 routine is called to see if the structure makes sense, and the answer is 
 used to direct the parser. Semantic programs, in turn, can call deductive 
 routines which interrogate the data bases of world knowledge and the dialog 
 history. The ease with which such intimate interrelation of various types 
 of knowledge can be controlled is one of the principal advantages of 
 PROGRAMMAR grammars. 
 
 In theory, such linguistic based understanding systems may be 
 used in other subject areas by simply changing the world knowledge data 
 base. They also handle much more complex inputs than is possible with a 
 pattern matching approach. The primary disadvantages of a linguistic 
 
 [Flowcharts for three programs: 
    DEFINE program SENTENCE 
    DEFINE program NP   (tests: PARSE a DETERMINER, PARSE a NOUN) 
    DEFINE program VP   (tests: PARSE a VERB, TRANSITIVE, PARSE a NP, 
                         INTRANSITIVE) 
  Each test has Yes and No exits leading to a further test, to RETURN 
  success, or to RETURN failure.] 
 
 Figure 2.4 Flow diagrams for a simple program 
 grammar (from Winograd (1974)) 
 
 approach are the inability to handle ungrammatical inputs, and the amount 
 of processing time required for the parse. 
 
 2.2.3 Hybrid Approaches 
 
 It seems only reasonable to try to combine some of the best 
 aspects of both pattern matching and linguistic approaches into a single 
 parser. One such attempt is the dialog understanding process of the 
 SOPHIE CAI system discussed in Section 2.1. This parser is based on the 
 idea of a "semantic grammar" (Burton (1974)), one in which the syntactic 
 categories usually found in a grammar are replaced by concepts which have 
 semantic meaning in the domain of discourse (see Figure 2.5 for an example 
 taken from another application: NLS-SCHOLAR (Grignetti, et al. (1974))). 
 Thus the grammar rules express how each semantic concept may be expressed 
 in terms of constituent concepts. As Figure 2.5 shows, the terminals of 
 such a grammar are patterns of English words. In actual use, the grammar 
 of Figure 2.5 would be hand coded into LISP procedures. Execution of 
 these procedures is controlled by the input string just as Winograd's 
 PROGRAMMAR procedures are. 
 
 Developing such a grammar involves the same analysis of the 
 problem domain and probable inputs as the pattern matching approach does. 
 A semantic grammar can also be made to ignore some words in the input 
 stream or handle ungrammatical inputs. However, since the grammar, once 
 developed, is expressed as LISP procedures, it allows the same ease of 
 interfacing with procedures to query a world knowledge base as Winograd's 
 PROGRAMMAR grammar. This provides a capability to determine pronoun 
 referents and handle ellipses similar to the linguistic approach. 
 Disadvantages are that it is as difficult to extend to other problem 
 
 <REQUEST>:= 
        <DEFINE/REQ> 
        <WHATIS/REQ> 
        <CONTENT/REQ> 
        <PARTS-IN-PART/REQ> 
        <PARTS-IN-LEVEL/REQ> 
        <PROCEDURE/REQ> 
        <INSTR/REQ> 
        <POSITION/REQ> 
        <NLS/ACTION/REQ> 
 
 <DEFINE/REQ>:= 
        DEFINE <NOUN> 
        WHAT DOES <NOUN> MEAN 
        WHAT DOES <NOUN> STAND FOR 
        WHAT DOES <NOUN> DO 
 
 <WHATIS/REQ>:= 
        WHAT IS THE PURPOSE OF <NOUN> 
        WHAT IS THE CONTENT OF <STR+ADDR> 
        WHAT IS THE LEVEL OF <STR+ADDR> 
        WHAT IS THE ADDRESS OF <STR+ADDR> 
        WHAT ARE EXAMPLES OF <NOUN> 
        WHAT IS THE DEFINITION OF <NOUN> 
        WHAT IS <CURRENT/PART> 
        WHAT IS <STR+ADDR> 
        WHAT ARE <NOUN> 
        WHAT ARE <STRUCTURAL> AT <LEVEL/PART> 
        WHAT ARE <STRUCTURAL> IN <FILE/PART> 
        WHAT IS <NOUN> 
        **ALSO 'TELL\ME, GIVE\ME, TELL\ME\ABOUT' IN 
          PLACE OF 'WHAT IS' 
 
 <CONTENT/REQ>:= 
        WHAT <STRUCTURAL> CONTAINS <STRING> 
 
 <PARTS-IN-PART/REQ>:= 
        WHAT <STRUCTURAL> ARE IN <FILE/PART> 
 
 <PARTS-IN-LEVEL/REQ>:= 
        WHAT <STRUCTURAL> ARE AT <LEVEL/PART> 
 
 <PROCEDURE/REQ>:= 
        HOW\DO\I <ACTION/SPEC> 
        TELL\ME\HOW\TO <ACTION/SPEC> 
        TELL\ME\ABOUT <ACTION/SPEC> 
 
 <INSTR/REQ>:= 
        WHAT [NLS\COMMAND] <ACTION/SPEC> 
 
 <POSITION/REQ>:= 
        WHERE AM I 
        WHERE IS THE CM 
        WHERE IS <STR+ADDR> 
 
 <NLS/ACTION/REQ>:= 
        <ACTION/SPEC> 
        DO IT 
        DO <TASK> 
 
 Figure 2.5 BNF description of part 
 of a semantic grammar 
 
 areas as the traditional pattern matching approach, yet it requires more 
 processing time during execution. 
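To make the idea of a semantic grammar concrete, the sketch below encodes a few rules adapted from Figure 2.5 as Python data and matches an input against them. It is an illustration under stated assumptions, not the SOPHIE or NLS-SCHOLAR parser: those grammars were hand coded as LISP procedures, whereas this sketch is table-driven, and the noun vocabulary and the names `match_pattern` and `classify` are hypothetical.

```python
# A few rules adapted from Figure 2.5; each pattern is a list of literal
# words and slots. Only the <NOUN> slot is supported in this sketch.
RULES = {
    "DEFINE/REQ": [
        ["DEFINE", "<NOUN>"],
        ["WHAT", "DOES", "<NOUN>", "MEAN"],
        ["WHAT", "DOES", "<NOUN>", "STAND", "FOR"],
    ],
    "POSITION/REQ": [
        ["WHERE", "AM", "I"],
    ],
}

# Hypothetical noun vocabulary for the domain of discourse.
NOUNS = {"BRANCH", "STATEMENT", "FILE"}


def match_pattern(pattern, words):
    """Return the slot bindings if `words` fits `pattern`, otherwise None."""
    if len(pattern) != len(words):
        return None
    slots = {}
    for expected, word in zip(pattern, words):
        if expected == "<NOUN>":
            if word not in NOUNS:
                return None
            slots["NOUN"] = word
        elif expected != word:
            return None
    return slots


def classify(request):
    """Return (semantic concept, slot bindings) for a request, or None."""
    words = request.upper().rstrip("?.").split()
    for concept, patterns in RULES.items():
        for pattern in patterns:
            slots = match_pattern(pattern, words)
            if slots is not None:
                return concept, slots
    return None
```

The result of a successful match is already a semantic category such as DEFINE/REQ, with no intermediate syntactic parse. The price is that both the rule table and the vocabulary are specific to one domain.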
 
 3. GETTING TO KNOW PATTIE 
 
 3.1 Introduction 
 
 In this chapter we will describe the structure of PATTIE in some 
 detail. In particular, we discuss how various features of the design 
 relate to one another, and how they're involved in the interaction with 
 the student. Some of these features have been influenced by the particular 
 problem PATTIE is concerned with, or by the general area of programming. 
 Others are quite general. 
 
 The three most significant aspects of PATTIE's design are: 
 
 (1) the use of an AND-OR graph to represent knowledge 
 of both the possible solutions and the processes 
 used to develop those solutions; 
 
 (2) the interaction control routine which traverses this 
 graph, conducts an interactive dialog with the 
 student, and maintains a screen display of the 
 student's developing solution; 
 
 and (3) the existence of a student model, intimately tied 
 in with the AND-OR graph and based on a list of 
 concepts relevant to the problem area. 
 The discussion in the remainder of this chapter is centered on the use of 
 these features to teach top-down programming techniques. However, it must 
 be realized that the techniques used in PATTIE may be adapted to teaching 
 top-down problem solving in any field in which that technique is applicable. 
 
 Thus, while the particular graph and concept list used by 
 PATTIE are highly problem specific, such graphs and concept lists may 
 be developed for problems in many different subjects. The interaction 
 control routines are totally independent of the specific problem area 
 PATTIE deals with, namely programming and differentiation. Even portions 
 of the vocabulary used in the natural language understanding routines may 
 be adapted to other subject areas (see Section 3.3.2). 
 
 3.2 Relation to Other Tutors 
 
 PATTIE is only one part of a large system to teach introductory 
 programming skills to students. Her specific task is to illustrate the 
 concepts involved in top-down program development by assisting the student 
 through a dynamic example of such a process, much as a private tutor would. 
 To perform this task properly, PATTIE must be able to understand 
 student-suggested refinements described in English, provide hints based on the 
 current state of the student's proposed solution, and extend that solution 
 if possible and necessary. How does the design of such a tutor relate to 
 those other problem solving tutors with similar abilities discussed in 
 Chapter 2? 
 
 Kimball's and Goldberg's tutors emphasized the need to give the 
 student a wide range of experience, and were capable of handling a 
 correspondingly wide range of both prestored and student-suggested problems. 
 To accomplish this, heuristic problem solving routines, with capabilities 
 similar to those of the students being tutored, were integral parts of the 
 system. Such an approach was possible due to the quantitative nature of 
 the subject areas. Solving problems in integration or proving simple logic 
 theorems requires using only a small number of rules applicable to many 
 
 problems (Kimball's tutor used only 12 rules to solve over 70 problems; 
 Goldberg's used 15 rules for over 25 proofs). 
 
 Unfortunately, in top-down programming there is no small set of 
 general rules which can be applied in many situations. There is, instead, 
 a very large number of distinct refinements which are applicable in only a 
 small number of instances. (For example, PATTIE recognizes 61 different 
 refinements applicable to a single sizeable problem. And those 61 
 refinements may be described in English in over 450 ways!) Using this 
 "problem solver" approach in PATTIE would require developing an automatic 
 programming routine which itself used the stepwise refinement method and 
 which could understand and apply such a large number of refinements. Such 
 a routine is vastly beyond current capabilities (see Green, et al. (1974) 
 for a status report on automatic programming systems). 
 
 If the problem solver approach is not viable, perhaps an approach 
 similar to that of Koffman and Blount is possible: develop a collection 
 of specialist routines which can determine the correctness of a portion of 
 a program, and combine their abilities to examine the student's proposed 
 solution to one of a number of predefined problems. For our purposes, 
 there are two flaws in this method. First, this approach is primarily 
 concerned with judging the solution program, and not at all concerned with 
 how the solution is developed. Koffman and Blount, in fact, require the 
 student to go directly from the problem statement to producing code, exactly 
 the procedure PATTIE is trying to combat. Secondly, it would be difficult 
 to implement, as the ability to mechanically judge the correctness of 
 programs written in a high-level language (and correct them when wrong) 
 is very limited (see Sussman (1973) and Goldstein (1974) for discussions 
 of the problems in debugging simple LISP and LOGO programs). 
 
 Finally, the problems of trying to simulate the execution of a 
 program consisting of PL/1 statements intermixed with actions described 
 (with varying degrees of precision) in English rule out using the 
 approach taken with SOPHIE. It seems that the very nature of the 
 successive refinement process forbids using the designs used in previous 
 tutors. 
 
 Hence, one of the primary problems faced in developing PATTIE 
 was to provide her with a knowledge structure which exactly described 
 acceptable ways of developing solutions, as well as the acceptable solutions, 
 while simultaneously satisfying the design criteria expected in a good 
 instructional program (good modularity, separation of knowledge and control 
 functions, relative ease of development of the data base, and effective 
 interaction with the student). As with SOPHIE, the detail of knowledge 
 required has forced us to limit PATTIE's domain of expertise to a single 
 problem. This does not mean, of course, that she is limited to dealing 
 with only one problem forever. But adding additional problems will require 
 a sizeable amount of work to develop the necessary data bases, even though 
 the control program may be used as is. 
 
 The particular problem PATTIE is concerned with is not of vital 
 import, but the problem must meet certain criteria to achieve the desired 
 instructional objectives. First, the problem area should either be one 
 with which most students are familiar, or one which can be easily and 
 clearly explained. Otherwise, students make many errors and waste a good 
 deal of time trying to understand the problem rather than intelligently 
 pursuing a solution. Secondly, the problem must be large enough, relative 
 to the student's problem solving ability, that the student can't grasp the 
 
 solution immediately. Otherwise, the value of using the top-down solution 
 method to reduce complexity to a manageable degree is not apparent. But it 
 is still necessary that the basic approach to the problem be fairly 
 straightforward, so that successive refinement is an acceptable solution method. 
 Short but "tricky" problems can easily be found whose solution is not 
 immediately apparent, but then the essence of the solution lies in seeing 
 the trick, rather than in applying a problem solving method. Since a 
 majority of the problems beginning programmers encounter have fairly 
 straightforward solutions, eliminating tricky problems from those PATTIE 
 may be concerned with is not a significant restriction. 
 
 The actual problem PATTIE has been concerned with during the 
 course of this research is that of writing a PL/1 program for taking the 
 symbolic derivative of a sum of terms of a polynomial of two variables. 
 
 3.3 What PATTIE Knows — The Data Bases 
 
 The information PATTIE needs to guide a student to the solution 
 of the problem can be divided into three separate areas: 
 
 (1) knowledge of what the valid solutions to the 
 problem actually are; 
 
 (2) enough knowledge of human discourse in the 
 problem area to understand proposed refinements; 
 
 and (3) information about the student and his performance 
 during the course of solution development. 
 Let's look at each of these in turn. 
 
 3.3.1 The Solution Graph 
 3.3.1.1 AND-OR Graphs 
 
 PATTIE represents knowledge about acceptable solutions to the 
 example problem as an AND-OR graph. Such graphs (like that illustrated in 
 
 Figure 3.1(c)) have been used in a number of areas of artificial 
 intelligence research (Nilsson (1971)). 
 
 The basic idea behind an AND-OR graph is reducing a problem 
 to a series of subproblems, just as in stepwise refinement. In such a graph, 
 each node represents a problem. Solving the problem represented by an AND 
 node (Figure 3.1(a)) can be accomplished by solving all the subproblems 
 represented by the successor nodes. Solving the problem represented by an 
 OR node (Figure 3.1(b)) can be accomplished by solving any one of the 
 subproblems represented by the successor nodes. The solution to the 
 initially given problem (represented by the root of the graph) is 
 successively reduced to the solution of sets of subproblems, some of which 
 might be immediately recognized as being solved (LEAF nodes), others of 
 which might need further reduction. Thus in Figure 3.1(c), solving the 
 complete problem (node R) involves solving node M or node N. Solving node 
 M involves solving nodes A and B, and so on. 
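The solution rule for AND and OR nodes can be stated compactly in code. The Python sketch below is an illustration only: the text specifies that R is an OR node over M and N and that M is an AND node over A and B, while the successors given for N (C and D) are assumed so that the example is complete.

```python
# node -> (node type, successor nodes). Nodes not listed here are LEAF nodes.
GRAPH = {
    "R": ("OR", ["M", "N"]),
    "M": ("AND", ["A", "B"]),
    "N": ("AND", ["C", "D"]),  # hypothetical successors, not given in the text
}


def solved(node, solved_leaves):
    """True if `node` is solved, given the set of leaves already solved."""
    if node not in GRAPH:  # a LEAF node
        return node in solved_leaves
    kind, successors = GRAPH[node]
    results = (solved(successor, solved_leaves) for successor in successors)
    # An AND node needs all of its successors; an OR node needs any one.
    return all(results) if kind == "AND" else any(results)
```

With leaves A and B solved, `solved("R", {"A", "B"})` holds by way of M. Solving A and C alone satisfies neither M nor N, so R remains unsolved.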
 
 In practice, AND-OR graphs are dynamically grown from the root 
 node, using a small number of "problem reduction operators." Clearly, 
 growing such a graph is a trial and error process, complete with dead ends 
 and backtracking, which could halt as soon as any solution is discovered. 
 A full graph, such as represented by Figure 3.1(c), would be developed only 
 if all possible solutions were desired. 
 
 It is obvious how such a graph could represent the refinement 
 process. The root represents the initial problem to be solved, leaves 
 represent statements in the target programming language, and each intermediate 
 node represents a subtask on one of the levels of refinement. Thus the 
 process of developing a solution by stepwise refinement is equivalent to 
 tracing a path from the root to some subset of the leaves which represent a 
 
 Figure 3.1 Sample AND-OR graph 
 
 program to solve the problem. The immediate descendants of any given 
 node represent tasks which are refinements of the task represented by 
 that node. 
 
 Figure 3.2 illustrates a simple example of such task relationships. 
 Intuitively, an OR node corresponds to a point in the development of a 
 solution where a choice must be made between several (equally correct) 
 approaches. An AND node represents a point at which refinement involves 
 several tasks which must all be done to solve the problem. 
 
 For the same reasons that an automatic programming approach was 
 rejected for PATTIE (primarily the large number of possible refinements 
 and the qualitative nature of the successive refinement process), it would 
 be difficult for PATTIE to grow an AND-OR graph as is commonly done. 
 Instead, a "problem expert" supplies PATTIE with a complete AND-OR graph 
 representing all good (well-structured and correct) solutions to the 
 example problem, as well as a few anticipated bad approaches beginning 
 students might attempt. Figure 3.3 shows a small portion of PATTIE's 
 refinement graph. The complete graph for PATTIE's problem contains 150 
 nodes and 339 branches. 
 
 The claim of being able to represent all reasonable solutions to 
 a problem, as well as a few wrong approaches, as an AND-OR graph clearly 
 requires some justification. Intuitively, one feels that the number of 
 possible solutions must increase enormously as the complexity of the 
 problem increases. Undoubtedly, this would be true if we were considering 
 every correct solution; however, we are considering only "good" solutions, 
 which makes an important difference. Many of the possible solutions 
 (especially the solutions proposed by beginning students) may be immediately 
 rejected because they violate some obvious criterion of goodness. Thus 
 
 Figure 3.2 Refinement as an AND-OR graph 
 
 Figure 3.3 Partial solution graph 
 
 paths representing inefficient or poorly structured solutions need not 
 be included in the graph, even though such solutions may be correct in 
 a strict sense. At most, such solutions will be one of the plausible 
 wrong approaches. Further, the use of a graph form, as opposed to a 
 tree, allows maximum common utilization of subportions of the graph. 
 These factors significantly reduce the number of nodes the solution 
 graph must contain, and enable us to use such a representation. 
 
 3.3.1.2 Special Features 
 
 In order to use an AND-OR graph as the basis for a tutor of 
 successive refinement, several features were added. As Figure 3.3 shows, 
 branches between nodes are tagged with English phrases ("transition 
 phrases") which are usually descriptions of the tasks represented by the 
 node each branch leads to. Thus, nodes represent subproblems to be solved, 
 and branches are tagged with English descriptions of the subproblems they 
 lead to. 
 
 PATTIE uses these transition phrases to determine the path the 
 student is taking through the graph. Section 3.4 contains a detailed 
 description of how this is done, but the basic idea is simple: at any 
 time during development of the solution, one node in the graph represents 
 the task the student is currently refining. PATTIE matches the refinements 
 suggested by the student against the transition phrases associated with 
 branches leaving that node, and can thus determine the correctness of each 
 refinement and its effect on the solution. 
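The basic idea of following the student through the graph can be sketched as below. This is only a stand-in for the method of Section 3.4: the node names and transition phrases are hypothetical, and the matching rule used here (every word of a transition phrase must occur in the student's input) is a deliberately crude assumption, not PATTIE's language understanding routine.

```python
# current node -> list of (transition phrase, destination node); all hypothetical.
BRANCHES = {
    "differentiate polynomial": [
        ("split the polynomial into terms", "split into terms"),
        ("differentiate each term", "differentiate term"),
    ],
}


def normalize(text):
    """Reduce a phrase to the set of its lower-case words."""
    return frozenset(text.lower().replace(".", "").split())


def follow(current_node, refinement):
    """Return the node reached by the first matching branch, or None."""
    said = normalize(refinement)
    for phrase, destination in BRANCHES.get(current_node, []):
        if normalize(phrase) <= said:  # all phrase words appear in the input
            return destination
    return None
```

A return value of None corresponds to a refinement that matches no branch leaving the current node, the case in which a tutor would have to ask for clarification or offer a hint.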
 
 Other branches (leading to LEAF nodes) are tagged with PL/1 
 statements and represent the final step in refinement of a particular task, 
 namely translation into the target programming language. 
 
 Some nodes and branches in Figure 3.3 are tagged with underlined 
 English phrases. These phrases represent one of (currently) 16 programming 
 or problem solving "concepts" which the problem expert may associate with 
 any node or branch. Concepts available are listed in Figure 3.4. The 
 programming concepts indicate major groupings of PL/1 language features, 
 thus specifying the general area of the target language to which a given 
 node or branch is most closely related. The groupings are primarily based 
 on those used in recent introductory programming texts using PL/1 (Conway 
 and Gries (1973), Weinberg, et al. (1973)). The problem solving concepts 
 are similar in intent, but indicate approaches to solving problems in 
 general. They represent the essence of the problem solving techniques 
 recommended in Polya's (1957) classic work, and of suggestions for finding 
 refinements in relation to a top-down approach to program development 
 (Conway and Gries (1973), Wirth (1973)). 
 
 While PATTIE doesn't require that every node and branch be tagged 
 with a concept, these concepts form the basis for her model of the student. 
 Thus the greater the amount of concept tagging done by the problem expert, 
 the greater PATTIE's ability to aid the student and adapt her expectations 
 to his previous performance. And a small list of concepts is applicable 
 to a wide range of situations; in PATTIE's refinement graph, 94% of the 
 nodes and 80% of the branches are tagged with one of the 16 concepts. 
 
 The remaining features added to the basic AND-OR graph formalism 
 are special branch or node types, illustrated in Figure 3.5. Special 
 ERROR branches (Figure 3.5(a)) allow the problem expert to provide specific 
 hints to be given to the student only in certain contexts. These ERROR 
 branches may run from either AND or OR nodes, and lead to LEAF nodes which 
 have error messages attached. ERROR branches tagged with transition phrases 
 
 
 programming 
     arrays 
     assignment statements 
     blocks 
     built-in functions 
     conditional statements 
     declarations 
     I/O 
     loops 
     recursion 
     subprocedures 
 
 problem solving 
     draw a picture 
     solve part of the problem 
     simplify the original data 
     find several distinct subcases 
     find a series of sequential actions 
     use additional variables 
 
 Figure 3.4 Programming/problem solving concepts 
 
 
 [Figure 3.5(a), ERROR branches: graph diagram. Labels in the diagram: 
 ERROR, "Maybe it will help to divide T into pieces?"; ERROR, "divide T 
 into separate pieces"; "How many pieces will you use?"; "2 pieces"; 
 "pieces"; "4 pieces".] 
 
 Figure 3.5 Refinement graph special features 
 
 
 [Figure 3.5(b), mixing PL/1 and English: graph diagram.] 
 
 [Figure 3.5(c), PROC node: graph diagram. Labels in the diagram: 
 TERM: PROC(T,Z) and END TERM;] 
 
 Figure 3.5 Refinement graph special features 
 
 
 ("expected" ERROR branches) correspond to bad approaches the problem 
 expert felt students were likely to attempt at that point, and the error 
 message can explain why that approach is not good. Untagged ("universal") 
 ERROR branches lead to error messages which simply suggest explicit actions 
 which are probably needed at that point. 
 
 A second special branch type was needed to handle the common 
 practice in top-down program development of intermixing partial programming 
 language statements with English descriptions of refinements. For instance, 
 once the need for a DO group has been established, the descriptions of 
 the actions to be done are surrounded by DO and END statements. Thus PATTIE 
 needed a way to both display PL/1 statements and accept refinements at a 
 single node. The solution (illustrated in Figure 3.5(b)) was simply to 
 allow branches to be tagged with PL/1 statements and marked so as to be 
 displayed as soon as the node is encountered, but not traversed. In 
 Figure 3.5(b), then, the DO is displayed as soon as the node is encountered, 
 the refinements are accepted as at any AND node, and the END is then 
 displayed. 
 
 The final special feature is the PROC node shown in Figure 3.5(c), 
 which allows invocation of a subroutine before it is programmed in detail. 
 When encountered in the refinement of one procedure, the PROC node acts 
 like a LEAF, but also causes the interaction control program to stack the 
 PROC as the root of the new procedure's subgraph, for later detailed 
 development. At that time, the branches leaving the PROC node are 
 displayed to provide the initial screen environment for programming the 
 new procedure. 
 
 To summarize, then, there are four types of nodes: ANDs, ORs, 
 LEAFs, and PROCs. A node may be tagged with a concept, associated with a 
 
 
 display message (for example, an error message), or not associated with 
 anything but the branches leaving it. There are three types of branches: 
 regular branches, which are tagged with a transition phrase and perhaps with 
 a concept; ERROR branches, which may or may not be associated with a 
 transition phrase; and immediate display branches, which are tagged with 
 PL/1 statements and are displayed as soon as they are encountered in the 
 traversal. 
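 
 The node and branch types summarized above can be pictured with a small 
 data-structure sketch. It is written in Python purely for illustration and 
 is not PATTIE's actual implementation; the class and field names (Node, 
 Branch, kind, phrase, message, and so on) are assumptions made for the 
 example, and the sample fragment is only loosely modeled on Figure 3.5(a). 

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class NodeType(Enum):
    AND = "AND"
    OR = "OR"
    LEAF = "LEAF"
    PROC = "PROC"


class BranchType(Enum):
    REGULAR = "regular"      # tagged with a transition phrase
    ERROR = "error"          # "expected" if tagged with a phrase, else "universal"
    IMMEDIATE = "immediate"  # PL/1 text displayed on arrival, never traversed


@dataclass
class Node:
    kind: NodeType
    concept: Optional[str] = None   # one of the 16 concepts, if tagged
    message: Optional[str] = None   # display message, e.g. an error message
    branches: List["Branch"] = field(default_factory=list)


@dataclass
class Branch:
    kind: BranchType
    target: Optional[Node] = None
    phrase: Optional[str] = None    # transition phrase or PL/1 statement
    concept: Optional[str] = None

    def is_universal_error(self) -> bool:
        """An untagged ERROR branch is a "universal" one."""
        return self.kind is BranchType.ERROR and self.phrase is None


# Hypothetical fragment: an OR node with one universal ERROR branch
# and one regular branch.
hint = Node(NodeType.LEAF, message="Maybe it will help to divide T into pieces?")
task = Node(NodeType.OR, concept="solve part of the problem")
task.branches.append(Branch(BranchType.ERROR, target=hint))
task.branches.append(
    Branch(BranchType.REGULAR, target=Node(NodeType.AND),
           phrase="divide T into separate pieces"))
```

 Keeping the concept tag optional on both nodes and branches mirrors the 
 statement that tagging is encouraged but not required. 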
 
 3.3.1.3 Advantages 
 
 There are several advantages of representing knowledge of the 
 solution as an AND-OR graph. For one thing, it is compact. PATTIE's 
 refinement graph of 150 nodes requires only 226 60-bit words for storage, 
 an average of less than 1.8 words per node. Error messages and displayable 
 transition phrases require another 700 words of memory, but the overall 
 average is still only 6.4 words per node. 
 
 Another advantage is that it allows separation of specific 
 problem knowledge and knowledge of how to interact with students, which 
 reduces the difficulty of adding new problems to PATTIE's repertoire. 
 For someone who understands and practices top-down programming, developing 
 a refinement graph is not a difficult matter. Essentially, it requires 
 solving the problem and maintaining a trace of the refinements and decisions 
 made. Reexamining that trace for decision points where other approaches 
 are equally valid identifies OR nodes, and additional subgraphs leading 
 from these nodes must be developed. Finally, ERROR branches (and perhaps 
 additional portions of the graph) can be added as a result of observing 
 students using the initial graph. Entering or modifying a graph in PATTIE's 
 data base is a simple (if somewhat time consuming) process using an editor 
 developed for the purpose. 
 
 
 A third advantage is that a single representation can provide 
 knowledge of the student's solution process as well as acceptable 
 solution programs. The solution process is described by a path through 
 the graph, and knowledge of the student's current position on such a 
 path simplifies PATTIE's control program. If the student gets stuck 
 at any point in the refinement process, PATTIE can immediately extend 
 his solution by simply traversing branches leaving the node representing 
 the current task. When a task is ready to be translated into the 
 programming language, a branch leaving the current node will be tagged 
 with a PL/1 statement which performs the required action. If the student 
 is truly lost, PATTIE can let him back up in the graph to a previous 
 decision point and try a different approach. And any acceptable solution 
 program is represented by some subset of the LEAF nodes of the solution 
 graph. 
 
 3.3.2 Natural Language Understanding Features 
 
 Any system which conducts a natural language dialog with a 
 person must be concerned with methods of understanding natural language 
 inputs. In Chapter 4 we discuss aspects of language understanding which 
 determine which technique is best suited to a system such as PATTIE. 
 However, natural language processing was not of prime interest in this 
 thesis, beyond the need to provide PATTIE with enough understanding to 
 allow for fairly natural description of refinements. The PLATO IV author 
 language, TUTOR, provides a facility for natural language understanding 
 (Tenczar and Golden (1972)) which PATTIE uses and which has proven adequate 
 for her needs. 
 
 Essentially, this facility is a keyword recognition, pattern 
 matching scheme. An author specifies a vocabulary, consisting of a number 
 
 
 of disjoint classes of synonymous words (groups of "content" words) and 
 a list of words which are allowed in a student's inputs but which carry 
 no meaning ("ignorable" words). Elements of a synonym class may be 
 single words or "phrases," which are a series of two or more words which 
 must appear contiguously. Phrases provide a simple means of handling 
 common idioms, and may consist of ignorable words, content words appearing 
 elsewhere in the vocabulary, or completely new words. 
 
 The purpose of a language understanding system is to assign a 
 meaning to the typed inputs. TUTOR's facility attempts to assign a meaning 
 by matching the input against a series of stored patterns. Each pattern 
 consists of representatives from one or more of the classes of synonymous 
 words in the vocabulary. Since there are usually many ways of expressing 
 an idea in natural language, it is frequently necessary to attach more than 
 one keyword pattern to a single "meaning list." For example, since keyword 
 order and number of keywords are important in a pattern match, if it was 
 desired to assign the same meaning to the inputs "a brown cat," "a cat 
 that is brown," and "a cat" (assuming "cat" and "brown" are content words 
 and other words are ignorable), the meaning list must include the patterns 
 "brown cat," "cat brown," and "cat." 
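 
 The "brown cat" example can be worked through in a few lines of code. The 
 following Python sketch only imitates the idea of keyword extraction 
 followed by exact pattern matching; it is not the TUTOR facility itself. 
 The names IGNORABLE, SYNONYMS, and MEANING_LISTS, the meaning label 
 "brown-cat", and the extra synonyms "kitty" and "tan" are all assumptions 
 made for illustration, and multi-word phrases are left out. 

```python
from typing import Dict, List, Optional, Tuple

IGNORABLE = {"a", "that", "is"}   # words allowed but carrying no meaning
SYNONYMS = {                      # each content word -> its synonym class
    "cat": "cat", "kitty": "cat",
    "brown": "brown", "tan": "brown",
}
# One meaning list: several keyword patterns that share a meaning.
MEANING_LISTS: Dict[str, List[Tuple[str, ...]]] = {
    "brown-cat": [("brown", "cat"), ("cat", "brown"), ("cat",)],
}


def keywords(text: str) -> Optional[List[str]]:
    """Reduce an input to its sequence of synonym-class names.

    Returns None if some word is neither ignorable nor in the vocabulary.
    """
    result = []
    for word in text.lower().split():
        if word in IGNORABLE:
            continue
        if word not in SYNONYMS:
            return None
        result.append(SYNONYMS[word])
    return result


def match(text: str) -> Optional[str]:
    """Return the first meaning with a pattern equal to the keyword sequence."""
    keys = keywords(text)
    if keys is None:
        return None
    for meaning, patterns in MEANING_LISTS.items():
        if tuple(keys) in patterns:
            return meaning
    return None
```

 Because both the order and the number of keywords matter, the input 
 "brown" alone matches nothing here: no stored pattern consists of just 
 that one keyword. 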
 
 Figure 3.6 illustrates this with a small portion of PATTIE's 
 vocabulary and a few sample meaning lists. PATTIE's vocabulary contains 
 484 root words and phrases, and over 700 words if different endings are 
 considered. There are 61 separate meaning lists containing a total of 
 452 patterns. 
 
 Analyzing a student input is then a straightforward process, 
 performed by the PLATO IV system. Keywords are extracted, the list of 
 patterns searched, and the first matched pattern (if any) is identified. 
 
 
 vocabulary 
 
 * ignorable words 
 <a, an, and, another, by, from, in, is, new, of, set, the, to> 
 
 * content words 
 (allocate, create, DCL, declare, dcl, define) 
 (if, IF, if*then, IF*THEN, if*then*else, IF*THEN*ELSE) 
 (variable, var, temporary, storage) 
 (asterisk, multiplication*sign, star) 
 (asterisks, double*star, exponentiation*sign) 
 (coefficient, coef, coefficient*string) 
 (equal, equals, equivalent, same*as) 
 (exponent, exp, exponent*of*z, exp*of*z) 
 (expression, string, term, T, character*string*T) 
 (decrease, decrement, deduct, reduce, subtract) 
 (delete, eliminate, get*rid*of, remove, separate, split) 
 (find*product, multiply, product, times, take*product) 
 (one, 1) 
 
 * meaning lists 
 meaning dcl 
         dcl var 
 meaning coef term 
         remove coef 
         remove coef term 
 meaning multiply coef exp 
         multiply coef times exp 
         multiply exp coef 
         multiply exp times coef 
         coef equal coef times exp 
         coef equal exp times coef 
         coef product coef exp 
         coef product coef times exp 
         coef equal product coef exp 
         coef equal product coef times exp 
         coef product exp coef 
         coef product exp times coef 
         coef equal product exp coef 
         coef equal product exp times coef 
 meaning reduce exp 
         subtract 1 exp 
         reduce exp 1 
 
 Figure 3.6 Sample vocabulary and meaning lists 
 
 
 In cases where there is no match found, the student answer is marked up 
 to indicate broken phrases, words not in the vocabulary, and suspected 
 misspellings, and the input must be retyped. 
 
 This understanding system is incorporated into PATTIE as follows. 
 The problem expert develops a vocabulary containing words he anticipates 
 students will use in the dialog. The transition phrases associated with 
 branches in the graph (the refinements) are determined, patterns are 
 established to accept each phrase or its equivalent (synonymous equivalents 
 as a result of the vocabulary, other equivalents through multiple patterns 
 for the same meaning), and the number of that meaning list is attached to 
 the corresponding branch in the graph. Vocabulary and patterns can be 
 improved by iteratively examining protocols of student interaction with 
 the system and modifying both patterns and vocabulary to correct 
 misunderstandings. 
 
 PATTIE's actual implementation of these features uses only a 
 single vocabulary for the entire problem domain, but a separate set of 
 meaning lists for each procedure required in the solution program, similar 
 to an approach taken in later versions of ELIZA (Weizenbaum (1967)). This 
 reduces the difficulty of insuring that two patterns associated with 
 separate meaning lists don't match the same input. 
 
 One of the biggest drawbacks of a synonym-class approach such as 
 this is that a word can have several different meanings in different 
 contexts. Since no word can be in two classes (except as part of a 
 phrase), classes which contain the same words must be coalesced. This 
 introduces a certain amount of ambiguity into the meaning attached to some 
 inputs. Fortunately, a node in the refinement graph provides a well-defined 
 context which helps reduce this ambiguity caused by merged classes. The most 
 
 
 likely student inputs at a node are exactly those which correspond to 
 transition phrases tagged to branches leaving that node, or nodes 
 slightly lower in the graph. Therefore, if an input matches one of these 
 branches, there is a high probability that the intended meanings are the 
 same. 
 
 As might be expected, a difficult part of adding a problem to 
 PATTIE's repertoire is developing the vocabulary and meaning lists needed 
 to understand student inputs. As mentioned above, this is a trial and 
 error process involving real students, although the effort involved may 
 be decreased depending on how closely related the new problem is to the old. 
 For example, about 27% of PATTIE's present vocabulary is concerned with 
 differentiation, another 23% with programming and PL/1, and another 11% 
 is concerned with polynomials. 
 
 3.3.3 The Student Model 
 
 One of the prime arguments which proponents of CAI make for 
 using computers in education is that CAI systems can adapt themselves to 
 provide instruction tailored to each individual student. Such 
 individualized instruction, however, is seldom realized in practice. 
 PATTIE makes no claim to provide individualized tutoring for every 
 student, but she does make an attempt to adapt her presentation based on 
 the student's past performance. This adaptation is controlled by a simple 
 student model which is based on the programming and problem solving concepts 
 discussed in Section 3.3.1.2 and listed in Figure 3.4. 
 
 The typical student model employed by CAI systems is a vector 
 representing the student's performance in some fixed number of areas, which 
 are commonly the system's educational objectives. The student is presented 
 
 
 information or asked questions in one particular area until his performance 
 exceeds some threshold value, at which point the next topic is introduced. 
 
 Only a few attempts have been made to develop and use a strong, 
 effective model of the student. One of these is that of Kimball, in his 
 tutor for methods of integration. His model is probabilistic, holding 
 that an attempted solution can be modeled as a sequence of problem states, 
 and that the student can be modeled by a set of transition probabilities 
 from one state to any other. The possible problem states are dependent on 
 the subject being tutored, and the proper identification of these states 
 is a critical step in the development of such a tutor. 
 
 PATTIE's educational goal is to expose the student to a dynamic 
 example of program development by successive refinement, and thus show 
 him how he can use such a process in developing any program. The degree 
 to which this is achieved is not easily measured by a threshold level. 
 However, PATTIE's entire refinement graph may be viewed as a sequence of 
 problem states in the sense used by Kimball. Each node is a state, 
 characterized by the concepts attached to it and to the branches which 
 leave it. Probabilities are established for each concept, and in turn the 
 concepts characterizing a node provide a probability of the student 
 correctly describing the refinements required at that node. 
 
 Say that a student correctly "employs" a concept whenever he 
 
 (1) correctly describes all refinements needed at a 
 node tagged with that concept, 
 or (2) correctly describes a single refinement associated 
 with a branch tagged with that concept. 
 The heart of the student model is then a pair of counters for each of the 
 16 concepts, one counting the number of times the student has correctly 
 
 
 employed that concept, the other counting the number of opportunities he 
 had to do so. This establishes a probability of correct employment 
 for each concept. As will be described in Section 3.4.3, the interaction 
 control routine can use this correct employment probability to suggest 
 which branches the student should follow at choice points, or to decide 
 to simply display the necessary refinements at a node instead of asking 
 for them. 
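 
 The pair of counters per concept might be sketched as follows. This is an 
 illustrative Python sketch, not PATTIE's code: the class and method names 
 are hypothetical, taking the ratio of the two counters is the natural 
 reading of "probability of correct employment" rather than a formula 
 given in the text, and returning None for a concept not yet encountered 
 is likewise an assumption. 

```python
from typing import Dict, Optional


class StudentModel:
    """Two counters per concept: correct employments and opportunities."""

    def __init__(self) -> None:
        self.correct: Dict[str, int] = {}
        self.opportunities: Dict[str, int] = {}

    def record(self, concept: str, employed_correctly: bool) -> None:
        """Count one opportunity, and one success if the concept was employed."""
        self.opportunities[concept] = self.opportunities.get(concept, 0) + 1
        if employed_correctly:
            self.correct[concept] = self.correct.get(concept, 0) + 1

    def probability(self, concept: str) -> Optional[float]:
        """Estimated probability of correct employment, or None if no data."""
        seen = self.opportunities.get(concept, 0)
        if seen == 0:
            return None
        return self.correct.get(concept, 0) / seen


model = StudentModel()
for outcome in (True, True, False):
    model.record("loops", outcome)
```

 After two successes in three opportunities the estimate for "loops" is 
 2/3, a value a control routine could compare against whatever cutoff it 
 uses when deciding whether to ask for refinements or simply display them. 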
 
 There are two other components of the model beyond the correct 
 employment probabilities described above. PATTIE keeps track of the 
 average number of levels in the graph spanned by each student input, and 
 uses this information to control how far down in the graph to search in 
 trying to match a student input (see Section 3.4.3.1). She also keeps 
 track of the average number of refinement attempts made for all nodes, 
 and uses this information in the interaction control routine as well. 
 
 3.4 How PATTIE Behaves — The Interaction 
 
 As has been mentioned, PATTIE's basic model of the problem 
 solving process is the method of stepwise refinement, similar to that of 
 Wirth (1971). In relation to the AND-OR graph, the successive refinement 
 process corresponds to tracing a path through the graph from a root to 
 some subset of the LEAF nodes which represent a solution program. The 
 exact path taken is determined by suggested refinements input by the 
 student. At any given time, the student is actively refining only one 
 task, the "current task." This current task is represented in the 
 refinement graph by a single node, which PATTIE considers the "current 
 node." PATTIE determines the correctness of student-suggested refinements 
 by matching them against branches in the graph at and below the current 
 node. Once the student has described all the actions needed to refine the 
 
 
 current task, a new current task is selected by simply traversing one of 
 the described branches from the current node to a new current node. The 
 order in which these branches are traversed is determined by the control 
 program using a depth-first traversal algorithm. 
 
 It is interesting to note that, besides being easy to implement 
 as a control algorithm, there is some evidence that such a depth-first 
 design procedure is used in the development of actual software packages. 
 Ells and Freeman (1973) reviewed (after the fact) the logical and chronological 
 relationships in the design decisions made during development of three 
 separate BASIC translators. Figure 3.7 shows design trees they developed 
 for each translator by ordering the design decisions so that no decision was 
 logically dependent on other decisions beneath it in the tree. These 
 structures are clearly AND-OR trees with only one branch from any OR node 
 fully developed. The order in which the decisions were made is indicated 
 by the numbers at each node of the tree. As Figure 3.7 shows, this numbering 
 is clearly depth-first in two of the trees, somewhat less so in the other. 
 It is also interesting that these trees contain more OR nodes than ANDs, while 
 PATTIE's solution graph shows the opposite mixture. Possibly this is because 
 the design trees of Figure 3.7 reflect higher level decisions of large 
 projects, while PATTIE's graph indicates detailed decisions of a considerably 
 smaller program. 
 
 The remaining subsections of this chapter describe how PATTIE 
 handles interaction with the student. The reader may find it helpful to 
 refer back to the sample dialog of Section 1.3 for examples of use of 
 various features. 
 
 
 
 Figure 3.7 Design trees (from Ells and Freeman (1973)) 
 
 
 3.4.1 What the Student Sees — The Screen Display 
 
 The only medium through which PATTIE may talk to the student is 
 the display screen of the PLATO IV terminal, a 64 character by 32 line 
 plasma panel. Within this area PATTIE must conduct a dialog and maintain 
 a copy of the student's developing solution. Figure 3.8 is a copy of the 
 screen as the student sees it. The upper 20 lines are the "program area" 
 and contain the developing solution program. The lower part of the screen is 
 the "scratchpad," the area where the dialog is conducted. 
 
 On the left-hand side of the program area are a series of "task 
 names" indicating the relationship of each task to others in the solution, 
 exactly as the relationships between sections of this thesis are described 
 by the section numbers. Task names are assigned when the refinement task 
 is initially described, based on the task name of the current node and the 
 current node type. Each refinement at an AND node receives a task name 
 composed of the task name of the AND and a suffix indicating the number of 
 the branch matched by the refinement. At OR nodes, on the other hand, the 
 task name of the refinement is simply that of the OR itself, since only one 
 branch leaving the node is ever traversed. 
 
 Traditionally, indentation has been used to indicate the relationship 
 between various tasks described in the successive refinement process (Gries 
 (1974)). On a screen with limited horizontal and vertical extent, use of 
 task names performs the same function, saves room on the screen, and provides 
 a handy way to refer to specific tasks and describe their interrelationships. 
 Task names also describe the order of refinement of the tasks (and hence of 
 traversal of the refinement graph), namely in increasing order of the task 
 names. 
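 
 The task-naming rule described above is simple enough to state as code. 
 The Python sketch below is illustrative only; the function names are 
 hypothetical, and the treatment of an unnamed root (its refinements are 
 named "1", "2", and so on) is an assumption, since the text does not say 
 how the top-level task is named. 

```python
from typing import Tuple


def refinement_task_name(parent_name: str, node_type: str, branch_number: int) -> str:
    """Name a refinement accepted at the current node.

    At an AND node the name is the node's own name plus the number of the
    matched branch; at an OR node the refinement keeps the node's name.
    """
    if node_type == "OR":
        return parent_name
    if node_type == "AND":
        if not parent_name:            # assumed convention for an unnamed root
            return str(branch_number)
        return f"{parent_name}.{branch_number}"
    raise ValueError(f"refinements are accepted only at AND or OR nodes, not {node_type}")


def refinement_order(task_name: str) -> Tuple[int, ...]:
    """Sort key: tasks are refined in increasing order of task name."""
    return tuple(int(part) for part in task_name.split("."))
```

 For example, the third refinement at an AND node named "2" is named 
 "2.3", while a refinement at an OR node named "2" is still task "2". 
 Sorting names by refinement_order places "1.2" before "2" and "2" before 
 "2.3". 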
 
 Note also that the limited extent of the program area forces a 
 great degree of modularity in the structure of the solution program. The 
 
 
          DIFTERM: PROC(T,Z); 
          DCL(T,Z,BFOR) CHAR; 
 1.1      DCL (NDXZ,LSTAR) FIXED; 
          NDXZ = INDEX(T,Z); 
 1.2      IF NDXZ = 0 
          THEN RETURN(''); 
 2.1.2    BFOR = SUBSTR(T,NDXZ+2); 
 2.2.2    LSTAR = INDEX(SUBSTR(T,NDXZ+3),'*'); 
 
 2.3      separate the exp from the rest of the string 
 
 2.3.1    IF LSTAR = 0 
 2.3.2    THEN exp is the rest of the string 
          ELSE 
 2.4      simplify the parts and return the answer 
          END DIFTERM; 
 
 now refining task 2.3 
 
 What else must be done to refine the current task? 
 
 HELP now available if wanted 
 
 Figure 3.8 The student's screen display 
 
 
 
 longest any procedure can be is 20 lines of code, and though PATTIE 
 practices various space-saving techniques (such as merging the 
 declaration statement of a new variable with previously declared 
 variables of the same type), good modularity is still emphasized. 
 
 Figure 3.8 shows three distinct subareas within the program 
 area. At the very top are PL/1 statements, corresponding to refinements 
 which had been described well enough to be translated and displayed as 
 code by PATTIE. Immediately below these is the English description of the 
 current task. Finally, at the very bottom of the program area are other 
 tasks awaiting further refinement. This is essentially a stack of 
 refinements described at AND nodes during solution development, but whose 
 corresponding branches have not yet been traversed by the control program. 
 Note the mixture of PL/1 and English in task descriptions on the program area 
 stack of Figure 3.8. 
 
 The scratchpad area is where PATTIE accepts student inputs, 
 displays hints, or reveals the anticipated refinements once available hints 
 have been exhausted. This interaction goes on (solely in the scratchpad 
 area) until a correct refinement for the current node is input by the student. 
 At this point, the refinement task description is moved to the program area, 
 the exact location depending on the current node type, as will be described 
 in the next section. 
 
 It should be noted that the refinement descriptions moved to the 
 program area are not identically those input by the student. Rather, they 
 are paraphrases supplied by PATTIE, a display of the transition phrase 
 associated with the branch matched by the student input. This paraphrasing 
 action was included as a result of experience with student use of the system, 
 and has two objectives. First, because of the need to coalesce vocabulary 
 
 
 word classes as discussed in Section 3.3.2, there is a possibility of 
 erroneous interpretation of the student's input by PATTIE's language 
 understanding facility. The paraphrasing action allows the student to 
 compare what he typed to what PATTIE displayed, and take any differences 
 in intent into consideration in his subsequent refinements. Also, this 
 paraphrasing provides the student a chance to learn what types of inputs 
 PATTIE expects, and modify his refinement descriptions to achieve better 
 understanding. 
 
 3.4.2 Interaction — The Ideal Case 
 
 This section and the next discuss how PATTIE uses her data bases 
 during interaction with the student. Figure 3.9 presents a flow diagram 
 of the interaction control program. The behavior of this control program 
 depends on the current node type. To simplify the discussion somewhat, 
 let's initially assume that the student's inputs correspond exactly to the 
 refinements expected at the current node. 
 
 If that current node is an OR, the student only needs to input 
 a single refinement which matches one of the branches leaving the node, and 
 the control program's actions are correspondingly simple. It first must 
 determine which branch leaving the node is matched by the student input, 
 by comparing the number of the meaning list matched by the input to the 
 number attached to each branch. Next, it must update the student model 
 based on the input. If either the current node or the matched branch is 
 tagged with a concept, the correct employment and number of opportunity 
 counters are both increased by one. This increases PATTIE's estimate of 
 the probability of the student correctly employing this concept the next 
 time it is encountered. At the same time, the information on the number of 
 levels and average tries per node is updated. Finally, PATTIE traverses 
 
 
 Figure 3.9 Flow diagram of interaction control routine 
 
 
 the matched branch to a new current node, and awaits student refinement 
 of the corresponding task. 
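 
 The ideal-case steps at an OR node (find the matched branch, update the 
 student model, traverse) can be sketched as below. This Python fragment 
 is an illustration under stated assumptions, not the control program 
 itself: the Branch record, the function name, and the plain dictionaries 
 standing in for the two counters are all hypothetical, and a concept 
 attached to both the node and the branch is counted once here, a detail 
 the text does not settle. The bookkeeping for levels spanned and tries 
 per node is omitted. 

```python
from typing import Dict, List, NamedTuple, Optional


class Branch(NamedTuple):
    meaning: int              # number of the meaning list attached to the branch
    concept: Optional[str]    # concept tag, if any
    target: str               # name of the node the branch leads to


def accept_at_or_node(
    node_concept: Optional[str],
    branches: List[Branch],
    matched_meaning: int,
    correct: Dict[str, int],
    opportunities: Dict[str, int],
) -> Optional[str]:
    """Return the new current node, or None if no branch matches the input."""
    for branch in branches:
        if branch.meaning == matched_meaning:
            # Credit every concept tagged on the node or the matched branch.
            for concept in {node_concept, branch.concept} - {None}:
                correct[concept] = correct.get(concept, 0) + 1
                opportunities[concept] = opportunities.get(concept, 0) + 1
            return branch.target
    return None


correct: Dict[str, int] = {}
opportunities: Dict[str, int] = {}
branches = [Branch(12, "loops", "n1"), Branch(17, None, "n2")]
new_node = accept_at_or_node(
    "find several distinct subcases", branches, 12, correct, opportunities)
```

 Comparing meaning-list numbers rather than text keeps the graph 
 independent of the particular wording the student used. 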
 
 Figure 3.10 shows how the screen display reflects the action of 
 the control program. Figure 3.10(a) shows the screen when the suggested 
 refinement is accepted, and Figure 3.10(b) shows the screen after traversal 
 to the new current node. Notice PATTIE's use of a paraphrase of the student 
 input, and how that paraphrase immediately replaces the current task 
 description on the screen. Also note that the task name of the current 
 task has not been changed as a result of accepting this refinement. 
 
 If, on the other hand, the current node is an AND, the action 
 of the control routine is a little more complicated. AND nodes correspond 
 to points in the solution process where refining a task requires describing 
 several subtasks whose actions together equal the action of the current 
 task. To insure that the student's understanding of the developing solution 
 is always correct, PATTIE requires that refinements corresponding to every 
 branch leaving an AND node be input before the traversal proceeds past that 
 node (as was indicated in the sample dialog of Chapter 1). As each branch 
 is described, the correct employment probability for its attached concept 
 (if any) is increased as described above. Once all the branches leaving 
 the node have been so described, the correct employment probability for any 
 concept attached to the node itself and the average number of attempts at 
 nodes are updated. Then the traversal continues by placing the current 
 node on a stack of active AND nodes and the first (that is, the leftmost) 
 branch is followed to a new current node. An AND node is considered "active" 
 whenever all the refinements required by that node have been described by 
 the student, but at least one branch leaving the node has not been traversed. 
 
 Figure 3.11 shows how the screen reflects the control program 
 action at an AND node. Figure 3.11(a) depicts the screen as the last 
 
 
          DIFTERM: PROC(T,Z); 
          DCL(T,Z) CHAR; 
 1.1      DCL NDXZ FIXED; 
          NDXZ = INDEX(T,Z); 
 1.2      IF NDXZ = 0 
          THEN RETURN(''); 
 2        consider the general case 
 
          END DIFTERM; 
 
 now refining task 2 
 
 Tell me how you intend to proceed 
 
 divide T before and after the exponent 
 ok 
 
 press -NEXT- 
 
 during acceptance 
 (a) 
 
 Figure 3.10 Screen dynamics at an OR node 
 
 
 DIFTERM: PROC(T,Z) ; 
 DCL(T,Z) CHAR; 
 
 1.1 DCL NDXZ FIXED; 
 NDXZ =INDEX(T,Z) ; 
 
 1.2 IF NDXZ =0 
 
 THEN RETURN ( ' ' ) ; 
 2 divide T into 3 parts 
 
 END DIFTERM; 
 
 now refining task 2 
 
 Tell me how you intend to proceed 
 
 HELP now available if wanted 
 
 after traversal 
 (b) 
 
 Figure 3.10 Screen dynamics at an OR node 
 
 
 1.1.1 
 1.1.2 
 1.2 
 
 TERM: PROC(T,Z); 
 DCL(T,Z) CHAR; 
 DCL NDXZ FIXED; 
 NDXZ = INDEX(T,Z); 
 IF NDXZ = 0 
 THEN RETURN(''); 
 divide T into 3 pi 
 
 2.3 
 2.4 
 
 save the string in front of the exp 
 
 separate the exp from the rest of the string 
 simplify the parts and return the answer 
 END TERM; 
 
 now refining task 
 
 What else must be done to refine the current task? 
 
 check if z is the 2nd variable 
 ok 
 
 Do you want to give this task an identifying letter? 
 
 during acceptance 
 (a) 
 
 Figure 3.11 Screen dynamics at an AND node 
 
 
 1.1.1 
 1.1.2 
 1.2 
 
 2.1 
 
 TERM: PROC(T,Z); 
 DCL(T,Z) CHAR; 
 DCL NDXZ FIXED; 
 NDXZ = INDEX(T,Z); 
 IF NDXZ = 0 
 THEN RETURN(''); 
 
 save the string in front of the exp 
 
 2.4 
 
 see if there is a * after Z 
 separate the exp from the rest of the string 
 simplify the parts and return the answer 
 END TERM; 
 
 now refining task 2.1 
 
 Tell me how you intend to proceed 
 
 after traversal 
 (b) 
 
 Figure 3.11 Screen dynamics at an AND node 
 
 
 required refinement is accepted. As each refinement is accepted, its 
 description is moved to the program area stack. In this case, there were 
 four branches leaving the current node. Notice that the refinements are 
 placed on the stack in the order in which the branches leave the node 
 (left to right) and blank lines are left in the stack when the student 
 enters refinements out of this order. This placement aids the student by 
 indicating the relation of missing refinements to those he has already 
 described. Once all refinements have been described, the top task on the 
 program area stack is simply moved up to become the current task 
 (Figure 3.11(b)), corresponding to traversal of the leftmost branch to 
 a new current node, and the refinement process continues. 
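 
 The placement rule for refinements at an AND node (slots in branch order, 
 with blanks for refinements not yet described) can be sketched as 
 follows. The Python class below is a hypothetical illustration, not the 
 screen management routine; the class and method names are assumptions, 
 and the four phrases are the task descriptions visible in Figure 3.11. 

```python
from typing import List


class AndNodeSlots:
    """Tracks which branches of an AND node the student has described."""

    def __init__(self, phrases: List[str]) -> None:
        self.phrases = phrases                  # transition phrases, left to right
        self.described = [False] * len(phrases)

    def accept(self, branch_index: int) -> None:
        """Mark the branch matched by a student refinement as described."""
        self.described[branch_index] = True

    def complete(self) -> bool:
        """Traversal may proceed only once every branch has been described."""
        return all(self.described)

    def stack_lines(self) -> List[str]:
        """Stack contents in branch order, with blank lines for missing items."""
        return [p if d else "" for p, d in zip(self.phrases, self.described)]


slots = AndNodeSlots([
    "save the string in front of the exp",
    "see if there is a * after Z",
    "separate the exp from the rest of the string",
    "simplify the parts and return the answer",
])
for index in (0, 2, 3):     # the student skips the second refinement
    slots.accept(index)
```

 With the second refinement still missing, stack_lines() returns a blank 
 entry in the second position and complete() is false, so the control 
 program would keep asking what else must be done. 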
 
 If a traversal from the current node results in a LEAF being 
 selected as the new current node, the top node is popped from the active 
 AND node stack, the leftmost untraversed branch from the popped node is 
 traversed to a new current node, and the popped node is restacked. In the 
 sample dialog of Chapter 1, those points where PATTIE announced a task was 
 completed and selected a new task for further refinement correspond to 
 points where this popping action occurred. If all branches leaving the 
 popped node have been traversed, the stack is simply popped again. The 
 effect on the screen display of popping the active AND stack is to just 
 move the task at the top of the program area stack up to become the new 
 current task. Thus the student's screen always shows a partially developed 
 solution which is an exact reflection of the status of the depth-first 
 traversal of the solution graph. The refinements stacked at the bottom 
 of the program area provide a context in which the student can devise his 
 refinements for the current task, but they needn't yet be considered in 
 detail. 
 
 If a PROC node becomes the new current node, the effect is the
 same as for a LEAF, with the addition of stacking the PROC node on an
 active PROC stack for later consideration. Encountering a LEAF when the
 active AND stack is empty means the current procedure has been completely
 programmed, and the root node for a new procedure is popped off the active
 PROC stack. If that stack is also empty, the whole problem has been
 completed.
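
 The interplay of the two stacks can be sketched as follows. This is a
 minimal Python sketch under stated assumptions, not the TUTOR implementation:
 the Node class, its body field (the root of a procedure's own graph), and
 the choose callback standing in for the student's choice at an OR node are
 all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    kind: str                                   # "AND", "OR", "LEAF" or "PROC"
    children: list = field(default_factory=list)
    body: Optional["Node"] = None               # PROC only: root of its graph


def traverse(root, choose=lambda node: 0):
    """Return node names in the order they become the current node."""
    visited = []
    and_stack = []    # entries: [AND node, index of next untraversed branch]
    proc_stack = []   # PROC nodes awaiting detailed design
    current = root
    while current is not None:
        visited.append(current.name)
        if current.kind == "AND":
            and_stack.append([current, 1])
            current = current.children[0]        # leftmost branch first
        elif current.kind == "OR":
            current = current.children[choose(current)]
        else:                                    # LEAF or PROC
            if current.kind == "PROC":
                proc_stack.append(current)
            current = None
            # Pop until some AND node still has an untraversed branch.
            while and_stack and current is None:
                node, nxt = and_stack.pop()
                if nxt < len(node.children):
                    and_stack.append([node, nxt + 1])   # restack the node
                    current = node.children[nxt]
            # AND stack empty: this procedure is finished, start the next.
            while current is None and proc_stack:
                current = proc_stack.pop().body
    return visited
```

 The loop ends when both stacks are empty, which corresponds to the whole
 problem having been completed.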
 
 Note that the left to right, depth-first traversal of the 
 solution graph effectively places an order on the branches. Thus, each 
 AND node in PATTIE's graph is what might be called an "ordered" AND node, 
 one where the sequence in which the branches leaving it are traversed is 
 important. There are many instances in any computer program when the order 
 in which actions are performed is vital. But there also exist instances 
 when the order in which actions are performed is immaterial: certain 
 assignment statements, for example, or the order of execution of the clauses 
 in an IF-THEN-ELSE construct, as long as the condition is properly negated. 
 The use of such "unordered" ANDs in the solution graph would allow the 
 student to choose the order in which tasks are further refined at those 
 nodes, a situation more closely resembling actual program development than 
 the rigid order PATTIE enforces. The cost of this extension would be more 
 complicated dialog control and screen management routines, and perhaps the 
 need to modify the solution graph based on the student's input. These 
 problems precluded the implementation of such a feature in PATTIE, but it 
 remains an interesting extension possibility. PATTIE presently allows the 
 student to control the order of clauses at an IF statement by simply splitting 
 off two subgraphs from an OR node (Figure 3.12). This is effective, but 
 does increase the size of the graph. 
 
 Figure 3.12 Ordering clauses in an IF statement 
 
 One last point, before we look at what happens when the student 
 makes a mistake. The developers of several of the problem solving tutors 
 discussed in Chapter 2 (Kimball, Goldberg, Brown and Burton) emphasized 
 the importance of allowing the student to concentrate on the problem 
 solving method, rather than being bogged down in a myriad of detail. 
 PATTIE was designed according to this same philosophy. For example, to 
 eliminate the need for the student to worry about exact syntax of 
 language features, PATTIE displays the PL/1 statements accomplishing a 
 given task as soon as the task has been sufficiently defined. In practice 
 this means that any branch tagged with a PL/1 statement is never traversed, 
 but instead the PL/1 statement is displayed and traversal continues with 
 the next branch in order. 
 
 3.4.3 Handling Errors 
 
 Of course, not all student inputs are going to be correct, so 
 PATTIE must contain provisions for handling and correcting student errors. 
 Some of these provisions are contained directly in the refinement graph, 
 others are in the control structure of the dialog routines. The basic 
 principle is always immediate correction of any errors found. While there 
 is undoubtedly a great deal of educational value in letting a student make 
 mistakes and follow a wrong path until he discovers the error himself, students 
 get a great deal of such experience on their own in solving homework 
 assignments and machine problems. There is also a great deal of educational 
 value in correcting a student's errors immediately, when the reasons he made 
 the wrong decision are fresh in his mind and can most easily be related to 
 the correct decision. It was this approach which was implemented in PATTIE's
 control structure as a direct counterpart to the unaided problem solving 
 situation. 
 
 In addition to the error correction and hint facilities discussed 
 in this section, PATTIE also provides several traditional CAI lessons 
 which explain differentiation rules, describe the problem, give suggestions 
 for finding refinements, etc. The student can access these from several 
 points of the refinement dialog and then return to his solution where he 
 left off. 
 
 3.4.3.1 Lookahead 
 
 The first problem in dealing with errors is to determine whether
 or not an error has occurred. Certainly, it is somewhat unlikely that
 successive student inputs will follow exactly the sequence expected by
 the solution graph. At any given node, some of the meaningful (correct) 
 student inputs will match branches leaving that node, some will coalesce 
 several levels of the graph into a single input, and some will match single 
 branches at a greater depth into the graph. 
 
 As a simple example, it is possible that the concept expressed by 
 the graph in Figure 3.13(a) could equally well be expressed as in Figure 3.13(b), 
 without the intermediate transition represented by B. To cover both 
 possibilities, the graph should actually look like Figure 3.13(c). This 
 increases the total number of branches leaving the upper node, and the problem 
 is greatly compounded when additional levels are considered. 
 
 Inclusion of all such possible student inputs in the solution graph 
 would cause a combinatorial increase in the number of branches leaving each 
 node, and make development of the graph a much more difficult task than it 
 currently is. On the other hand, a sophisticated natural language
 understanding algorithm might be able to determine the correctness of student
 inputs spanning multiple levels, using inferential techniques, at the expense 
 
 [Figure 3.13 Need for additional branches: panels (a), (b), and (c)]
 
 of additional CPU usage, slower response times, and expanded knowledge 
 requirements. Instead of taking either approach, we have chosen to stick 
 with the simple (but fast) understanding system available in TUTOR, and 
 include in the solution graph only the most likely student inputs, based 
 on observation of students solving the problem. If an input doesn't match 
 any of the branches leaving the current node, PATTIE performs a depth-first 
 search from that point, attempting to match the input to a branch further 
 down the graph. The depth of this search is determined by the student model 
 information (i.e., depth is a constant multiple of the average number of 
 nodes spanned by previous inputs). There is a slight problem with this
 "lookahead" approach also, in that the context provided by the graph at 
 the lower level is not exactly the same as that provided by the current 
 node, and there is thus a slightly greater chance of misunderstanding the 
 input. In practice, however, this has not proven to be a significant 
 problem. 
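
 The lookahead search can be sketched as a depth-limited, depth-first walk
 over the branches below the current node. The Python below is a sketch
 under assumptions: the graph is a dictionary from node to a list of
 (transition phrase, child) pairs, the keyword-subset test in matches is a
 stand-in for TUTOR's matching, and the multiple of 2.0 in search_depth is an
 arbitrary illustrative constant.

```python
def search_depth(avg_nodes_spanned, multiple=2.0):
    # Depth is a constant multiple of the average number of nodes spanned
    # by previous inputs; the multiple used here is only illustrative.
    return max(1, round(multiple * avg_nodes_spanned))


def matches(phrase, student_input):
    # Stand-in matcher: every word of the phrase occurs in the input.
    return set(phrase.lower().split()) <= set(student_input.lower().split())


def lookahead(graph, node, student_input, max_depth, path=()):
    """Return the nodes from `node` to the end of the matched branch,
    or None if nothing matches within `max_depth` levels."""
    path = path + (node,)
    if max_depth == 0:
        return None
    for phrase, child in graph.get(node, []):
        if matches(phrase, student_input):
            return path + (child,)
        found = lookahead(graph, child, student_input, max_depth - 1, path)
        if found:
            return found
    return None
```

 Returning the whole path, rather than only the matched node, lets the caller
 inspect the types of the nodes lying between the current node and the match.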
 
 If the student input does match a branch further down the graph, 
 PATTIE's response depends on the types of nodes lying on the path from the
 current node to the matched branch. If the trace runs through only OR 
 nodes, as in Figure 3.14(a), the current node is considered to be satisfied, 
 and the new current node becomes that at the end of the matched branch (node 
 4 in Figure 3.14(a)). On the other hand, if AND nodes intervene on the trace 
 (Figure 3.14(b)) or the trace begins with an AND node, the uppermost AND is 
 taken as the current node, ORs preceding the first AND are considered 
 satisfied, and the student is told the relationship between the current 
 node and the branch his input matched: 
 
 "We are refining task 3. You have described task 3.2.3.2. 
 
 What might task 3.2 be?" 
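
 The decision rule for a successful lookahead match can be sketched in a few
 lines of Python. This is an illustration under assumptions: the path is the
 list of nodes from the current node to the end of the matched branch, kind
 maps each node to its type, and the task numbering in relationship_message
 is inferred only from the single example message quoted above.

```python
def resolve_lookahead(path, kind):
    """Return (new current node, accepted?) for a lookahead match."""
    for node in path[:-1]:
        if kind[node] == "AND":
            # The uppermost AND on the trace becomes the current node; any
            # ORs above it are considered satisfied.
            return node, False
    # Only OR nodes on the trace: jump to the end of the matched branch.
    return path[-1], True


def relationship_message(current_task, matched_task):
    # Ask for the task one level below the one being refined.
    depth = len(current_task.split(".")) + 1
    next_task = ".".join(matched_task.split(".")[:depth])
    return (f"We are refining task {current_task}. "
            f"You have described task {matched_task}. "
            f"What might task {next_task} be?")
```

 A trace that begins with an AND node is covered by the same loop, since the
 current node is the first element of the path.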
 
 [Figure 3.14 Correct matches during lookahead: panels (a) and (b), each
 labeling the current node, the new current node, and the matched branch]
 
 This lookahead machinery solves a small problem inherent in 
 using an AND-OR graph to model the refinement process. Situations 
 frequently occur when a node combining both AND and OR node properties 
 is needed, as in Figure 3.15(a). The standard solution is to add an extra 
 OR node to the graph, as in Figure 3.15(b). The problem is that this 
 introduces superfluous refinements in the solution graph. However, the 
 lookahead mechanism overcomes this problem. If the current node is the 
 OR of Figure 3.15(b), and the student input matches the originally 
 anticipated A or B, the AND becomes the new current node, and the inserted 
 OR node is essentially transparent. 
 
 3.4.3.2 Hint Structure 
 
 Once PATTIE determines that a student input is a mistake, the
 error correction machinery takes over. There are several levels of prompts
 coded into the dialog routines which provide slight hints to the student.
 These are dependent on the current node type and may be superseded by more
 explicit hints contained in the solution graph, if such are available. The
 most general level is simply the universal prompt from the dialog of
 Chapter 1:

 "Tell me how you intend to proceed."

 At an AND node, subsequent standard prompts are

 "Try to break the problem into simpler subproblems."
 "My solution breaks the problem into subproblems."

 while at an OR node they are

 "Think of an action to narrow the scope of the problem."
 "My solution may do this task in one of n ways."

 The use of the number of possible choices (n) in the last prompt seems to
 help students focus their thinking.
 
 [Figure 3.15 Inserting additional OR nodes: (a) desired graph,
 (b) actual graph]
 
 If a correct response is not obtained following the last prompt, 
 the refinements anticipated by the current node are displayed. If the 
 current node is an OR, PATTIE displays all the branches leaving the node at 
 once, and asks the student which approach he would like to pursue. If 
 the branches are tagged with concepts, PATTIE recommends he follow that 
 branch tagged with the concept with the greatest probability of correct 
 employment. At an AND node, branches are displayed one by one, in the 
 order they will be traversed. After each branch is displayed, the student 
 is given a chance to suggest one of the remaining refinements, in the hope 
 that seeing one or two of the needed refinements will allow him to figure 
 out what the remaining ones should be. For either type of node, concepts 
 which may be associated with missed branches or the missed node have only 
 their number of opportunity counters increased, so as to lower their correct 
 employment probabilities. 
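
 The recommendation at an OR node and the counter update for missed concepts
 can be sketched as below. This Python sketch assumes that a concept's
 probability of correct employment is estimated as correct uses divided by
 opportunities; that ratio, and the names Concept, recommend, and
 record_missed, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    name: str
    correct: int = 0
    opportunities: int = 0

    def probability(self):
        # Assumed estimate of the probability of correct employment.
        return self.correct / self.opportunities if self.opportunities else 0.0


def recommend(branches):
    """branches: list of (description, Concept or None).
    Return the description whose concept is most likely to be employed
    correctly, or None if no branch carries a concept."""
    tagged = [(c.probability(), d) for d, c in branches if c is not None]
    return max(tagged, key=lambda pair: pair[0])[1] if tagged else None


def record_missed(concepts):
    # Only the opportunity counter grows, so the estimate goes down.
    for concept in concepts:
        concept.opportunities += 1
```

 Because a miss increments only the denominator, repeated misses steadily
 lower the estimate without any separate penalty term.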
 
 PATTIE uses the student model concepts to provide hints, as well. 
 If the current node has been tagged with a concept, PATTIE displays a hint 
 directly related to that concept, instead of the second prompt in the 
 standard sequence. 
 
 Finally, the problem expert may provide very explicit hints by 
 means of ERROR branches in the solution graph. If an expected ERROR branch 
 leaves a given node, the error message the branch leads to will be displayed 
 only if a student input matches the attached transition phrase when that 
 node is the current node. ERROR branches are not examined during the 
 lookahead matching process, to avoid potentially misleading hints. If the 
 branch is a universal ERROR branch, the error message is displayed in response 
 to the first wrong input received when that node is the current node, and 
 then the prompt sequence described above takes over. It is possible to have 
 
 an arbitrary number of explicit ERROR branches leaving any node, but only 
 a single universal ERROR branch. This ERROR branch facility allows the 
 problem expert to provide explicit corrections at points where he 
 anticipates large numbers of students will make the same mistake, while 
 he may rely on the standard prompts and (most helpfully) concept hints 
 most of the time. 
 
 Let's look at a short piece of dialog to show how these various 
 error handling facilities are used in PATTIE's interaction with the 
 student. Suppose the current node is the one shown in Figure 3.16. 
 
 PATTIE: Tell me how you intend to proceed.
     [The standard prompt sequence, level one]
 Student: I don't know
     [or any wrong answer]
 P: How can you make T easier to differentiate?
     [Universal ERROR branches have higher precedence than anything
     except matched expected ERROR branches.]
 S: I don't know
 P: It would help to simplify the original data.
     [The concept related hint in place of the second-level
     standard prompt]
 S: I want to break up T into separate parts
     [This matches the expected ERROR branch]
 P: How many pieces do you want to use?
     [This error message would have been displayed whenever the
     expected ERROR branch had been matched]
 
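 [Figure 3.16 diagram. Recoverable labels: "simplify data"; branches
 "divide T into 2 pieces", "divide T into 3 pieces", and "divide T into
 4 pieces"; a universal ERROR branch with the message "How can you make
 T easier to differentiate?"; and an expected ERROR branch, "break T
 into pieces", with the message "How many pieces do you want to use?"]

 Figure 3.16 Sample node with error facilities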
 
 S: Three 
 P: OK 
 
 If the student had not properly responded to the message displayed as a 
 result of matching the expected ERROR branch, the third standard prompt 
 would have been displayed, and then the whole node. 
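
 The precedence among these facilities can be sketched as a small state
 machine. The Python below is an illustration under assumptions, not PATTIE's
 dialog routine: HintNode, HintState, and the keyword-subset test used to
 match expected ERROR branches are hypothetical, and the {n} placeholder in
 the last OR prompt reflects the statement that this prompt reports the
 number of possible choices.

```python
from dataclasses import dataclass, field
from typing import Optional

# Second- and third-level standard prompts, keyed by node type.
STANDARD = {
    "AND": ["Try to break the problem into simpler subproblems.",
            "My solution breaks the problem into subproblems."],
    "OR": ["Think of an action to narrow the scope of the problem.",
           "My solution may do this task in one of {n} ways."],
}


@dataclass
class HintNode:
    kind: str                                   # "AND" or "OR"
    branches: list
    concept_hint: Optional[str] = None
    universal_error: Optional[str] = None
    expected_errors: dict = field(default_factory=dict)  # phrase -> message


@dataclass
class HintState:
    level: int = 1              # the level-one universal prompt was shown
    universal_shown: bool = False


def respond_to_wrong_input(node, state, student_input):
    """Return the next message, or None when the prompts are exhausted and
    the node's refinements should simply be displayed."""
    words = set(student_input.lower().split())
    # 1. A matched expected ERROR branch outranks everything else.
    for phrase, message in node.expected_errors.items():
        if set(phrase.lower().split()) <= words:
            return message
    # 2. The universal ERROR message answers the first other wrong input.
    if node.universal_error and not state.universal_shown:
        state.universal_shown = True
        return node.universal_error
    # 3. Otherwise advance through the standard sequence; a concept hint
    #    replaces the second-level prompt.
    state.level += 1
    if state.level == 2 and node.concept_hint:
        return node.concept_hint
    if state.level <= 3:
        prompt = STANDARD[node.kind][state.level - 2]
        return prompt.format(n=len(node.branches))
    return None
```

 In this sketch neither kind of ERROR message consumes a prompt level, which
 is why the third standard prompt still follows an unanswered ERROR message.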
 
 There are two remaining means of dealing with student errors. 
 PATTIE uses the student model to decide when to change the standard dialog 
 sequence of asking for refinements and judging student inputs. If the 
 model indicates that the student's probability of correctly specifying the 
 refinements required at the current node is very low, PATTIE simply displays 
 the needed refinements, rather than asking the student to describe them. 
 "Very low" probability may be achieved in one of three ways: 
 
 (1) the probability of the student correctly employing 
 the concept attached to the current node must be 
 less than 20%; 
 
 (2) for each branch leaving the current node, the 
 probability of correctly employing the concept 
 attached to that branch must be less than 30%;
 
 or (3) the average number of attempts per node must be 
 greater than 2.75 (3 is maximum). 
 The actual limits are somewhat arbitrarily chosen parameters and will 
 probably require change based on further experience. This node display 
 mechanism is similar to Kimball's slave mode, designed to allow a student 
 who is doing poorly to see the refinements needed at the current node without 
 the obscuring actions of suggesting refinements which are likely to be wrong. 
 Following such a display, the correct employment probabilities of the 
 
 concepts involved are increased, although not as much as they would have 
 been had the student described the refinements correctly. 
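
 The three "very low" tests can be written as a single predicate. This Python
 sketch uses the limits given above; the function name, the use of None for
 a node with no attached concept, and the assumption that branch_ps holds one
 probability per branch leaving the node are illustrative choices.

```python
NODE_LIMIT = 0.20      # limit on the current node's concept
BRANCH_LIMIT = 0.30    # limit that every branch concept must fall under
ATTEMPT_LIMIT = 2.75   # average attempts per node (3 is the maximum)


def should_display_refinements(node_p, branch_ps, avg_attempts):
    """Return True when the refinements should be displayed outright
    instead of asking the student to describe them."""
    if node_p is not None and node_p < NODE_LIMIT:
        return True
    # all() is True for an empty list, so untagged branches are excluded.
    if branch_ps and all(p < BRANCH_LIMIT for p in branch_ps):
        return True
    return avg_attempts > ATTEMPT_LIMIT
```

 Keeping the limits as named constants makes them easy to adjust, in line
 with the remark that they will probably change with further experience.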
 
 The final method of dealing with student errors is invoked by 
 the student himself. If he has gotten into a section of the graph where 
 he doesn't understand the approach being taken to a solution, it is 
 possible for him to ask to back up in the graph to the first OR node higher 
 than the current node, and try a different approach. This backup mechanism 
 may be used successively to return to a point in the solution where he 
 understands what must be done. PATTIE maintains a trace of the nodes 
 visited during traversal, so backing up is simply a matter of backing up 
 in the trace to an OR node and erasing from the screen those refinements 
 described at nodes visited after that OR. 
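
 Backing up in the trace can be sketched as a search for the most recent OR
 node before the current one. In this Python sketch the trace is a list of
 visited nodes ending with the current node, and kind maps each node to its
 type; both representations are assumptions made for illustration.

```python
def back_up(trace, kind):
    """Return (shortened trace, nodes whose refinements must be erased).

    The shortened trace ends at the nearest OR node visited before the
    current node. If there is none, the trace is returned unchanged."""
    for i in range(len(trace) - 2, -1, -1):   # skip the current node itself
        if kind[trace[i]] == "OR":
            return trace[:i + 1], trace[i + 1:]
    return trace, []
```

 Calling back_up again on the shortened trace gives the successive use
 described above, since the OR node just reached is now the current node and
 is skipped by the search.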
 
 4. SENTENCE AND DIALOG PROCESSING 
 
 This chapter presents some ideas about natural language 
 processing based on our experience with PATTIE. Traditionally, research 
 in understanding natural language has been directed towards developing 
 general techniques capable of handling all possible aspects of natural 
 language. The work of Winograd (1971) and Woods (1970) has followed this 
 tradition, and resulted in systems which handle very complex sentences but 
 consume large amounts of memory space and processing time. Some more 
 recent systems (Colby (1974), Burton (1974)) have taken a pragmatic 
 approach counter to this tradition, arguing that in smaller, well-defined 
 domains of discourse, natural language processing should be intimately 
 matched to the semantic characteristics of that domain. By matching the 
 language understanding technique to the characteristics of the expected 
 input, these systems are able to understand natural language dialogs while 
 using relatively small amounts of system resources. PATTIE also follows 
 this pragmatic approach, but carries it a step further. Preceding systems 
 have viewed natural language processing at the level of single sentences; 
 PATTIE views language processing at the level of an entire dialog. 
 
 4.1 Characterizing Natural Language Communication 
 
 This section proposes several criteria to provide a rough 
 characterization of various types of natural language discourse. In order 
 to describe these criteria, we must first make a few definitions. Let's 
 say that a "unit of communication" is that string of words output by a 
 
 person before requiring a response. A unit of communication in an 
 interactive dialog may be a single word, a short phrase, a sentence 
 (perhaps consisting of several phrases), or even a paragraph of several 
 sentences. The whole dialog may then be viewed as a sequence of units 
 of communication, alternately produced by the person and by the computer. 
 Our proposed classification scheme, however, is concerned only with 
 inputs produced by the person. 
 
 The "degree of isolation" between units of communication is 
 determined by how much the content of previous inputs must be used to 
 understand the current input. This is essentially a measure of the amount 
 of pronoun reference and ellipsis (omission of one or more words which 
 must be understood from context) which occurs in the inputs. For example, 
 the units of communication which make up the input stream below have a low 
 degree of isolation: 
 
 What is the voltage at node 8? 
 
 What is it at node 1? 
 
 Between nodes 7 and 8? 
 The following sequence, on the other hand, asks the same questions but 
 exhibits a high degree of isolation: 
 
 What is the voltage at node 5? 
 
 Now find the voltage at node 1. 
 
 What is the voltage between nodes 7 and 8? 
 
 Given these two definitions, we propose that natural language 
 communication at the dialog level be characterized by two criteria: 
 (1) how well the next unit of communication can be 
 predicted based on the preceding dialog; 
 
 and (2) the degree of isolation between units of 
 communication which make up the dialog. 
 The predictability of the next unit of communication may be viewed as a 
 measure of how goal-directed the dialog is. In Winograd's Blocks World, 
 for example, there is no overall goal to be achieved, and successive 
 inputs could easily deal with completely unrelated topics. In a dialog 
 with a definite objective (such as PATTIE's refinement dialog), the inputs 
 are more likely to follow a logical pattern and thus be predictable. 
 Degree of isolation is an independent measure of how closely interrelated 
 the elements of the dialog are. 
 
 At the unit level, we claim natural language communication can 
 be characterized by three additional criteria, independently of the dialog 
 as a whole. The three criteria are: 
 
 (1) grammatical correctness of the unit of communication; 
 
 (2) complexity of the unit of communication; 
 and (3) length of the unit of communication. 
 
 For example, the simplest form of communication might be the one- or
 two-word commands one uses with a small child (e.g., "Stop" or "Come here").
 At the other extreme are the long, faultlessly grammatical sentences of
 literary works, with all the syntactic complexity such sentences can display.
 
 These criteria can be used to characterize natural language 
 understanding systems as well as natural language communication, as 
 discussed in the next two sections. 
 
 4.2 Dialog Processing Methods 
 
 Section 2.2 discussed the ability of different language
 understanding techniques to use previous inputs to help understand the current
 
 one. At one extreme, the systems of Woods, Winograd, and Burton have 
 elaborate facilities for disambiguating pronoun references and 
 substituting for ellipses. Colby's system has a much simpler approach 
 based on using the world model to establish substitutions for possible 
 pronoun uses (Faught, et al. (1974)). At the other extreme, PATTIE's 
 simple pattern matching approach has no provision for handling pronouns 
 or ellipses. 
 
 A similar range of abilities exists for the other dialog 
 characterizing criterion. For processing methods, the equivalent of 
 predictability of inputs in the dialog is how much the processing routines 
 anticipate the next input. Figure 4.1 shows where the various techniques 
 fall in this range. 
 
 At one extreme are those systems which process one unit of 
 communication at a time, with no consideration of previous inputs. The 
 understanding process begins with the same conditions for each input. 
 Examples of this approach are ELIZA and most information retrieval systems. 
 At the other extreme are those systems in which the program controls the 
 dialog, which requires total anticipation of all the inputs and the order 
 in which they occur. 
 
 Systems such as Woods' and Winograd's have only a little
 anticipation of future units of communication, in that they maintain a
 history of the entire dialog in anticipation of pronouns and the like
 occurring in future inputs. Their interest in processing natural language
 discourse in a general manner forbids anticipating specific elements of 
 any dialog at the time the programs were written. 
 
 Burton's and Colby's systems exhibit slightly more anticipation 
 of future inputs. Both methods require analysis of the domain of discourse 
 
 [Figure 4.1 Anticipation of next unit of communication from preceding
 dialog. Systems placed along a scale from "none" to "total": ELIZA and
 information retrieval systems; Winograd, Woods; Burton, Colby; PATTIE;
 traditional CAI tutorial]
 
 to establish possible inputs before the program is written. However, 
 they don't anticipate the order in which the inputs occur either at the 
 time the program is written or during execution. 
 
 PATTIE, on the other hand, anticipates both what the units of 
 communication will be, and in approximately what order they will occur. 
 The AND-OR graph indicates the order in which units of communication are 
 expected to occur, and provides an estimate of the likelihood of any 
 particular unit of communication being input. The most highly anticipated 
 inputs are those corresponding to branches leaving the current node. Inputs 
 corresponding to branches further down the graph from the current node are 
 not so highly anticipated (Figure 4.2). 
 
 It is in this sense that PATTIE is concerned with processing 
 entire dialogs. The AND-OR graph may be considered as a kind of finite 
 state acceptor of a whole dialog, analogous to a finite state machine 
 which may accept single sentences. 
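
 The graded anticipation described in this section can be sketched by
 classifying an input according to how far below the current node its branch
 lies. The Python below is only an illustration: the dictionary graph
 representation, the exact-label comparison, and the default lookahead depth
 of 2 are assumptions, while the three category names follow the labels of
 Figure 4.2.

```python
def anticipation(graph, current, phrase, lookahead_depth=2):
    """Classify how strongly `phrase` is anticipated at `current`.

    graph maps a node to a list of (transition phrase, child) pairs."""
    frontier = [current]
    for depth in range(lookahead_depth + 1):
        next_frontier = []
        for node in frontier:
            for label, child in graph.get(node, []):
                if label == phrase:
                    # Depth 0 means a branch leaving the current node.
                    if depth == 0:
                        return "greatly anticipated"
                    return "somewhat anticipated"
                next_frontier.append(child)
        frontier = next_frontier
    return "unlikely"
```

 Viewed this way, the current node plays the role of the acceptor's state,
 and each accepted unit of communication moves that state down the graph.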
 
 4.3 Sentence Processing 
 
 Just as the various language processing methods exhibited a 
 range of techniques for processing natural language discourse at the dialog 
 level, they also provide a range of techniques for processing individual 
 units of communication. At one end, simple pattern matching techniques 
 such as PATTIE uses are most effective when inputs are relatively short 
 and sentence construction fairly simple. The ability of Colby's system 
 to separate embedded clauses and prepositional phrases from the main 
 clause allows it to accept more complex units of communication. This same 
 segmenting ability, coupled with the fuzzy matching ability, makes it 
 easier to maintain patterns to match longer inputs, as well. Neither of 
 
 [Figure 4.2 Anticipation of inputs: relative to the current node,
 branches are labeled greatly anticipated, somewhat anticipated, or
 unlikely]
 
 the pattern matching approaches requires that units of communication be 
 grammatically correct. 
 
 Still longer and more complex inputs can be handled by SOPHIE's
 semantic grammar technique, at least partially because of the greater 
 clarity of the grammatical representation compared to long lists of 
 patterns. Inputs must be grammatically correct with respect to the 
 semantic grammar (which is equivalent to saying they must be semantically 
 meaningful), but because some input words can be skipped, they need not 
 be rigidly correct with respect to English syntax. 
 
 Finally, the linguistic analysis capabilities of systems like 
 those of Woods and Winograd allow such systems to understand quite long 
 and complex units of communication (e.g., "Is there anything which is 
 bigger than every pyramid but is not as wide as the thing which supports 
 it?"). In exchange for such abilities, linguistic analysis systems demand 
 inputs which are syntactically correct with respect to their grammars, and 
 require processing times significantly longer than other systems. 
 
 4.4 Understanding Problem Solving Dialogs 
 
 One of the few attempts made to study the characteristics of 
 interactive communication is the work of Chapanis (1975). He was concerned 
 with modeling human-computer interaction, with the computer acting as a 
 source of knowledge for a human trying to solve a problem. Chapanis' 
 primary interest was on the effect of different channels of communication 
 on the problem solving activity, and in the interest of simplicity, he 
 studied the interaction of two humans instead of a human-computer pair, 
 but his research still produced some interesting results on the characteristics 
 of problem solving dialogs. 
 
 Figure 4.3 reproduces a portion of a protocol involving problem 
 solving via a teletype-like communication channel. There is a virtually 
 complete lack of correct grammar in all of the sentences, and misspellings 
 and run-on words abound. Those sentences in Figure 4.3 which were not 
 copied from the problem instruction sheet are fairly short (two to six 
 words) and simply constructed. Moreover, there are only a few cases where 
 pronouns were used. Although the two subjects who produced this protocol 
 were inexperienced typists, Chapanis reports that the same characteristics 
 were found in protocols involving experienced typists. The author's 
 personal experience on the PLATO system supports this observation; typing 
 speed is so much slower than thought processes that most errors are simply 
 ignored in an attempt to speed up communication. 
 
 Such characteristics rule out linguistic analysis methods of 
 understanding natural language, and argue in favor of some sort of pattern 
 matching approach for processing problem solving dialogs. The simple 
 pattern matching methods used in PATTIE have proven adequate for two 
 reasons. 
 
 First, student inputs we have experienced with PATTIE have 
 characteristics similar to those described by Chapanis: relatively short 
 and straightforward, slight use of pronouns, and, to a lesser extent than 
 in Figure 4.3, ungrammatical (or fragmentary) inputs (primarily missing 
 words, for example "First, if T is a constant" or "if-then statement"). 
 
 Secondly, the well-defined semantic context provided by each 
 node in the refinement graph helps reduce possible ambiguous matches. The 
 most likely student inputs at a node are exactly those which correspond to 
 transition phrases tagged to branches leaving that node, or nodes slightly 
 
 Koahcaddoyouknowhowto put this toghcx* 
 
 ill try it.s a trash tutor ill type you the directions ok 
 
 put axle thru 3Sth holes from outside 
 
 38th ho les / ?? yes 
 
 put 1 handlebar on back of each outer frame line up bolt holes 
 
 what does outer frame loo k like? its like a (W) 
 
 put bottom frame co outer frames on front + rear of outer frames 
 
 ok use 1+12 bolts 
 
 are your parts labled by lettrs ??? 
 
 nookthe thing looks like a cart with room for 2 trash cans the part 
 
 that looks like this(XX)goes on the bottom +the 2(W) parts go on the sids 
 
 put male ends ? into female ends 
 
 what does that mean? i dont no 
 
 it looks like 3(u)s 
 
 what? 2(u)s go into each other then theyare put on other u +put 
 
 on W put top frame to front of outer fr.+to handlbar 2 1/4 
 
 bolts put center support fr. inside topfr. use 2 1/4 bo. thru 
 
 center of top fr. put 2 1/12 bolts thrub center of side fr., 
 
 bottomfr. 2 bottom of center support fr. 
 
 okput on wheels 3 spoks on outside put on hubcap with hammer 
 
 put oh handgrips DO ALL THESE STEPS FOR BOTH SIDES ok????? 
 
 Figure 4.3 Problem solving protocol 
 (from Chapanis (1975)) 
 
 lower in the graph. Therefore, if an input matches one of these 
 branches, there is a high probability that the intended meanings are the 
 same. 
 
 Although PATTIE's simple understanding system has proven 
 adequate, our experience has indicated two possible extensions which 
 could provide improved performance in understanding such problem solving 
 dialogs. The simple scheme understood 61% of nearly 500 inputs from the 
 most recent group of 10 introductory students to interact with the system. 
 PATTIE's vocabulary and meaning lists have, of course, been modified so 
 that all of the meaningful missed inputs will be understood in subsequent 
 uses. But in examining the inputs which were not understood, it was 
 discovered that 37% failed to match a pattern because they contained an 
 extra keyword. Simply adding a fuzzy matching capability similar to 
 Colby's would have boosted PATTIE's understanding rate to 76% without 
 changing either vocabulary or meaning lists. An additional 22% of the 
 erroneous inputs were composed of two simple patterns joined by a 
 conjunction (usually and, then, or else); the existence of a clause 
 separating capability could have increased the understanding rate to 84%. 
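
 The two proposed extensions can be sketched next to a simple exact matcher.
 This Python sketch rests on assumptions: inputs are reduced to the words
 found in a small vocabulary, a pattern is an ordered tuple of keywords,
 "fuzzy" matching means tolerating a limited number of extra keywords, and
 clauses are split on the conjunctions and, then, and else. It imitates
 neither Colby's system nor PATTIE's actual vocabulary and meaning lists.

```python
import re

# Hypothetical vocabulary; PATTIE's real lists are not reproduced here.
VOCABULARY = {"read", "print", "term", "differentiate", "constant"}


def keywords(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w in VOCABULARY]


def exact_match(text, pattern):
    # The input's keywords must be exactly the pattern's keywords.
    return keywords(text) == list(pattern)


def fuzzy_match(text, pattern, max_extra=1):
    # The pattern must appear in order, with at most max_extra extra keywords.
    found = keywords(text)
    remaining = iter(found)
    in_order = all(word in remaining for word in pattern)
    return in_order and len(found) - len(pattern) <= max_extra


def split_clauses(text):
    parts = re.split(r"\b(?:and|then|else)\b", text.lower())
    return [part.strip() for part in parts if part.strip()]


def match_clauses(text, patterns):
    """Match each clause separately; return None if any clause fails."""
    matched = []
    for clause in split_clauses(text):
        hit = next((p for p in patterns if exact_match(clause, p)), None)
        if hit is None:
            return None
        matched.append(hit)
    return matched
```

 An input with one extra keyword fails exact_match but passes fuzzy_match,
 and a compound input fails as a whole but succeeds once it is split into
 clauses, which are the two failure classes reported above.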
 
 5. SUMMARY AND CONCLUSION 
 
 5.1 Summary 
 
 The basic paradigm for problem solving is that of problem 
 reduction: the problem solver selects one of a set of "problem reduction 
 operators" which is applied to the original problem to produce a new, 
 somewhat easier problem, and the process is iteratively repeated until 
 a solution is reached. Existing tutorial programs for problem solving 
 (Kimball (1973), Goldberg (1973), Brown and Burton (1974)) have dealt with 
 subjects which might best be described as "quantitative." These subjects 
 generally require only a small number of problem reduction operators, and 
 generally ones which can be applied by a tutorial program to yield a new, 
 well-defined problem. 
 
 In contrast, an automated tutor for top-down programming must 
 deal with a subject (successive refinement) which is essentially 
 "qualitative." There are a large number of problem reduction operators 
 (all possible refinements) which are applicable to a given problem; however, 
 few of them are applicable to several problems. Also, since refinements 
 may be expressed using natural language, it is difficult for a tutorial 
 program to apply a reduction operator to produce a new problem. Because 
 of these characteristics, developing PATTIE required solving several 
 problems: 
 
 (1) the qualitative nature of the refinement process 
 makes a natural language dialog capability a 
 necessity; 
 
 (2) since little if any knowledge is applicable to 
 all program development processes (other than 
 the method used), PATTIE required quite a bit of 
 knowledge about every problem; 
 
 (3) a tutorial strategy adequate to teach the top- 
 down programming method had to be devised and 
 implemented. The ability to adapt somewhat to 
 each student's performance was a desirable part 
 of such a strategy. 
 
 We have chosen to implement only a simple natural language 
 capability, but have proposed several criteria for characterizing both 
 methods of processing natural language and types of natural language 
 discourse. 
 
 Solution of the remaining problems was possible because of the 
 following aspects of PATTIE's design:
 
 (1) The use of an AND-OR graph as a model of the top- 
 down programming process. A specific instance of 
 such a graph represents knowledge about both the 
 possible solution programs for a given problem 
 and the processes used to develop those programs. 
 Adapting an AND-OR graph for such tutorial use 
 required adding features to provide students with 
 hints, allow intermixture of programming language 
 and natural language statements, and invoke 
 subprocedures prior to their detailed design. 
 
 (2) The existence of a student model which is intimately
 tied in with the AND-OR graph and is based on a
 list of semantic concepts relevant to the
 problem area.

 (3) An interaction control routine which traverses
 the AND-OR graph, conducts an interactive dialog
 with the student, and maintains a screen display
 of the student's developing solution which
 reflects the status of the graph traversal.

 The techniques used by PATTIE are also applicable in other subject areas
 with similar characteristics.
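 The claim that one AND-OR graph represents all acceptable solution
 programs can be illustrated with a minimal sketch. The graph below is
 hypothetical (the task names are invented for illustration and are not
 taken from PATTIE's problem data), and the enumeration routine is not part
 of PATTIE; it only shows how AND nodes combine their branches in order
 while OR nodes offer alternatives.

```python
from itertools import product

# Hypothetical refinement graph: node name -> (node type, ordered branches).
GRAPH = {
    "sum the cards": ("AND", ["initialize", "process each card", "print total"]),
    "process each card": ("OR", ["loop until end of file", "loop on a card count"]),
    "initialize": ("LEAF", []),
    "print total": ("LEAF", []),
    "loop until end of file": ("LEAF", []),
    "loop on a card count": ("LEAF", []),
}

def solutions(node, graph):
    """Return every leaf sequence (candidate program outline) below `node`."""
    kind, branches = graph[node]
    if kind == "LEAF":
        return [[node]]
    per_branch = [solutions(b, graph) for b in branches]
    if kind == "OR":
        # Any one branch suffices, so the alternatives are pooled.
        return [s for alternatives in per_branch for s in alternatives]
    # AND: every branch is required, in order; combine one choice per branch.
    return [sum(combo, []) for combo in product(*per_branch)]
```

 For this graph, `solutions("sum the cards", GRAPH)` yields two outlines,
 one for each way of refining the OR node.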
 
 5.2 Recommendations for Further Work 
 
 There are several directions in which future work could proceed 
 on systems similar to PATTIE. The first, and I feel the most important from an
 educational standpoint, is to provide a fourth level of error correction,
 one with the ability to provide more detailed explanations than the ERROR 
 branch facility. In situations where the student has missed all 
 refinements for several successive nodes, it is quite likely that he 
 really has no idea of what the end result of this portion of the refinement 
 process would be. A student who doesn't know where he's going won't 
 appreciate the method used to get there. PATTIE should be able to provide 
 an explanation of the overall goal being pursued, and what high-level tasks 
 must be accomplished to achieve that goal. Such capability could be 
 provided in PATTIE by allowing the problem expert to specify program 
 segments to be executed in such situations, segments which would supply 
 the necessary explanation. The interesting problem is whether there is 
 a more general solution which could be employed with only minimal effort 
 from the problem expert. 
 
 A second extension relates strictly to structured programming. 
 PATTIE has dealt exclusively with refinements of abstract tasks needed
 to solve the problem. An equally important aspect of structured 
 programming is refinement of the abstract data representations. What 
 additions must be made to PATTIE to provide equal emphasis to data 
 structure refinement? Since data structures are generally more visually 
 oriented than task oriented, perhaps some sort of visual hint facility 
 would be useful. 
 
 Thirdly, PATTIE's natural language capabilities could be
 expanded to include question answering. Students engaged in an interactive
 dialog with a tutorial program tend to assume that such a tutor
 has all the capabilities of a human tutor, including the ability to
 answer arbitrary questions (for example, questions about the problem,
 or, in PATTIE's case, about the programming language). Students will
 ignore help sequences accessible with a single keypress in favor of 
 asking such questions. Investigation of the problems of such an extension 
 and the types of questions asked could provide useful insight into student 
 misunderstandings about structured programming. 
 
 Finally, the idea mentioned in Section 3.4.2 of relaxing the 
 rigid traversal order PATTIE enforces by including unordered AND nodes in 
 the solution graph should be investigated. 
 
 In addition to the foregoing extensions to PATTIE, the problem
 of characterizing human communication and computerized language
 understanding mechanisms, touched on so briefly in Chapter 4, is well worth
 significant research efforts. The existence of such a characterization 
 would simplify the problems of using natural language man-machine dialogs 
 in a host of practical applications. 
 
 5.3 Conclusion 
 
 We have described the design of a tutorial program intended to 
 give beginning programming students the sort of detailed dynamic example 
 of the successive refinement process which Denning (1974) feels is 
 essential to understanding structured programming. Preliminary reactions 
 from about 30 students who have interacted with PATTIE indicate that the 
 students also feel PATTIE is a useful aid in understanding the top-down 
 programming process. Integrating PATTIE into the ACSES curriculum should 
 help to produce programmers with true problem solving capabilities. 
 
 LIST OF REFERENCES 
 
 Alpert, D. and D. L. Bitzer (1970), "Advances in Computer-based Education,"
 Science 167 (1970), pp. 1582-1590.

 Brown, J. S. and R. R. Burton (1974), "SOPHIE — A Pragmatic Use of
 Artificial Intelligence in CAI," Proceedings of the ACM National
 Conference, San Diego, California, November, 1974.

 Burton, R. R. (1974), "A Semantically Centered Parsing System for Mixed-
 Initiative CAI Systems," presented at Association for Computational
 Linguistics Conference, Amherst, Massachusetts, July, 1974.

 Carbonell, J. R. (1970), "AI in CAI: An Artificial-Intelligence Approach
 to Computer-Assisted Instruction," IEEE Transactions on Man-Machine
 Systems, Vol. MMS-11, No. 4, December, 1970, pp. 181-9.

 Chapanis, A. (1975), "Interactive Human Communication," Scientific American,
 Vol. 232, No. 3, March, 1975.

 Colby, K. M., R. C. Parkinson, and B. Faught (1974), "Pattern Matching
 Rules for the Recognition of Natural Language Dialogue Expressions,"
 Stanford AI Lab Memo AIM-234, June, 1974.

 Collins, A. M., J. R. Carbonell, and E. H. Warnock (1973), "Analysis and
 Synthesis of Tutorial Dialogues," Bolt, Beranek, and Newman Technical
 Report All, March, 1973.

 Conway, R. W. and D. Gries (1973), An Introduction to Programming: A
 Structured Approach using PL/1 and PL/C, Winthrop Publishers,
 Cambridge, Massachusetts, 1973.

 Denning, P. J. (1974), "Guest Editor's Overview," Computing Surveys, Vol. 6,
 No. 4, December, 1974.

 Dijkstra, E. W. (1970), "Notes on Structured Programming," Technical Report
 No. 70-WSK-03, Technological University, Eindhoven, The Netherlands.

 Ells, T. D. and P. Freeman (1973), "Design Rationalization of Three BASIC
 Systems," Technical Report No. 38, Department of Information and
 Computer Science, University of California, Irvine, November, 1973.

 Faught, B., K. M. Colby, and R. Parkinson (1974), "The Interactions of
 Inferences, Affects, and Intentions in a Model of Paranoia,"
 Stanford AI Lab Memo AIM-253, December, 1974.
 
 Goldberg, A. (1973), "Computer-Assisted Instruction: The Application of
 Theorem-Proving to Adaptive Response Analysis," Technical Report
 No. 203, Institute for Mathematical Studies in the Social Sciences,
 Stanford University, May, 1973.

 Goldstein, I. (1974), "Understanding Simple Picture Programs," Technical
 Report AI-TR-294, MIT AI Lab, September, 1974.

 Green, C., R. Waldinger, D. Barstow, R. Elschlager, D. Lenat, B. McCune,
 D. Shaw, and L. Steinberg (1974), "Progress Report on Program
 Understanding Systems," Stanford AI Lab Memo AIM-240, August, 1974.

 Gries, D. (1974), "What Should We Teach in an Introductory Programming
 Course," SIGCSE Bulletin, Vol. 6, No. 1, February, 1974, pp. 81-89.

 Grignetti, M., L. Gould, C. Hauseman, A. Bell, G. Harris, and J. Passafiume
 (1974), "Mixed-Initiative Tutorial System to Aid Users of the oN-Line
 System (NLS)," Technical Report ESD-TR-75-58, Electronic Systems
 Division, Hanscom AFB, Bedford, Mass., November, 1974.

 Kimball, K. B. (1973), "Self-Optimizing Computer-Assisted Tutoring: Theory
 and Practice," Technical Report No. 206, Institute for Mathematical
 Studies in the Social Sciences, Stanford University, June, 1973.

 Koffman, E. B. and S. E. Blount (1973), "Artificial Intelligence and
 Automatic Programming in CAI," Proceedings of Third International
 Joint Conference on AI, Stanford Research Institute, 1973, pp. 86-94.

 Nievergelt, J. and E. M. Reingold (1973), "Automating Introductory Computer
 Science Courses," SIGCSE Bulletin, Vol. 5, No. 1, February, 1973,
 pp. 24-25.

 Nilsson, N. J. (1971), Problem Solving Methods in Artificial Intelligence,
 McGraw-Hill, New York, New York, 1971.

 Polya, G. (1957), How to Solve It; A New Aspect of Mathematical Method,
 2nd ed., Doubleday, Garden City, New Jersey, 1957.

 Quillian, M. R. (1968), "Semantic Memory," in Minsky, M. (ed.), Semantic
 Information Processing, MIT Press, Cambridge, Massachusetts, 1968.

 Snark, B. H. (1972), "Diverse Approaches to Teaching Programming: Three
 Reports," in Turski, W. (ed.), Programming Teaching Techniques, North-
 Holland, Amsterdam, 1972.

 Sussman, G. (1973), "A Computational Model of Skill Acquisition,"
 Technical Report AI-TR-297, MIT AI Lab, August, 1973.

 Tenczar, P. and W. Golden (1972), "Spelling, Word, and Concept Recognition,"
 CERL Report X-35, Computer-Based Education Research Laboratory,
 University of Illinois, October, 1972.

 Weinberg, G., N. Yasukawa, and R. Marcus (1973), Structured Programming in
 PL/C, Wiley and Sons, New York, New York, 1973.
 
 Wexler, J. D. (1970), "Information Networks in Generative Computer-
 Assisted Instruction," IEEE Transactions on Man-Machine Systems,
 Vol. MMS-11, No. 4, December, 1970, pp. 190-202.

 Weizenbaum, J. (1966), "ELIZA — A Computer Program for the Study of
 Natural Communication Between Man and Machine," Comm. ACM, Vol. 9,
 No. 1, January, 1966.

 Weizenbaum, J. (1967), "Contextual Understanding by Computers," Comm.
 ACM, Vol. 10, No. 8, August, 1967.

 Winograd, T. (1971), "Procedures as a Representation for Data in a Computer
 Program for Understanding Natural Language," Technical Report AI-TR-17,
 MIT Artificial Intelligence Laboratory, February, 1971.

 Winograd, T. (1974), "Five Lectures on Artificial Intelligence," Stanford
 AI Lab Memo AIM-246, September, 1974.

 Wirth, N. (1971), "Program Development by Stepwise Refinement," Comm. ACM,
 Vol. 14, 1971, pp. 221-227.

 Wirth, N. (1973), Systematic Programming, Prentice-Hall, Englewood Cliffs,
 New Jersey, 1973.

 Woods, W. A. (1970), "Transition Network Grammars for Natural Language
 Analysis," Comm. ACM, Vol. 13, No. 10, October, 1970.
 
 APPENDIX 
 
 PATTIE is implemented in the TUTOR author language on the PLATO 
 IV CAI system. The program itself occupies a total of 9923 60-bit words, 
 divided as follows: 
 
      traversal routines        6699 words
      backup                    1279
      vocabulary                1492
      meaning lists              453
                                ----
                                9923 words
 
 Notice that the vocabulary and meaning lists are included as part of the 
 TUTOR program, although they are actually data in the strict sense. 
 
 Total data requirements are 1244 words, with the following 
 breakdown: 
 
      refinement graph                      266 words
      display words                         229
      messages                              469
      screen display and trace              140
      miscellaneous operational variables   140
                                           ----
                                           1244 words
 
 Figure A.1 shows the flow of control between major functional
 areas of the traversal routines, and Figure A.2 shows the use of major
 data elements by the same functional areas.
 
 Figure A.1 High-level control flow

 Figure A.2 Data usage
 
 When in use by students, PATTIE uses from 2 to 3 milliseconds 
 of CPU time per second per student. Response times vary from essentially
 instantaneous to as much as 5 seconds, depending on total system load
 and the amount of real time since the student's previous input. 
 
 The remainder of this Appendix describes the major data 
 structures used in PATTIE, which may be divided into two groups: those 
 which are part of the refinement graph, and various operational structures. 
 PLATO's memory word length is 60 bits; however, TUTOR allows an author 
 to define addressable entities, called "segments," less than 60 bits long, 
 which are then packed several segments to a word, and addressed as elements 
 in a one-dimensional array which may span many words. For example, defining 
 a segment TEST as 12 bits results in five TEST segments being packed in each 
 word. Individual segments are referenced as TEST(1), TEST(2), ...,
 TEST(24).
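 The packing arithmetic can be sketched as follows. This is an
 illustration in Python, not TUTOR code: the function names are
 hypothetical, and filling each word from its leftmost bits is an
 assumption, since the text does not state the order of segments within
 a word.

```python
WORD_BITS = 60  # PLATO memory word length, as stated in the text

def _locate(width, i):
    """Map 1-based segment index i to (word index, right shift)."""
    per_word = WORD_BITS // width          # e.g. 60 // 12 = 5 segments per word
    word, pos = divmod(i - 1, per_word)
    shift = WORD_BITS - width * (pos + 1)  # assumption: leftmost segment first
    return word, shift

def seg_get(words, width, i):
    """Read unsigned segment i from a list of 60-bit integers."""
    word, shift = _locate(width, i)
    return (words[word] >> shift) & ((1 << width) - 1)

def seg_set(words, width, i, value):
    """Store `value` into segment i without disturbing its neighbours."""
    word, shift = _locate(width, i)
    mask = ((1 << width) - 1) << shift
    words[word] = (words[word] & ~mask) | ((value << shift) & mask)

def words_needed(width, count):
    """Number of 60-bit words required to hold `count` segments."""
    per_word = WORD_BITS // width
    return -(-count // per_word)           # ceiling division
```

 Under these assumptions, 24 TEST segments of 12 bits occupy
 `words_needed(12, 24)` = 5 words, the last of which is only partly used.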
 
 The graph is stored in an array of 30-bit segments called NODTBL. 
 Each node is stored as a 30-bit node header followed by one 30-bit branch 
 header for each branch leaving the node. Individual nodes are accessed by 
 using their number as an index into a table of 12-bit pointers (NODACES) 
 which indicate the index of the corresponding node header in NODTBL 
 (Figure A.3). Each node and branch header is further subdivided.
 
 The node header contains four fields. To access these fields, 
 the header must be removed from NODTBL and inserted in another 30-bit 
 segment, NHDR30. The four fields are then (Figure A.4)
 
 NODHDR5(1) — 5 bits indicating node type: 1 = AND,
 2 = OR, 3 = LEAF, 4 = PROC. If this
 field is negative, all regular (i.e., non-
 ERROR) branches leaving the node are tagged
 with PL/1 statements
 
 Figure A.3 Node accessing scheme (a node number indexes NODACES)

      NHDR30
      bits 0-4       bits 5-9       bits 10-14     bits 15-29
      node type      # reg          # error        ptr to node
                     branches       branches       display msg
      NODHDR5(1)     NODHDR5(2)     NODHDR5(3)     NDHDR15(2)

 Figure A.4 Node header format

      BHDR
      bits 0-9       bits 10-19     bits 20-29
      meaning list   pointer to     next node
      number         display copy   number
      BRCHDR(1)      BRCHDR(2)      BRCHDR(3)

 Figure A.5 Branch header format
 
 NODHDR5(2) — 5 bits telling number of regular branches
 leaving this node
 NODHDR5(3) — 5 bits telling number of ERROR branches
 leaving this node
 NDHDR15(2) — 15-bit pointer to any display message attached
 to this node (pointer into MSGTBL). If
 > 0 — English message
 = 0 — no message
 < 0 — PL/1 statement
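 As an illustration of the node header layout, the following Python
 sketch unpacks a 30-bit header into its four fields. Several points are
 assumptions made only for the sketch: signed fields are treated as two's
 complement (the text does not give the encoding), the absolute value of
 a negative type field is taken to be the node type, the three cases for
 the message pointer are read as positive, zero, and negative, and the
 names `decode_node_header` and `to_signed` are hypothetical.

```python
NODE_TYPES = {1: "AND", 2: "OR", 3: "LEAF", 4: "PROC"}

def to_signed(value, width):
    """Interpret an unsigned field as two's complement (assumed encoding)."""
    return value - (1 << width) if value >> (width - 1) else value

def decode_node_header(hdr30):
    """Split a 30-bit node header into the four fields described above."""
    raw_type = to_signed((hdr30 >> 25) & 0x1F, 5)   # NODHDR5(1)
    msg_ptr = to_signed(hdr30 & 0x7FFF, 15)         # NDHDR15(2)
    if msg_ptr > 0:
        kind = "english"
    elif msg_ptr == 0:
        kind = "none"
    else:
        kind = "pl1"
    return {
        "type": NODE_TYPES[abs(raw_type)],   # assumption: magnitude is the type
        "pl1_branches": raw_type < 0,        # negative type: branches are PL/1
        "regular": (hdr30 >> 20) & 0x1F,     # NODHDR5(2)
        "error": (hdr30 >> 15) & 0x1F,       # NODHDR5(3)
        "message_kind": kind,
        "message_index": abs(msg_ptr),       # index into MSGTBL, if any
    }
```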
 
 The branch header contains three fields, and must be removed from
 NODTBL and inserted in a 30-bit BHDR before its fields may be accessed. The
 fields are (Figure A.5):

 BRCHDR(1) — meaning list number corresponding to branch
 transition phrase. If
 > 0 — English phrase
 = 0 — universal ERROR branch, no phrase
 < 0 — PL/1 statement for display
 BRCHDR(2) — pointer to display copy of transition phrase
 (pointer into MSGTBL). If
 > 0 — English phrase
 = 0 — expected ERROR branch, no display
 < 0 — PL/1 statement
 BRCHDR(3) — number of node branch leads to
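 The access path from a node number to its branches can be sketched as
 below. This is a Python illustration under stated assumptions, not the
 TUTOR implementation: NODTBL is modelled as a list with one 30-bit
 integer per segment, NODACES as a mapping from node number to an index
 in that list, signed fields are taken as two's complement, and regular
 and ERROR branch headers are assumed to follow the node header
 contiguously. The function names are hypothetical.

```python
def to_signed(value, width):
    """Interpret an unsigned field as two's complement (assumed encoding)."""
    return value - (1 << width) if value >> (width - 1) else value

def decode_branch_header(bhdr):
    """Return (meaning list number, display pointer, next node number)."""
    meaning = to_signed((bhdr >> 20) & 0x3FF, 10)   # BRCHDR(1)
    display = to_signed((bhdr >> 10) & 0x3FF, 10)   # BRCHDR(2)
    next_node = bhdr & 0x3FF                        # BRCHDR(3)
    return meaning, display, next_node

def node_branches(node_number, nodaces, nodtbl):
    """Decode every branch header stored after the given node's header."""
    start = nodaces[node_number]          # NODACES: node number -> NODTBL index
    header = nodtbl[start]
    n_regular = (header >> 20) & 0x1F     # NODHDR5(2)
    n_error = (header >> 15) & 0x1F       # NODHDR5(3)
    first = start + 1
    return [decode_branch_header(b)
            for b in nodtbl[first:first + n_regular + n_error]]
```

 Because headers are variable in number per node, the NODACES indirection
 is what makes access by node number a constant-time lookup rather than a
 scan of NODTBL.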
 
 The programming/problem solving concepts associated with nodes 
 and branches are kept in a separate CONCTBL of 12-bit segments. There is 
 one CONCTBL entry for each node and branch, in the same order as the node 
 and branch headers in NODTBL. Hence NODACES also gives the correct index 
 into CONCTBL. 
 
 Display messages are coded as a series of indices into a list of 
 display words (DISPLST) to save space. The coded messages are kept in a 
 table of 60-bit words called MSGTBL, and for decoding must be moved to a 
 field called MHDR, which is subdivided into 10-bit segments: 
 MSGHDR(1) — no longer used
 MSGHDR(2) — number of words in display message; indicates 
 
 number of 10-bit segments to be decoded 
 MSGHDR(3) — number of pointers to this message; allows 
 sharing of one message copy by several 
 branches or nodes 
 MSGHDR(4) - MSGHDR(6) — indices into DISPLST for single
 words in the message.
 MSGHDR(7) - MSGHDR(12), etc.
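 The message coding scheme can be illustrated with a short Python
 sketch. It assumes six 10-bit segments per 60-bit word, filled from the
 leftmost bits, and a display word list whose entry 0 is unused; these
 details, the sample word list, and the function names are assumptions
 for illustration only.

```python
WORD_BITS, SEG_BITS = 60, 10
PER_WORD = WORD_BITS // SEG_BITS    # six 10-bit segments per word

def segment(words, i):
    """Read 1-based 10-bit segment i from a list of 60-bit integers."""
    word, pos = divmod(i - 1, PER_WORD)
    shift = WORD_BITS - SEG_BITS * (pos + 1)   # assumption: leftmost first
    return (words[word] >> shift) & 0x3FF

def encode_message(text, displst, pointers=1):
    """Code a message as DISPLST indices behind a three-segment header."""
    indices = [displst.index(w) for w in text.split()]
    segs = [0, len(indices), pointers] + indices   # MSGHDR(1), (2), (3), ...
    segs += [0] * (-len(segs) % PER_WORD)          # pad out the last word
    words = []
    for k in range(0, len(segs), PER_WORD):
        packed = 0
        for s in segs[k:k + PER_WORD]:
            packed = (packed << SEG_BITS) | s
        words.append(packed)
    return words

def decode_message(words, displst):
    """Rebuild the message text from its coded form."""
    count = segment(words, 2)                      # MSGHDR(2)
    return " ".join(displst[segment(words, 3 + k)]
                    for k in range(1, count + 1))
```

 With a hypothetical word list `["", "read", "the", "next", "card"]`,
 the four-word message "read the next card" needs seven segments and
 therefore two 60-bit words.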
 
 One of the primary operational structures is the active AND stack 
 (ANDSTK) which forms the basis for the depth-first search algorithm. ANDSTK 
 is an array of 30-bit segments, each of which contains status information 
 on a single partially completed AND node. The index of the top entry in 
 ANDSTK is indicated by ANDPTR. Status information is accessed by removing 
 a 30-bit segment to AHD30, which has the following subfields: 
 
 AHD12(1) — 12-bit segment containing the node number of 
 the AND node whose status is described 
 
 AHD2(7) through AHD2(15) — 2-bit segments indicating the
 traversal status of each regular branch leaving that node. If
 -1 — branch not yet described
 0 — branch described but not yet traversed
 1 — branch traversed
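 An ANDSTK entry can be modelled as follows. In this Python sketch the
 12-bit node number occupies the leftmost bits, so that the 2-bit
 segments 7 through 15 cover the remaining 18 bits and describe up to
 nine branches. The two's complement coding of -1 and all function names
 are assumptions for illustration.

```python
NOT_DESCRIBED, DESCRIBED, TRAVERSED = -1, 0, 1

def _shift(branch):
    """Branch b (1..9) is held in AHD2(b + 6)."""
    return 30 - 2 * (branch + 6)

def get_status(entry, branch):
    raw = (entry >> _shift(branch)) & 0b11
    return raw - 4 if raw & 0b10 else raw   # assumed two's complement

def set_status(entry, branch, status):
    s = _shift(branch)
    return (entry & ~(0b11 << s)) | ((status & 0b11) << s)

def new_entry(node_number, n_branches):
    """Entry for an AND node none of whose branches has been described."""
    entry = node_number << 18               # AHD12(1)
    for b in range(1, n_branches + 1):
        entry = set_status(entry, b, NOT_DESCRIBED)
    return entry

def node_number(entry):
    return entry >> 18

def and_node_done(entry, n_branches):
    """An AND node is complete once every regular branch is traversed."""
    return all(get_status(entry, b) == TRAVERSED
               for b in range(1, n_branches + 1))
```

 A depth-first traversal would pop the entry from the stack once
 `and_node_done` holds, and otherwise descend along the next branch that
 is not yet traversed.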
 
 The PROC stack (PRCSTK) is an array of 12-bit segments containing 
 the numbers of PROC nodes which must be traversed. The index of the top 
 entry on the stack is contained in PSTKPTR. 
 
 Additional stacks are needed for the depth-first lookahead search 
 for branches matching a student input. The two stacks are NSTK, consisting 
 of 12-bit segments which contain the numbers of nodes on the lookahead path 
 from the current node, and BSTK, consisting of 6-bit segments containing 
 the number of the last branch traversed from the corresponding NSTK node during 
 lookahead. The index of the top entry for these two stacks is contained in 
 PTR. 
 
 OR nodes which are determined to have been satisfied during 
 lookahead are placed on a list so that they are simply traversed when they 
 become the current node. This list is SATOR20, consisting of 20-bit 
 segments, all but the first of which are divided into two subfields 
 
 SATOR20(1) — points to the first open element in
 the list of satisfied OR nodes

 SATOR10(odd) — 10-bit segment containing the number of
 an OR node satisfied during lookahead

 SATOR10(even) — 10-bit segment indicating which branch
 to follow out of the OR node pointed at
 by the immediately preceding SATOR10(odd)
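 The satisfied-OR list can be sketched in Python as a small class. The
 sketch keeps one integer per 20-bit segment, reserves element 1 for the
 first-open-element pointer, and stores each node and branch pair as two
 10-bit halves of one element. The class and method names are
 hypothetical, as is the choice of 2 as the initial pointer value.

```python
class SatisfiedOrList:
    """List of OR nodes already satisfied during lookahead."""

    def __init__(self, capacity):
        # Index 0 is unused so that indices match the 1-based text.
        self.seg20 = [0] * (capacity + 2)
        self.seg20[1] = 2                # SATOR20(1): first open element

    def add(self, or_node, branch):
        """Record that `or_node` is satisfied by following `branch`."""
        k = self.seg20[1]
        self.seg20[k] = (or_node << 10) | branch   # odd half, even half
        self.seg20[1] = k + 1

    def branch_for(self, or_node):
        """Return the recorded branch, or None if the node is not listed."""
        for k in range(2, self.seg20[1]):
            if self.seg20[k] >> 10 == or_node:
                return self.seg20[k] & 0x3FF
        return None
```

 When such a node later becomes the current node, `branch_for` tells the
 traversal which branch to take without consulting the student again.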
 
 The student model variables consist of three arrays associated 
 with the concepts, plus three other counters. The three arrays are: 
 
 CNUMUSE — 12-bit counters, one for each concept,
 indicating the number of times that concept
 has been correctly employed

 CNUMCH — 12-bit counters, one for each concept,
 indicating the number of opportunities the
 student has had to correctly employ that
 concept

 INCR — 3-bit segments, one for each concept,
 containing an increment to add to the
 corresponding CNUMCH the next time that
 concept forces display of a node. CNUMUSE
 is always increased by one.

 The three (60-bit) counters are:

 TOTLEVS — total number of levels spanned by all
 student inputs

 NUMTRIS — total number of attempts at all nodes

 NUMNODS — total number of nodes the student has been
 asked to submit refinements at
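 The bookkeeping described above can be sketched in Python. The update
 rule follows a literal reading of the INCR description; the two derived
 ratios are hypothetical summary measures added for illustration, and
 the text does not say that PATTIE computes them. The class and method
 names are likewise assumptions.

```python
class StudentModel:
    """Per-concept counters plus the three global counters."""

    def __init__(self, n_concepts):
        self.cnumuse = [0] * n_concepts   # times each concept was used correctly
        self.cnumch = [0] * n_concepts    # opportunities to use each concept
        self.incr = [1] * n_concepts      # assumption: default increment of 1
        self.totlevs = 0
        self.numtris = 0
        self.numnods = 0

    def update(self, concept):
        """Apply the stated rule: CNUMCH grows by INCR, CNUMUSE by one."""
        self.cnumch[concept] += self.incr[concept]
        self.cnumuse[concept] += 1

    def use_ratio(self, concept):
        """Hypothetical measure: correct uses per opportunity."""
        if self.cnumch[concept] == 0:
            return None
        return self.cnumuse[concept] / self.cnumch[concept]

    def tries_per_node(self):
        """Hypothetical measure: average attempts per node."""
        return self.numtris / self.numnods if self.numnods else None
```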
 
 The final set of data structures are those concerned with 
 maintaining the screen display. The primary structures are: 
 
 SCRNTBL — one 5-word element for each line in the program
 area. Holds 50 characters to allow immediate
 regeneration of screen display
 STMTNUM — one 60-bit word for each line in the program
 area. Holds task name for each line
 TOP — line number of uppermost blank line in the
 program area (just below current task)
 BOTTOM — line number of lowest blank line in the
 program area (just above the program area
 stack)
 CURSTMT — task name of current task
 SCRNPRG — one 6-bit segment for each line in the program
 area, contains the total number of characters
 of PL/1 code currently contained in that line
 SCRNCOD — one 2-bit flag for each line of the program
 area. If
 1 — PL/1 code currently on the corresponding line
 0 — no PL/1 code on that line
 DCLIST — one 6-bit segment for each line in the program
 area. Each segment contains two subfields:
 DCLHDR3(1) — type of variable declared on that line
 0 — no variable
 1 — CHAR variable
 2 — FIXED variable
 3 — FLOAT variable
 DCLHDR3(2) — number of variables declared on that line
 
 The final data structure is a stack (PRGSTK) containing one 30-bit 
 entry for each active AND node, indicating the number of lines on the 
 screen which will be consumed by display of branches leaving that node. 
 
 Each 30-bit segment has several subfields:

 ANDPRG6(1) — 6-bit segment containing total number
 of lines for that node. DCL statements
 are not included in the count

 ANDPRG3(3) through ANDPRG3(10) — 3-bit segments indicating
 number of lines on screen consumed by branch i-2, where
 i is the index. Σ ANDPRG3(i) = ANDPRG6(1)
 
 The top entry in PRGSTK is indicated by ANDPTR, the stack pointer for the 
 active AND stack. 
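 The relation between the total and the per-branch line counts can be
 checked with a short Python sketch. It assumes the 6-bit total occupies
 the leftmost bits, so that 3-bit segments 3 through 10 cover the
 remaining 24 bits and describe up to eight branches; the function names
 are hypothetical.

```python
def pack_prg_entry(lines_per_branch):
    """Build a 30-bit PRGSTK entry from per-branch line counts."""
    if len(lines_per_branch) > 8:
        raise ValueError("at most eight branches fit in one entry")
    if any(not 0 <= n <= 7 for n in lines_per_branch):
        raise ValueError("each count must fit in 3 bits")
    entry = sum(lines_per_branch) << 24          # ANDPRG6(1): total lines
    for branch, n in enumerate(lines_per_branch, start=1):
        entry |= n << (30 - 3 * (branch + 2))    # ANDPRG3(branch + 2)
    return entry

def unpack_prg_entry(entry):
    """Return (total, per-branch counts) and check that they agree."""
    total = entry >> 24
    counts = [(entry >> (30 - 3 * (b + 2))) & 0b111 for b in range(1, 9)]
    if sum(counts) != total:
        raise ValueError("per-branch counts do not sum to the total")
    return total, counts
```

 Eight branches of at most seven lines each give a maximum total of 56,
 which fits within the 6-bit field.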
 
 VITA 
 
 The author, Ronald Lee Danielson, was born in Duluth, Minnesota,
 on October 5, 1945. He was graduated from the University of Minnesota, 
 Duluth, with a Bachelor of Arts degree (summa cum laude) in mathematics 
 in June, 1967. He received a Master of Science degree in mathematics from 
 Northwestern University in August, 1968. Since September, 1971, he has 
 been a teaching and research assistant in the Department of Computer Science 
 of the University of Illinois. He is a member of the Association for 
 Computing Machinery. 
 
 BIBLIOGRAPHIC DATA SHEET

 1. Report No.: UIUCDCS-R-75-753

 4. Title and Subtitle: PATTIE: AN AUTOMATED TUTOR FOR TOP-DOWN PROGRAMMING

 5. Report Date: October, 1975

 7. Author(s): Ronald Lee Danielson

 9. Performing Organization Name and Address:
 Department of Computer Science
 University of Illinois at Urbana-Champaign
 Urbana, Illinois 61801

 11. Contract/Grant No.: NSF EC-41511

 12. Sponsoring Organization Name and Address:
 National Science Foundation
 Washington, D.C.

 16. Abstracts

 The Department of Computer Science at the University of Illinois has
 undertaken a project to automate a substantial portion of its introductory computer
 science courses on the PLATO computer-assisted instruction (CAI) system. Experience
 with teaching programming skills has indicated the importance of teaching beginning
 students how to develop a program, as well as specific programming languages. This
 thesis describes a CAI tutor for top-down programming which provides a detailed
 example of the stepwise refinement process. The tutor guides the student to a
 solution of a specific problem by judging the correctness of suggested refinements
 input by the student in natural language. An AND-OR graph is used as the model of
 the top-down programming process. Techniques used in the tutor are sufficiently
 general that the design may be used for tutors in other subject areas.

 17. Key Words and Document Analysis. 17a. Descriptors

 computer-assisted instruction
 CAI
 structured programming
 teaching programming
 computer science education
 problem solving
 artificial intelligence applications

 19. Security Class (This Report): UNCLASSIFIED

 20. Security Class (This Page): UNCLASSIFIED
 