Report No. UIUCDCS-R-75-743

ANALYZING SMOOTH FLOWCHARTS: TEACHING STRUCTURED PROGRAMMING IN A COMPUTER-BASED EDUCATION ENVIRONMENT*

by

Daniel Clair Hyde

June, 1975

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

*This work was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science, June, 1975.

Digitized by the Internet Archive in 2013
http://archive.org/details/analyzingsmoothf743hyde

To my wife Mary Jane because without her encouragement this thesis would not be possible.

ACKNOWLEDGEMENT

The author wishes to express his gratitude to Professor Sylvian R. Ray for his guidance during the preparation of this thesis. I would like to thank the Department of Computer Science and the Computer-based Education Research Laboratory for their financial support. Thanks are also due to my colleagues Prabhaker Mateti and Kishor Trivedi for valuable discussions during the course of this research.

Finally, I wish to thank my wife Mary Jane for her encouragement and patience. I am also indebted to her for typing the several drafts of this thesis.

TABLE OF CONTENTS

Chapter

One INTRODUCTION
1.1 Statement of Research Problem
1.2 The PLATO IV Computer-Based Educational System
1.3 The Constraints in a Computer-Based Education Environment
1.4 Teaching via Structured Programming
1.5 Grading Student-Generated Flowcharts
1.6 Brief Description of Semantic Formulation Method (SFM)
1.7 Research Methodology

Two REFINEMENT OF THE PROBLEM AND SURVEY OF THE RELEVANT LITERATURE
2.1 Restatement of the Problem
2.2 Smooth Flowcharts
2.2.1 Software Engineering
2.2.2 Structured Programming
2.2.3 Informal Description of Smooth Flowcharts
2.2.4 Formal Definition of Smooth Flowcharts via Web Grammar
2.2.5 Why Flowcharts and Structured Programming?
2.3 Grading Student-Generated Flowcharts
2.3.1 Grading by the Test Data Method
2.3.2 Grading by the Inductive Assertion Method of Program Verification
2.3.3 Comparison of SFM with Test Data and Inductive Assertion Methods
2.4 Equivalence of Programs
2.4.1 Global Semantic Equivalence
2.5 Artificial Intelligence Methodology
2.6 Grader vs. Tutor
2.7 Summary of Research Methodology

Three PROGRAM TO ANALYZE SMOOTH FLOWCHARTS (PASF)
3.1 Anticipated Student Population Using PASF
3.2 Use of PASF in a Course
3.3 Instructional Portion of PASF
3.4 Program Laboratory of PASF
3.5 Input Section of PASF
3.5.1 Input Options of PASF
3.5.2 Free vs. Structured Approaches of Inputting
3.5.3 Fast Feedback on Syntax Errors
3.5.4 Experiment to Test Utility of Input Routine of PASF
3.5.5 List Structure Representation of Student's Flowchart
3.5.6 Layout of Boxes and Arrows by PASF
3.5.7 SIM--the Programming Language of PASF
3.5.8 Intdivide--a Sample Flowchart
3.5.9 English Phrases inside Boxes
3.5.10 Summary of Input Section of PASF
3.6 Checking Student-Generated Flowcharts
3.6.1 Check for Legal Flowchart
3.6.2 Check for Legal Smooth Flowchart
3.6.3 General Heuristic Checks
3.6.4 Regular Expression Representation of Smooth Flowchart
3.6.5 Usefulness of RE Representation of Smooth Flowcharts
3.6.6 Heuristic Check for Infinite Loops
3.6.7 Algorithm Specific Heuristic Checks of PASF
3.7 The Deduction Scheme of PASF
3.7.1 Standard Regular Expression Representation (SRE) of Smooth Flowchart
3.7.2 Block Diagram of Deduction Scheme
3.7.3 The SELECTOR of Deduction Scheme
3.7.4 The MATCHOR of PASF
3.7.5 The REDUCOR of PASF
3.7.6 The TRANSLATOR R of PASF
3.7.7 Conditions Which Allow Interchanging of Portions of Student's SRE
3.7.8 The TRANSLATOR B of PASF
3.7.9 Error Detection in the Deduction Scheme
3.7.10 Summary of the Deduction Scheme of PASF
3.8 Summary of PASF

Four THE ROLE OF THE INSTRUCTOR IN PASF
4.1 Design of Exercises
4.1.1 Importance of an Instructor's Awareness of the Scope of the Programming Domain Permitted by PASF
4.1.2 Design of an Exercise Off-Line
4.1.3 Design of an Exercise On-Line
4.1.4 English Phrases in Flowchart Boxes
4.1.5 The Name of the Exercise
4.1.6 The Correct Smooth Flowchart
4.1.7 Answers to a Series of Questions
4.1.8 Storage of Instructor's Flowchart and Data
4.1.9 Summary of Inputting an Exercise
4.2 Data, Collected by PASF, Allowing Instructor to Check Student Comprehension

Five DETAILS OF FIVE ALGORITHMS IN PASF
5.1 Introduction
5.2 Connector Algorithm
5.3 Algorithm to Check the Legality of a Flowchart
5.4 Algorithm to Check Smoothness of a Flowchart
5.4.1 Introduction
5.4.2 Problems Due to Differences in the Web Grammar and List Structure Representations of Flowcharts
5.4.3 Marking the J's in the List Structure
5.4.4 Brief Description of the Smooth Checker
5.4.5 Details of SRI Routine
5.4.6 Details of the Routine SR
5.4.7 Termination of the Smooth Checker
5.5 Regular Expression Representation Generator
5.6 Algorithm for Translator Check

Six CONCLUSIONS AND FUTURE RESEARCH
6.1 Conclusions
6.2 Future Research

BIBLIOGRAPHY
APPENDIX
VITA

LIST OF FIGURES

Fig.
2-1. The Three Possible Constructs in a Smooth Flowchart
2-2. Web A
2-3. Smooth Flowchart A
2-4. Three Entities of the GSE Model
3-1. Copy of PLATO IV Screen Showing Input Options
3-2. Chart of Options to Touch
3-3. A Node in the List Structure
3-4. INFO is made up of 4 Subfields
3-5. Arrangement of Boxes on PLATO Screen
3-6. Copy of PLATO IV Screen Displaying Maryjane's Algorithm "Intdivide"
3-7. Equivalence of Smooth Flowcharts and Regular Expression Representation
3-8. RE Form of Smooth Flowchart C
3-9. A Smooth Flowchart D with Its RE Form
3-10. RE Form of Integer Divide Algorithm of Fig. 3-6
3-11. RE Form for Integer Divide to Demonstrate Loop Heuristic, i.e., x1 Changes inside the Loop
3-12. Example in Which Loop Heuristic Passes, but the Flowchart has an Infinite Loop at Execution Time
3-13. Procedure Call Analogy of SRE Form
3-14. Block Diagram of Deduction Scheme
3-15. Two Trivial Flowcharts to Demonstrate SELECTOR
3-16. REs and SREs of Flowcharts in Fig. 3-15
3-17. Example of MATCHOR
3-18. Example with English Phrases in Diamond
3-19. The Goal of REDUCOR: To "Reduce" both SREs to One P
3-20. Comparison of a P and Its Black Box Representation
3-21. Case 0 Illustrated
3-22. Combined Black Box for Case 0
3-23. Case 1 Illustrated
3-24. Combined Black Box for Case 1
3-25. Case 2 Illustrated
3-26. Combined Black Box P3 for Case 2
3-27.
Maybe-Output Variables Arise in Loops and IFTHENELSEs
3-28. Flowchart Segment Illustrating Case 3
3-29. Combined Black Box B6 for Case 3
3-30. Case 4 Illustrated
3-31. Combined Black Box P7 for Case 4
3-32. Case 5 Illustrated

[BNF grammar of SIM. The nonterminal names did not survive extraction. The recoverable right-hand sides cover: text for ovals, diamonds, and rectangular-shaped boxes respectively; start | stop; assignments using "←"; read; print; the relational operators; the arithmetic operators (including +, -, x, and ↑); text up to a maximum number of characters; the letters a | b | c ... | y | z; and the digits 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9.]

Briefly, SIM allows

1. "Start" or "stop" in oval boxes,
2. Expressions like "A > 10.0" in diamonds, and
3. "Read cat," "print dog," and "dog ← -12 ↑ cat" in rectangles.

Notice that there are no IF, GOTO, or DO statements. The control structure is the smooth flowchart itself. Smooth flowcharts allow DOWHILE and IFTHENELSE constructs. To simplify the programming domain for the student (this is the student's first or second week in the course), there are no FORMATs for input/output, no arrays, and no procedure calls. Data types other than real variables and constants are not allowed.

The hope is to have a simple programming language which the introductory nonmajor student can master quickly. For example, this language does not have a hierarchy of operators in expressions, which introductory students often find confusing. In SIM, the student must use several statements to compute what a single compound expression would compute in another language. This does lead to inefficient use of temporary variables, but this is not considered detrimental, since the main concern is to have the student write a correct solution and not to achieve efficiency.

Use of expressions with boolean operations, e.g., "a = b AND z = 10," is also not allowed. This is not a drawback, for the "AND" can be handled by the control structure. One rationale for a programming language with short statements is that the space inside the boxes (maximum of 16 characters) is limited.
To advanced programmers, SIM may appear too restrictive, but for the class of problems and the anticipated audience, it is adequate.

3.5.8 Intdivide--a Sample Flowchart

Fig. 3-6 is an example of a ten-box flowchart of an algorithm which does an integer divide by successive subtraction.

3.5.9 English Phrases inside Boxes

The domain of the allowable student programs is any smooth flowchart of twenty-one boxes or fewer containing SIM. One other feature, English phrases, is also in the domain. The intended use of PASF is to teach step-wise refinement; therefore, English phrases must be allowed inside diamonds and rectangles. Since PASF needs to distinguish between SIM and English phrases, the student specifies whenever he/she wants to type in an English phrase. The English phrase must match what the instructor has designed to be permitted for a specific exercise. A discussion of this restricted natural language feature and how it is implemented will appear in Chapter Four.

[PLATO IV screen. Student: maryjane. Algorithm: intdivide. Boxes shown: start; quot ← 0; read x1; read x2; quot ← quot + 1; x1 ← x1 - x2; print quot; print x1; stop. Touch options: draw, connect, text, erase, special. Prompt: touch option wanted.]

Fig. 3-6. Copy of PLATO IV Screen Displaying Maryjane's Algorithm "Intdivide"

3.5.10 Summary of Input Section of PASF

The input section of PASF allows a student, using the touch panel, to "draw" a flowchart of up to twenty-one boxes. Inside the boxes a student may type English phrases or programming code from SIM. The student is given feedback on illegal operations and incorrect syntax as soon as they occur. When the student is ready to have his/her flowchart graded, he/she asks PASF to check it.

3.6 Checking Student-Generated Flowcharts

Since the students are beginners in programming, the checking part of PASF expects student-generated errors. PASF's goal is to find these errors and report them quickly to the student.
A top-down approach, searching first for gross errors and then for subtler errors, is used. Since finding gross errors typically requires less computation, this hierarchical approach burdens the machine less. When an error is found, it is reported to the student, who must then make the necessary correction.

3.6.1 Check for Legal Flowchart

The first check determines whether the student's array of boxes and arrows is a legal flowchart. A legal flowchart consists of one "start," at least one "stop," and all boxes reachable from "start" by traversing the flowchart. A further constraint is that there must be text in every box. The details of the algorithm to check for a legal flowchart, as well as other pertinent algorithms, are covered in Chapter Five. If the flowchart is legal, the student is told that it is a good flowchart, and that he/she should press a key to continue checking.

3.6.2 Check for Legal Smooth Flowchart

A basic assumption for PASF is that the student's flowchart is also a smooth flowchart. The next step is an algorithm to check for smoothness. The author's algorithm SR to check for smoothness is given in detail in Chapter Five. A student requires more feedback than the fact that his/her flowchart is not smooth; he/she wants to know where and why. If the flowchart is not smooth, algorithm SR (cf. Section 5.4) tells the student the offending diamond box, the offending arrow (if the arrow can be determined), and the reason why it is not smooth. As stated before, if a student is following the step-wise refinement procedure, he/she cannot generate a nonsmooth flowchart. In the exercises tried, almost all student flowcharts are smooth. If the flowchart is smooth, a message appears informing the student and telling him/her to press a key to go on.

3.6.3 General Heuristic Checks

The next portion of PASF performs a series of heuristic checks on the student's flowchart. Many students repeatedly make the same types of errors.
These heuristic routines attempt to discover common but easily detected errors. For example, undefined variables are common student errors. A simple routine finds variables that the student has forgotten to initialize or has misspelled. Many heuristic checks are conceptually simple, but nontrivial to execute on a list structure representation. This difficulty initiated a search by the author for an easily manipulated representation of the flowchart.

3.6.4 Regular Expression Representation of Smooth Flowchart

The discovery that a smooth flowchart can be represented as a string allows one to execute many heuristic checks efficiently. Each one of the three constructs of smooth flowcharts shown below can be represented by a string closely akin to regular expressions (RE).

Construct         RE form
concatenation     ab
DOWHILE loop      (d)*
IFTHENELSE        (e + f)

Fig. 3-7. Equivalence of Smooth Flowcharts and Regular Expression Representation

Two rectangles are equivalent to the concatenation of two strings. The DOWHILE, which means to do everything inside the loop zero or more times, is of the form of the Kleene star. The IFTHENELSE is an operator of an "OR" on processes. A few conventions need to be agreed upon before the RE form can completely represent a smooth flowchart. Processes executed first are on the left. A "start" is on the far left of the string and a "stop" is on the far right of the string. Since a smooth flowchart always has one "start" and one "stop," one does not need to write them. From this point, the "start" on the left end and the "stop" on the right end will be understood.

RE form of smooth flowchart C: (a)*[α] b

Fig. 3-8. RE Form of Smooth Flowchart C

Since every diamond corresponds to a part of a DOWHILE or an IFTHENELSE, the condition inside the diamond will be attached to the corresponding operator "*" or "+" and enclosed in "[ ]"s.
Associated with a diamond box will be a "T" or an "F" to determine which arrow to follow, depending on whether the condition is true or false. For an IFTHENELSE, the true side is defined to be on the left of the "+" in the RE form. For the DOWHILE, the true side is defined to be the branch that loops. If the false side in a flowchart loops, this will necessitate interchanging the "T" and "F" sides by negating the condition.

ab (e (f + [β] g) h)*[¬α] cd

Fig. 3-9. A Smooth Flowchart D with Its RE Form

Embedded parentheses act in the usual manner as in arithmetic expressions. Below is the RE form of the integer divide smooth flowchart in Fig. 3-6.

quot ← 0 read x1 read x2 (quot ← quot + 1 x1 ← x1 - x2)*[x1 > x2] print quot print x1

Fig. 3-10. RE Form of Integer Divide Algorithm of Fig. 3-6

3.6.5 Usefulness of RE Representation of Smooth Flowcharts

How is this new representation more suited to PASF's needs? In a list structure, one must traverse; in a string, one needs only to scan. Returning to the question of a routine to determine whether any variables are undefined: the RE representation makes this trivial. For each variable var from the symbol table, one needs to scan for "read var" and "var ←". A variable for which neither is found is undefined. This routine is not only faster to execute than the list structure counterpart, but also uses less storage, is easier to program, and is easier to verify.

As noted in Chapter One, the programming language TUTOR does not have pointers, stacks, recursive procedures, or other list processing features. Any list processing is programmed at a primitive level similar to programming in FORTRAN; any nontrivial routine using list processing is a major effort in programming and verification. On the other hand, TUTOR does have efficient commands to scan strings of characters. These considerations make the RE representation especially suitable.

Before any heuristic checks are performed, the RE representation of the student's smooth flowchart is computed.
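The undefined-variable scan of Section 3.6.5 can be sketched in a modern language as follows. This is only an illustrative sketch and not PASF's TUTOR code: the function name `undefined_variables`, the ASCII arrow `<-` standing in for "←", and the use of Python's `re` module are all assumptions made for the example.

```python
import re


def undefined_variables(re_form, symbol_table):
    """Return the variables that are never given a value anywhere in re_form.

    A variable counts as defined if the string contains "read var" or
    "var <-" (the ASCII arrow is an assumed stand-in for the SIM arrow).
    """
    missing = []
    for var in symbol_table:
        name = re.escape(var)
        is_read = re.search(rf"\bread\s+{name}\b", re_form)
        is_assigned = re.search(rf"\b{name}\s*<-", re_form)
        if not (is_read or is_assigned):
            missing.append(var)
    return missing


# Hypothetical student attempt in which "read x2" was forgotten.
student_re = (
    "quot <- 0 read x1 (quot <- quot + 1 x1 <- x1 - x2)*[x1 > x2] "
    "print quot print x1"
)
print(undefined_variables(student_re, ["quot", "x1", "x2"]))
```

As in the text, the scan only asks whether a defining occurrence exists somewhere in the string; it does not check that the definition precedes the first use.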
This computation is done by a slightly modified version of algorithm SR, which checks whether a student's flowchart is smooth or not (cf. Section 5.5). Besides the check to determine whether the variables are undefined, a series of other heuristic checks are done. One more check will be discussed in the next section.

3.6.6 Heuristic Check for Infinite Loops

For all DOWHILE loops, at least one variable in the condition of the diamond must change inside the loop. If no such variable changes, the loop cannot terminate once it is entered, so the flowchart contains an infinite loop. To further illustrate the utility of the RE form, the following process describes this algorithm:

quot ← 0 read x1 read x2 (quot ← quot + 1 x1 ← x1 - x2)*[x1 > x2] print quot print x1

(Annotations in the figure mark the left end of the loop, the loop "*", and the loop variables.)

Fig. 3-11. RE Form for Integer Divide to Demonstrate Loop Heuristic, i.e., x1 Changes inside the Loop

First, the program scans for a "*" (a symbol chosen to distinguish the loop from "x," which indicates multiplication). Next, the program determines the variables (or loop variables) in the condition by looking a few characters ahead of the "*". The program goes character by character left from the "*" with a pointer, incrementing a counter by one for every ")" and decrementing the counter by one for every "(". The scanner is at the left end of the loop when it encounters a "(" that brings the count to zero. The program scans from this "(" to the "*," looking for the loop variables to be reassigned. Only one loop variable need be reassigned inside the loop for the heuristic check to pass. This is done for all remaining "*"s.

This heuristic check does not guarantee that the variable will actually be changed when the program is executed. In the following example, the heuristic test will pass, but during execution "b" is always "1" and "a ← a + 1" is never done. Therefore, the flowchart in Fig. 3-12 loops forever.

a ← 0 b ← 1 ((c ← 1 + [b = 1] a ← a + 1))*[a < 10] print c

Fig. 3-12. Example in Which Loop Heuristic Passes, but the Flowchart has an Infinite Loop at Execution Time

The heuristic is unable to detect this case, since it assumes every control (physical) path is executed. Not every control path is an execution path, as the above example shows. Testing all execution paths is impractical [Krause, Smith, and Goodwin, 1973]. Furthermore, the set of execution paths for a program will change depending on the data. Run-time dependent analysis of programs is extremely difficult and not attempted in this thesis.

The new trend in programming language design is to allow the compiler to do as much optimization, error detection, and verifying as possible. This was a major design criterion in Wirth's computer language PASCAL [Wirth, 1975]. Run-time data dependencies, by definition, can never be checked by a compiler. Such dependency checks, e.g., subscript out of range, are expensive when done at run time.

After the student's flowchart passes all these heuristic checks, he/she is told, "Fine so far. Now I'll look to see if your flowchart is close to algorithm 'intdivide'."

3.6.7 Algorithm Specific Heuristic Checks of PASF

The previous section of PASF performs heuristic tests suitable for any one of a large class of algorithms. This portion performs algorithm-specific heuristic tests with the necessary information inputted by the instructor. Again, the RE representation of the student's flowchart is used for efficient execution of the tests. The student's flowchart is checked for the correct number of "reads," "prints," DOWHILEs, IFTHENELSEs, English phrases, and other features. If the instructor has indicated that a correct solution should have at least two "read"s, the student is told he/she is missing a "read" (if his/her flowchart contains only one "read"). The student is forced to fix any errors the heuristic tests discover.
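The string scans behind the loop heuristic of Section 3.6.6 and the feature counts of this section can be sketched as follows. This is a hedged illustration rather than the TUTOR implementation: the function names `loop_start`, `loops_failing_heuristic`, and `feature_counts` are hypothetical, `<-` is an assumed ASCII stand-in for "←", and only assignments (not "read") are treated as reassigning a loop variable.

```python
import re


def loop_start(re_form, star):
    """Index of the "(" that opens the loop whose "*" sits at index star.

    Scans left from the "*": +1 for every ")", -1 for every "(", and stops
    at the "(" that brings the count back to zero.
    """
    count = 0
    for i in range(star - 1, -1, -1):
        if re_form[i] == ")":
            count += 1
        elif re_form[i] == "(":
            count -= 1
            if count == 0:
                return i
    raise ValueError("no matching '(' for the loop")


def loops_failing_heuristic(re_form):
    """Conditions of loops in which no condition variable is reassigned."""
    failing = []
    for match in re.finditer(r"\*\s*\[([^\]]*)\]", re_form):
        condition = match.group(1)
        loop_vars = set(re.findall(r"[a-z][a-z0-9]*", condition))
        body = re_form[loop_start(re_form, match.start()):match.start()]
        reassigned = any(
            re.search(rf"\b{re.escape(v)}\s*<-", body) for v in loop_vars
        )
        if not reassigned:
            failing.append(condition)
    return failing


def feature_counts(re_form):
    """Counts of a few features an instructor might specify for an exercise."""
    return {
        "read": len(re.findall(r"\bread\b", re_form)),
        "print": len(re.findall(r"\bprint\b", re_form)),
        "loop": re_form.count("*"),
    }


intdivide = (
    "quot <- 0 read x1 read x2 (quot <- quot + 1 x1 <- x1 - x2)*[x1 > x2] "
    "print quot print x1"
)
fig_3_12 = "a <- 0 b <- 1 ((c <- 1 + [b = 1] a <- a + 1))*[a < 10] print c"
print(loops_failing_heuristic(intdivide), loops_failing_heuristic(fig_3_12))
print(feature_counts(intdivide))
```

Note that the Fig. 3-12 string passes this check even though it loops forever at execution time, which is exactly the limitation discussed in Section 3.6.6.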
Finding the number of loops is trivial for PASF: one line of TUTOR code (SEARCH command) is all that is needed to scan the RE representation. These algorithm-specific heuristics assure PASF, before it initiates the deduction scheme, that the student's attempt at the algorithm is reasonably close to the correct answer. This next and last phase of PASF, the deduction scheme, is expensive in terms of time and computation. Up to this point in the program, PASF's response to a student's input has been fast, taking less than a second of real time. The deduction scheme may take as long as sixty seconds of real time.

3.7 The Deduction Scheme of PASF

The deduction scheme of PASF attempts to show that the student's flowchart is global semantic equivalent (GSE) (cf. Section 2.4.1) to the instructor's flowchart. PASF's goal here is to detect errors and their location as well as to tell the student that his/her flowchart is not correct.

3.7.1 Standard Regular Expression Representation (SRE) of Smooth Flowchart

Before the deduction begins on the two flowcharts, they are transformed into a new representation. The instructor's flowchart, stored in the RE form, is retrieved from read-only-memory (TUTOR Common); the student's flowchart is already in the RE form. This new representation preserves the RE form, but translates each statement, e.g., "a ← a + 1," into a standard syntax. This new representation is called the Standard Regular Expression (SRE) representation. In SRE each statement is of the form:

type  sequence-number  input-list : output-list ;

The possible types of statements in SIM and their new syntax are shown below:

Table 2
Transformation from SIM to Standard Regular Expression (SRE) Form

Statement Type in SIM     SRE Form
a ← 1                     int 1: a;
b ← c                     set c: b;
d ← e + f                 add e f: d;
a ← b - c                 sub b c: a;
d ← e x f                 mul e f: d;
a ← b ÷ c                 div b c: a;
print a                   out a: ;
read b                    inp : b;
s1                        s1
s2                        s2
s3                        s3
s4                        s4

In all the above SRE forms, the sequence number, the use of which will be apparent later, is zero.
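The statement-by-statement translation of Table 2 can be sketched as below. This is a minimal sketch under stated assumptions, not the routine used in PASF: the function name `to_sre` is hypothetical, tokens are assumed to be separated by spaces, `<-` and `/` are ASCII stand-ins for "←" and "÷", and the (zero) sequence number is omitted from the output just as it is in the table.

```python
import re

# Assumed ASCII spellings of the four SIM arithmetic operators.
OPERATORS = {"+": "add", "-": "sub", "x": "mul", "/": "div"}
NUMBER = re.compile(r"[+-]?\d+(\.\d*)?")


def to_sre(statement):
    """Translate one SIM statement into the SRE syntax of Table 2."""
    tokens = statement.split()
    if len(tokens) == 2 and tokens[0] == "read":
        return f"inp : {tokens[1]};"
    if len(tokens) == 2 and tokens[0] == "print":
        return f"out {tokens[1]}: ;"
    if len(tokens) >= 3 and tokens[1] == "<-":
        target, right = tokens[0], tokens[2:]
        if len(right) == 1:
            # A constant on the right gives "int"; a variable gives "set".
            kind = "int" if NUMBER.fullmatch(right[0]) else "set"
            return f"{kind} {right[0]}: {target};"
        if len(right) == 3 and right[1] in OPERATORS:
            # Inputs keep the order in which they appear in the statement.
            return f"{OPERATORS[right[1]]} {right[0]} {right[2]}: {target};"
    raise ValueError(f"not a recognized SIM statement: {statement!r}")


for sim in ["a <- 1", "b <- c", "d <- e + f", "print a", "read b"]:
    print(sim, "->", to_sre(sim))
```

Keeping the inputs in source order matters for the non-commutative operators, since "sub b c" and "sub c b" denote different computations.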
The s1, s2, s3, and s4 are codes for distinct English phrases. These codes were determined by the input routine when the student typed in the English phrase. The order of the list of input variables and constants is the same as the order in which they appear in the assignment statement, e.g., "d ← b - c" is transformed to "sub b c: d;", not "sub c b: d;".

The second class of output variables, possible output variables (called maybe-output variables in this thesis), arises with DOWHILEs and IFTHENELSEs. The "a ← b" inside a loop may never be executed, since that path may not be taken. In this case, "a" is a "maybe-output variable." A similar situation arises with IFTHENELSEs.

The rationale for this format, which came out of the author's search for a standard format, is that it mimics a procedure call. Every statement is treated as if it were a procedure call with the input and output variables clearly distinguished. (This is similar to the computer language JOVIAL, which uses ":" to distinguish the input and output variables in a procedure call.)

Statement         Procedure Call       SRE Form
c ← a + b         call add(a,b,c)      add a b: c;

Fig. 3-13. Procedure Call Analogy of SRE Form

Below is the SRE representation for the integer divide example given in Fig. 3-6.

int 0: quot; inp : x1; inp : x2; (add quot 1: quot; sub x1 x2: x1;)*[x1 > x2] out quot: ; out x1: ;

The above representation may be hard to read for humans (spaces have been added for easier reading in this text), but it is ideal for the machine. The actual representation in the machine contains numbers from 0 to 63 for the different types and pointers as symbol table references.

The deduction scheme scans and manipulates the SRE representations of the two flowcharts.

3.7.2 Block Diagram of Deduction Scheme

Below is a block diagram of the deduction scheme.

[Block diagram. Labels shown: start; new student alternative; SELECTOR; forms a Semantic Model; Push; MATCHOR; OK; REDUCOR; OK; TRANSLATOR R; TRANSLATOR B.]

Fig. 3-14. Block Diagram of Deduction Scheme

After the student's and instructor's flowcharts are in SRE form, control of PASF is given to the "SELECTOR." The SELECTOR selects portions of the two flowcharts as candidates for being Global Semantic Equivalent (GSE). The SELECTOR calls the "MATCHOR" to test for a match between the two portions. When the SELECTOR has produced GSE candidates for all of the two flowcharts, a Semantic Model is generated. Program control is passed to the "REDUCOR" to search the Semantic Model for inconsistencies. If the REDUCOR finds no inconsistencies, the student's flowchart is GSE to the instructor's, i.e., it is correct.

In certain cases, the REDUCOR calls the "TRANSLATOR R" to perform translations on the student's flowchart. After such translations, control is returned to the REDUCOR. In cases where the MATCHOR, REDUCOR, or TRANSLATOR R "fail," a backtrack is initiated to generate another Semantic Model. In a backtrack, "TRANSLATOR B" modifies the student's SRE representation before control passes to the SELECTOR.

In the following sections, each portion of PASF will be discussed in detail.

3.7.3 The SELECTOR of Deduction Scheme

It is the task of the SELECTOR, with the help of the MATCHOR, to construct Semantic Models for the REDUCOR. The SELECTOR selects statements from the student's SRE and statements from the instructor's SRE as candidates which may be global semantic equivalent (GSE; cf. Section 2.4.1).

Consider the two trivial flowcharts below:

Instructor's flowchart     Student's flowchart
start                      start
a ← 1                      dog ← 1.0
b ← 2                      cat ← 2.0
c ← a - b                  rat ← dog - cat
print c                    print rat
stop                       stop

Fig. 3-15. Two Trivial Flowcharts to Demonstrate SELECTOR

First, PASF computes the RE and SRE representations.

Student's RE        dog ← 1.0 cat ← 2.0 rat ← dog - cat print rat
Instructor's RE     a ← 1 b ← 2 c ← a - b print c
Student's SRE       int 1.0: dog; int 2.0: cat; sub dog cat: rat; out rat: ;
Instructor's SRE    int 1: a; int 2: b; sub a b: c; out c: ;

Fig. 3-16. REs and SREs of Flowcharts in Fig. 3-15

The SELECTOR, which is statement-type oriented, scans the student's SRE for "int" and finds the first statement. It searches for and finds an "int" in the instructor's SRE. Finding one in both, the SELECTOR calls the MATCHOR to determine whether the two are locally consistent. In this case the MATCHOR checks to see whether the two constants are equal numerically. Since "1" has the same value as "1.0," the MATCHOR returns an "OK" (cf. Fig. 3-14). Upon receiving the "OK" from the MATCHOR, the SELECTOR assumes that the two "ints" are GSE and constructs part of the Semantic Model by replacing the "ints" with two "P"s (for Process or Program).

Student's SRE       P1 1.0: dog; int 2.0: cat; sub dog cat: rat; out rat: ;
Instructor's SRE    P1 1: a; int 2: b; sub a b: c; out c: ;

The two P's are given the same sequence number (in this case, "1"). The significance of having any two P's with the same sequence number is that the student's Pi and the instructor's Pi are semantically equivalent, in the programming language sense. They may be GSE, depending on the rest of the SREs. At the moment, they are assumed to be GSE until proven otherwise.

The SELECTOR looks for another "int" and finds an "int" in both SREs. The MATCHOR gives an OK, and the SELECTOR assumes the two are GSE.

Student's SRE       P1 1.0: dog; P2 2.0: cat; sub dog cat: rat; out rat: ;
Instructor's SRE    P1 1: a; P2 2: b; sub a b: c; out c: ;

The SELECTOR searches for "int" and finds none. Searching through all the statement types, the SELECTOR eventually tries the "out."

Student's SRE       P1 1.0: dog; P2 2.0: cat; sub dog cat: rat; P3 rat: ;
Instructor's SRE    P1 1: a; P2 2: b; sub a b: c; P3 c: ;

The SELECTOR searches for "sub" and calls MATCHOR.
Student's SRE       P1 1.0: dog; P2 2.0: cat; P4 dog cat: rat; P3 rat: ;
Instructor's SRE    P1 1: a; P2 2: b; P4 a b: c; P3 c: ;

Since the SELECTOR finds no more statements left with statement-types, it is finished for the time being. The above two SREs constitute the Semantic Model. All the individual pieces of the two flowcharts have been shown to be semantically equivalent. Matching individual pieces does not prove that the pieces fit together properly. Fitting the pieces together is the task of the REDUCOR.

The SELECTOR has other tasks besides generating the Semantic Models. In the process of searching the SRE, the SELECTOR finds all possible alternatives for the candidates to become a P. It pushes the different alternatives, along with the current state of the SREs, onto a stack. Later, TRANSLATOR B will pop this stack in the process of a backtrack. In the example above, the SELECTOR notes the fact that there are two "int"s in each SRE and pushes this fact onto the stack.

An important task for the SELECTOR is handling DOWHILE loops and IFTHENELSEs. The SELECTOR finds the first inmost pair of parentheses in both SREs and tries to show that what is inside the parentheses is GSE. In the SREs below, a loop is embedded inside an IFTHENELSE:

Student's SRE       ( + ( ) * )     (pointers sb and se mark the inner "(" and ")")
Instructor's SRE    ( + ( ) * )     (pointers ib and ie mark the inner "(" and ")")

The SELECTOR searches for statement-types between the pointers sb and se and between the pointers ib and ie; calls the MATCHOR and forms P's as before; forms a subsemantic model of P's; and asks the REDUCOR to prove that what is between sb and se is GSE to what is between ib and ie. If the REDUCOR is successful, one P remains inside each pair of pointers, and the SELECTOR calls the MATCHOR to check whether the conditions in the diamond boxes, e.g., "a ≤ 100," match. If both the REDUCOR and the MATCHOR are satisfied, the SELECTOR reduces the loop of each to a single P. An IFTHENELSE is handled in a similar way. The detailed mechanics of the reduction will be covered later.
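Locating the first inmost pair of parentheses, as the SELECTOR does to set the pointers sb and se (or ib and ie), can be sketched in a few lines. This is an illustrative assumption about one simple way to do it, not a description of the TUTOR code; the function name `first_innermost_parens` is hypothetical.

```python
def first_innermost_parens(sre):
    """Return (begin, end) indices of the first inmost "(" ... ")" pair.

    The first ")" in the string must close an inmost pair, and its partner
    is the nearest "(" to its left. Returns None when no parentheses remain.
    """
    end = sre.find(")")
    if end == -1:
        return None
    begin = sre.rfind("(", 0, end)
    if begin == -1:
        raise ValueError("unbalanced parentheses")
    return begin, end


# A loop embedded inside an IFTHENELSE, with P's standing for matched pieces.
student = "P1 (P2 + (P3 P4)* P5) P6"
sb, se = first_innermost_parens(student)
print(student[sb:se + 1])
```

Because a single left-to-right scan for ")" suffices, the search fits the string-scanning style that the RE and SRE representations were chosen to support.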
In the process of this reduction, the pair of parentheses is eliminated. Now the SELECTOR scans the total SREs to find the next inmost pair of parentheses and continues in like manner until all parentheses are removed from the SREs.

A final task of the SELECTOR is to function as the executive: it is the master that calls the other routines.

3.7.4 The MATCHOR of PASF

The MATCHOR guarantees for the SELECTOR that a piece of the student's flowchart is semantically equivalent to a piece of the instructor's flowchart. If the MATCHOR cannot guarantee this fact, it forces a backtrack and passes control to TRANSLATOR B.

Some pieces of the flowcharts that the MATCHOR handles are single statements. Below are two pieces of program passed by the SELECTOR for the MATCHOR to interrogate:

Student        add cat 1.0: dog;
Instructor     add 1 a: b;

The SELECTOR has verified only that both are "adds"; the MATCHOR must verify that they are both of the same form. (Possible forms for an "add" are "add a 1: c;," "add 2 b: e;," "add 12 14: d;," and "add g h: j;," which represent "c ← a + 1," "e ← 2 + b," "d ← 12 + 14," and "j ← g + h" respectively.) In the above example, they are not of the same form, but the MATCHOR notices that the student's can be changed to the same form as the instructor's. The MATCHOR changes the SRE to

Student's      add 1.0 cat: dog;

since addition is commutative. Before the above change is made, the MATCHOR checks the values of the constants to see whether they are numerically equal. If the constants are not numerically equal, or if the forms differ and the student's form cannot be changed to match the instructor's, the MATCHOR forces a backtrack.

Besides matching statements, the MATCHOR handles the conditions inside diamond boxes. When the SELECTOR finds a loop or an IFTHENELSE, it calls the MATCHOR to match the conditions. The MATCHOR checks the form and the values of any constants. The example below of two flowchart segments demonstrates the task of the MATCHOR.
Fig. 3-17. Example for MATCHOR (student's flowchart segment and instructor's flowchart segment)

First, all loops are forced to be taken by the "T" side in the RE form. Since the student's flowchart segment has a DOWHILE which loops on "F," the condition inside the diamond is negated.

    Student's SRE       (a) * [y < +100.0]
    Instructor's SRE    (b) * [100 > z]

Second, the MATCHOR finds the constants are numerically equal. Third, the MATCHOR knows that "c < d" is semantically equivalent to "d > c." The MATCHOR finds that the two conditions are semantically equivalent and changes the student's SRE to the following to conform to the instructor's:

    Student's SRE       (a) * [+100.0 > y]

The MATCHOR must also handle English phrases in diamonds.

Fig. 3-18. Example with English Phrases in Diamond (student's flowchart segment and instructor's flowchart segment)

The input routine accepts both English phrases and gives them identical names of, for example, "s1." (Of course, this depends on the way the instructor has set up the problem.) When the RE form is generated, the condition "dishes done?" is negated. The conditions are not semantically equivalent.

    Student's SRE form       (c) * [¬ s1]
    Instructor's SRE form    (c) * [s1]

In this case, the MATCHOR must not only check to see whether i = j of si and sj, but also determine whether they are both negated or not. When the MATCHOR finds a mismatch in a condition of a diamond box, a message of a possible error is given to the student.

The MATCHOR's task is to try all possible equivalent ways of determining whether two small pieces of a program are semantically equivalent. It may alter the student's SRE slightly if the MATCHOR discovers that the two pieces can then be made semantically equivalent. The MATCHOR's scope of activities is kept to a local level.

3.7.5 The REDUCOR of PASF

The REDUCOR is a crucial part of PASF, since its task is to find an inconsistency in the Semantic Model.
The SELECTOR, with the help of the MATCHOR, has generated the Semantic Model. The Semantic Model consists of the student's and the instructor's SRE representations containing all P's. Each individual piece of the student's SRE is semantically equivalent to an individual piece of the instructor's SRE. For the student's total flowchart to be Global Semantic Equivalent (GSE) to the instructor's total flowchart, each individual piece must be GSE as well as semantically equivalent.

During one reduction step, the REDUCOR reduces two adjacent P's in both SREs to one P in both SREs. Assuming the two P's are GSE to the other two P's and certain conditions hold, the REDUCOR forms one P which is assumed to be GSE to the other new P. The diagram below demonstrates one "reduction" step. P3 and P4 have been "reduced" to P5.

    Student's SRE            P1 P2 P4 P3
    Instructor's SRE         P1 P2 P4 P3
        (P4 P3 is reduced to P5 in both, under certain conditions)
    Student's New SRE        P1 P2 P5
    Instructor's New SRE     P1 P2 P5

The goal of PASF's REDUCOR is to "reduce" both SREs, after many steps, to one P. If this can be accomplished, the two flowcharts are GSE and the student is told his/her flowchart is correct.

Fig. 3-19. The Goal of REDUCOR: To "Reduce" both SREs to One P (successive steps in reduction take the student's SRE and the instructor's SRE each to a single P, e.g., P13)

A P can be treated as a black box with inputs and outputs. The inputs, outputs, and maybe-outputs are ordered with the top first.

Fig. 3-20. Comparison of a P and Its Black Box Representation

Reducing two adjacent P's is analogous to combining two black boxes. Seven cases need to be considered in combining the two black boxes. Below is the first case showing how two black boxes can be combined.

    CASE 0:    P5 a 1 b: c d; e f    P6 2 g: j k l; q

Fig. 3-21.
Case 0 Illustrated

Since no input variable, output variable, or maybe-output variable of P5 occurs in P6, the input variables, output variables, and maybe-output variables of the larger P7 are the input variables, output variables, and maybe-output variables of P5 and P6, as shown in Fig. 3-22.

Fig. 3-22. Combined Black Box for Case 0

Case 1 includes an output variable of the first black box which is an input variable of the second black box.

    CASE 1:    P1 1.0: dog;    P2 dog 2: cat;

Fig. 3-23. Case 1 Illustrated

The black box on the left (P1) always starts and finishes its computation before the black box on the right (P2) starts.

Fig. 3-24. Combined Black Box for Case 1

The two black boxes can be placed into a bigger black box, P3, by giving P3 the proper inputs and outputs. Since "dog" is an output of P1 and an input of P2, it is not an input for the bigger black box P3. The list of inputs of P3 includes first the inputs of P1 and then the inputs of P2 (minus "dog"). Similarly, the outputs of P3 are the outputs of P1 and P2, with P1's outputs first. This example shows that an input of the second P can cease to be an input for the bigger box if an output of the first P has the same name.

Case 2 has the same output variable in both P's.

    CASE 2:    P1 3: a;    P2 4: a;    (e.g., a ← 3, a ← 4)

Fig. 3-25. Case 2 Illustrated

Combining these two black boxes reveals that the output variable of P1 is not used as an output variable of P3.

Fig. 3-26. Combined Black Box P3 for Case 2

Because P2 is executed after P1 stops, the value of "a" from P1 is destroyed.

The maybe-output variables arise because of IFTHENELSEs and DOWHILE loops. The following two flowchart segments illustrate the two places where maybe-output variables can occur.

Fig. 3-27. Maybe-Output Variables Arise in Loops and IFTHENELSEs

The variable "a" is a maybe-output variable, for the "T" branch may never be taken.
Similarly, "g" is a maybe-output variable, for the loop may never be taken.

The SELECTOR handles the reduction of an IFTHENELSE or a loop if there are only two P's or one P inside, respectively.

    ( Pa $I_a$: $O_a$; $M_a$ + Pb $I_b$: $O_b$; $M_b$ ) [X op Y]

is reduced to the following, assuming all conditions are met:

    Pc X Y $I_a$ $I_b$: $O_c$; $O_a'$ $O_b'$ $M_a$ $M_b$

where $O_c$ is a list of output variables which appear in both $O_a$ and $O_b$, $O_a'$ is $O_a$ minus $O_c$, and $O_b'$ is $O_b$ minus $O_c$.

Operating similarly for loops, the SELECTOR reduces the following

    ( Pa $I_a$: $O_a$; $M_a$ ) * [X op Y]

to

    Pc X Y $I_a$: ; $O_a$ $M_a$

assuming all conditions are met.

In both the loop and IFTHENELSE reductions, maybe-output variables are generated. Discussion of the occurrences of maybe-output variables in black boxes follows (Cases 3, 4, 5, and 6). The flowchart segment below illustrates Case 3.

    CASE 3:    P3 c 1 2 3: ; a b    P5 a: ;

Fig. 3-28. Flowchart Segment Illustrating Case 3

Combining the two black boxes, one must decide what to do with the maybe-output variable "a."

Fig. 3-29. Combined Black Box P6 for Case 3

A Maybe-Switch (MS) is incorporated into the larger black box, P6, above. If the variable "a" inside P3 is an output variable, the MS switches the input line of P5 to the output of P3; otherwise, the MS switches the input line of P5 to the input line of P6.

The two P's below with "a ← 8" replacing the previous "print a" constitute an example of Case 4.

    CASE 4:    P3 c 1 2 3: ; a b    P6 8: a;

Fig. 3-30. Case 4 Illustrated

When the two black boxes are combined, the "a" of P6 "destroys" the "a" of P3.

Fig. 3-31. Combined Black Box P7 for Case 4

Reversing the two P's shown in Case 4 illustrates an example of Case 5.

    CASE 5:    P6 8: a;    P3 c 1 2 3: ; a b

Fig. 3-32.
Case 5 Illustrated

The fact that a variable is an output variable in either small black box makes that variable an output variable for the larger black box.

Fig. 3-33. Combined Black Box for Case 5

In the last case, "a" is a maybe-output variable in both of the P's below (the "a ← 8" is replaced by another IFTHENELSE).

    CASE 6:    P3 c 1 2 3: ; a b    P7 c 2 4 5: ; a d

Fig. 3-34. Case 6 Illustrated

Fig. 3-35. Combined Black Box P9 for Case 6

An MS is incorporated to "destroy" the "a" of P3 if the "a" in P7 is an output variable.

Seven cases have been considered in combining two smaller black boxes into a larger one. These seven cases will be used in the development of the rules for reducing two P's of the instructor's SRE with two P's of the student's SRE.

    Student's SRE       Pa $I_a$: $O_a$; $M_a$    Pb $I_b$: $O_b$; $M_b$
    Instructor's SRE    Pc $I_c$: $O_c$; $M_c$    Pd $I_d$: $O_d$; $M_d$

In the above two SREs, if the sequence number "a" equals the sequence number "c," the two P's (Pa and Pc) are semantically equivalent; similarly, Pb and Pd are semantically equivalent if the sequence numbers b and d are equal. If Pc and Pd are adjacent, they may be combined by the black box technique; if Pa and Pb are adjacent, they likewise may be combined by the black box technique. If all these conditions exist, then an attempt can be made to "reduce" Pa and Pb, and Pc and Pd. These conditions are stated as Rules 1 and 2.

Rule 1 for Reduction: The first P of each set of P's to be reduced must have the same sequence number, e.g., a = c.

Rule 2 for Reduction: The second P in the student's set of P's to be reduced must have the same sequence number as the second P in the instructor's SRE.

If Rules 1 and 2 are satisfied, then two sets of two black boxes can be drawn. Rules 3 through 14 refer to Fig. 3-36.
Fig. 3-36. The Set of P's to be Reduced

Not only are Pa and Pc semantically equivalent, but $I_a$ and $I_c$ have the same number of inputs. Any constants in $I_c$ appear in the same position in $I_a$, and the constants have the same numerical value. The number of output variables in $O_a$ equals the number of output variables in $O_c$; a similar situation exists with $M_a$ and $M_c$. Furthermore, the SELECTOR has guaranteed that these facts apply to the inputs and outputs of not only Pa and Pc, but also of Pb and Pd. The corresponding pieces, e.g., inputs, outputs, etc., before being combined have the same structure.

When Pc and Pd are combined to Pf, the seven cases discussed previously will be applied to form the inputs, outputs, and maybe-outputs. If, when Pa and Pb are combined to Pe, the same internal structure and corresponding inputs, outputs, and maybe-outputs occur as in Pf, then Pe and Pf can be formed. A "reduction" takes place when Pc and Pd are combined to form Pf, and Pa and Pb are combined to form Pe.

The next two rules result from Case 1, which states that an output variable of the first black box is internally "connected" to the same variable as an input to the second black box. Rule 3 checks for missing occurrences of Case 1 in the student's set of P's to be reduced. Rule 4 checks for extra occurrences of Case 1 in the student's set of P's to be reduced.

Rule 3 for Reduction: Any output variable in $O_c$ that occurs in $I_d$ must also have a corresponding (by position) output variable in $O_a$ which is repeated in $I_b$; further, the position of the input variable in $I_b$ must be equal to the position of the input variable in $I_d$.

    If $O_{cj} = I_{dk}$ then $O_{aj} = I_{bk}$

Rule 4 for Reduction: There exist no output variables in $O_a$ occurring in $I_b$ and repeated as a corresponding output variable in $O_c$ which are not equal to a variable in $I_d$.
    $\neg\exists\; O_{aj} = I_{bk}$ such that $O_{cj} \neq I_{dk}$

The following two rules result from Case 2, where an output variable of the second black box can "destroy" the same output variable of the first black box.

Rule 5 for Reduction: Any output variable in $O_c$ which is repeated in $O_d$ must have a corresponding output variable in $O_a$ which is repeated in $O_b$. The position of the output variables in $O_d$ and $O_b$ must be the same.

    If $O_{cj} = O_{dk}$ then $O_{aj} = O_{bk}$

Rule 6 for Reduction: There exist no output variables in $O_a$ occurring in $O_b$ and repeated as a corresponding output variable in $O_c$ which are not equal to a variable in $O_d$.

    $\neg\exists\; O_{aj} = O_{bk}$ such that $O_{cj} \neq O_{dk}$

In considering Case 3, any Maybe-Switch is related to whether the maybe-output variable is actually changed or not inside the first P. A maybe-output variable is changed if a proper path through the loops and IFTHENELSEs is taken. If Pa and Pc are semantically equivalent, then any path taken in Pc will have a corresponding path taken in Pa; therefore, a Maybe-Switch related to Pc will be set identically to a corresponding Maybe-Switch related to Pa. The actual setting of a Maybe-Switch is unimportant; what is important is that if Pe has a Maybe-Switch for variable "a," then Pf must have a Maybe-Switch for the corresponding variable. The check for the occurrence of a Maybe-Switch is done by Rules 7 and 8, which are quite similar to Rules 3 and 4.

Rule 7 for Reduction: Any maybe-output variable in $M_c$ that is repeated in $I_d$ must also have a corresponding (by position) maybe-output variable in $M_a$ which is repeated in $I_b$. Further, the position of the input variable in $I_b$ must equal the position of the input variable in $I_d$.

    If $M_{cj} = I_{dk}$ then $M_{aj} = I_{bk}$

Rule 8 for Reduction: There exist no maybe-output variables in $M_a$ occurring in $I_b$ and repeated as corresponding maybe-output variables in $M_c$ which are not equal to a variable in $I_d$.

    $\neg\exists\; M_{aj} = I_{bk}$ such that $M_{cj} \neq I_{dk}$

Rules 9 and 10 for Case 4 are similar to Rules 5 and 6.
The latter pair deal with output variables; the former pair deal with maybe-output variables.

Rule 9 for Reduction: Any output variable in $O_d$ which is the same as a maybe-output variable in $M_c$ must have a corresponding output variable in $O_b$ which is the same as a maybe-output variable in $M_a$. The position of the maybe-output variables in $M_c$ and $M_a$ must be equal.

    If $O_{dj} = M_{ck}$ then $O_{bj} = M_{ak}$

Rule 10 for Reduction: There exists no output variable in $O_b$ occurring as a maybe-output variable in $M_a$ and repeated as a corresponding output variable in $O_d$ which is not equal to a maybe-output variable in $M_c$.

    $\neg\exists\; O_{bj} = M_{ak}$ such that $O_{dj} \neq M_{ck}$

For Case 5, a Maybe-Switch is incorporated which switches the output variable of the two smaller black boxes to the output variable of the larger black box. Rules 11 and 12 check for a corresponding Maybe-Switch in both P's.

Rule 11 for Reduction: Any output variable in $O_c$ which is repeated as a maybe-output variable in $M_d$ must have a corresponding output variable in $O_a$ which is repeated as a maybe-output variable in $M_b$. The position of the maybe-output variables in $M_d$ and $M_b$ must be the same.

    If $O_{cj} = M_{dk}$ then $O_{aj} = M_{bk}$

Rule 12 for Reduction: There exist no output variables in $O_a$ occurring in $M_b$ and repeated as a corresponding output variable in $O_c$ which is not equal to the maybe-output variable in $M_d$.

    $\neg\exists\; O_{aj} = M_{bk}$ such that $O_{cj} \neq M_{dk}$

For Case 6 the same argument about the Maybe-Switch holds true. If Pb and Pd are semantically equivalent, any Maybe-Switch related to Pb will be set identically to a corresponding Maybe-Switch related to Pd. Rules 13 and 14 are similar to Rules 5 and 6.

Rule 13 for Reduction: Any maybe-output variable in $M_d$ which is repeated as a maybe-output variable in $M_c$ must have a corresponding maybe-output variable in $M_b$ which is repeated as a maybe-output variable in $M_a$. The position of the maybe-output variables in $M_c$ and $M_a$ must be the same.

    If $M_{dj} = M_{ck}$ then $M_{bj} = M_{ak}$

Rule 14 for Reduction: There exist no maybe-output variables in $M_b$ occurring in $M_a$ and repeated as a corresponding maybe-output variable in $M_d$ which is not equal to the maybe-output variable in $M_c$.

    $\neg\exists\; M_{bj} = M_{ak}$ such that $M_{dj} \neq M_{ck}$

Rule 14 completes the set of rules which must be satisfied in order to do a reduction.

Rules 7, 8, 13, and 14 are stronger conditions than are required. In the situations when the variables are not changed, these rules need not be applied, but discovering such situations is very time-consuming or impossible (closely akin to finding all traces in a program). In many situations, the maybe-output variables are changed depending on run-time data; therefore, to cover the important situations even when it is not known which are the important ones, the author has decided to apply Rules 7, 8, 13, and 14 all the time.

Inspecting the fourteen rules, one sees that some, e.g., Rules 3 and 7, resemble each other very much. This fact is used in the actual implementation of PASF.

The fourteen rules above are tests which are applied to the set of P's to be reduced before the reduction. If the set of P's to be reduced passes all fourteen tests, then the actual reduction takes place as below,

    Student's SRE                            Pa $I_a$: $O_a$; $M_a$    Pb $I_b$: $O_b$; $M_b$
    Instructor's SRE                         Pc $I_c$: $O_c$; $M_c$    Pd $I_d$: $O_d$; $M_d$
    Student's New SRE after Reduction        Pe $I_a$ $I_b'$: $O_a'$ $O_b$; $M_a'$ $M_b'$
    Instructor's New SRE after Reduction     Pf $I_c$ $I_d'$: $O_c'$ $O_d$; $M_c'$ $M_d'$

Fig. 3-37. A Reduction Step

where e = f (a new sequence number), $I_b'$ and $I_d'$ are $I_b$ and $I_d$ minus the input variables mentioned in Rule 3, and $O_a'$ and $O_c'$ are $O_a$ and $O_c$ minus the output variables mentioned in Rule 5. $M_a'$ and $M_c'$ are $M_a$ and $M_c$ minus the maybe-output variables mentioned in Rule 9, and $M_b'$ and $M_d'$ are $M_b$ and $M_d$ minus the maybe-output variables in Rule 11.
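The tests of Rules 1 through 14 can be sketched compactly in a modern language. The sketch below is only an illustration, not PASF's implementation (which was written in TUTOR); the names `P`, `links`, and `can_reduce` are hypothetical, and each P is assumed to be stored as three ordered lists. It relies on the observation that each odd/even pair of rules together says that the position pairs (j, k) at which two lists share a name must be the same in the student's pair of P's as in the instructor's.

```python
from dataclasses import dataclass, field


@dataclass
class P:
    """One reduced piece of an SRE: 'P<seq> inputs : outputs ; maybes'."""
    seq: int
    inputs: list
    outputs: list = field(default_factory=list)
    maybes: list = field(default_factory=list)


def links(first, second):
    """Position pairs (j, k) where first[j] and second[k] are the same name."""
    return {(j, k)
            for j, x in enumerate(first)
            for k, y in enumerate(second)
            if x == y}


def can_reduce(pa, pb, pc, pd):
    """Student's pair (pa, pb) against instructor's pair (pc, pd)."""
    # Rules 1 and 2: sequence numbers must correspond.
    if pa.seq != pc.seq or pb.seq != pd.seq:
        return False
    # Each tuple is (student list, student list, instructor list, instructor list).
    checks = [
        (pa.outputs, pb.inputs,  pc.outputs, pd.inputs),   # Rules 3 and 4
        (pa.outputs, pb.outputs, pc.outputs, pd.outputs),  # Rules 5 and 6
        (pa.maybes,  pb.inputs,  pc.maybes,  pd.inputs),   # Rules 7 and 8
        (pb.outputs, pa.maybes,  pd.outputs, pc.maybes),   # Rules 9 and 10
        (pa.outputs, pb.maybes,  pc.outputs, pd.maybes),   # Rules 11 and 12
        (pb.maybes,  pa.maybes,  pd.maybes,  pc.maybes),   # Rules 13 and 14
    ]
    # An odd rule forbids a missing link, the even rule an extra one,
    # so together they demand identical sets of position pairs.
    return all(links(s1, s2) == links(i1, i2) for s1, s2, i1, i2 in checks)
```

With the student's "P4 dog cat: rat; P3 rat: ;" and the instructor's "P4 a b: c; P3 c: ;", `can_reduce` accepts the pair; if the student's P3 read "dog" instead of "rat", the check for Rules 3 and 4 would reject it.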
These fourteen rules look like a lot of work, but, in practice, they are easy to apply and require only a modest amount of code; furthermore, many times the tests are trivial, since the P's may have neither maybe-output variables nor even output variables.

The following example demonstrates the application of the fourteen rules and the reduction process. In Section 3.7.3, the SELECTOR generated the following Semantic Model as an example (cf. Fig. 3-16):

    Student's SRE       P1 1.0: dog; P2 2.0: cat; P4 dog cat: rat; P3 rat: ;     (P4 is Pa, P3 is Pb)
    Instructor's SRE    P1 1: a; P2 2: b; P4 a b: c; P3 c: ;                     (P4 is Pc, P3 is Pd)

The REDUCOR first tries to reduce the P with the highest sequence number, in this case P4. Rule 1 is applied to find P4 in the student's SRE. Rule 2 is satisfied, since the P immediately following P4 in the student's SRE has the same number (P3) as the P immediately following P4 in the instructor's SRE.

Applying Rule 3, the REDUCOR finds that the output variable "c" of the instructor's P4 is an input variable in P3 of the instructor's. For Rule 3 to be satisfied, the first output variable of P4 of the student's must occur as the first input variable in P3 of the student's. In this case, "rat" is in the proper positions. The application of Rules 4 through 14 reveals no complications or difficulties.

When all the fourteen rules have been satisfied, P3 and P4 of both can be reduced. First, the input "c" in the instructor's P3 and the input "rat" in the student's P3 are eliminated, a new sequence number not used elsewhere is given to the new P, and the input and output variables are combined as shown previously (cf. Fig. 3-37).

    Student's New SRE after One Reduction       P1 1.0: dog; P2 2.0: cat; P5 dog cat: rat;
    Instructor's New SRE after One Reduction    P1 1: a; P2 2: b; P5 a b: c;

Taking the SREs another reduction step, the REDUCOR searches for the highest-numbered P (P5) and tries to reduce it. Since there is no P following P5, the REDUCOR searches for the next lower-numbered P (P2).
Since P5 follows P2 in both the instructor's and the student's SREs, Rules 1 and 2 are satisfied. Rule 3 eliminates the "b" in the instructor's P5 and the "cat" in the student's P5. Rules 4 through 14 are applied, but do not affect the P's.

    Student's SRE after Two Reductions       P1 1.0: dog; P6 2.0 dog: cat rat;
    Instructor's SRE after Two Reductions    P1 1: a; P6 2 a: b c;

After the third reduction, only one P is left in each of the SREs.

    Student's SRE after Three Reductions       P7 1.0 2.0: dog cat rat;
    Instructor's SRE after Three Reductions    P7 1 2: a b c;

If the student's SRE and the instructor's SRE are both reduced to one P, then the two flowcharts are GSE. The student is told his/her flowchart is correct.

The fact that only constants are left as inputs in P7 is always true for this method. The completely reduced SRE of one P will always list all the constants and all the variables used in the program.

If the test corresponding to Rule 2 fails, the REDUCOR calls TRANSLATOR R. If any of the tests for Rules 3 through 14 fails, the REDUCOR stops and forces a backtrack.

3.7.6 The TRANSLATOR R of PASF

If Rule 2 of the reduction fails, the REDUCOR calls TRANSLATOR R. TRANSLATOR R tries to move around the P's in the student's SRE to satisfy Rule 2.

Rule 2 demands that the second P of the pair, i.e., the P following Pi (the first P of the pair to be reduced), in the student's SRE have the same sequence number as the P following Pi in the instructor's SRE. The second P will be called Pj.

There are two possible cases for the Pj in the student's SRE in relation to Pi. The first case is to have other P's (Pc...Pz) between Pi and Pj.

    Student's SRE       Pi Pc...Pz Pj
    Instructor's SRE    Pi Pj

There are two ways to move the student's Pi and Pj together. The first is to interchange "Pc...Pz" and "Pj"; the second is to interchange "Pi" and "Pc...Pz." [12] The first is tried; if it cannot be done, the second is tried.
If either succeeds, TRANSLATOR R returns control to the REDUCOR to continue the reduction; if neither succeeds, a backtrack is forced.

The second case is to have Pj before Pi, possibly with several P's in between.

    Student's SRE       Pj Pc...Pz Pi
    Instructor's SRE    Pi Pj

Again, there are two ways to move the student's Pi Pj together: the first is to interchange "Pc...Pz Pi" and "Pj"; the second is to interchange "Pj Pc...Pz" and "Pi." If either succeeds, TRANSLATOR R returns control to the REDUCOR; if both fail, then a backtrack is forced. The conditions necessary in order to interchange P's are covered in the next section.

3.7.7 Conditions Which Allow Interchanging of Portions of Student's SRE

Considering A and B as arbitrary statement types (e.g., add Oab:c;), as arbitrary P's, or as arbitrary strings of P's and statement types, AB can be interchanged to form BA if the following three conditions are true:

    Condition 1: No output variable or maybe-output variable of A is an input variable of B.
    Condition 2: No output variable or maybe-output variable of B is an input variable of A.
    Condition 3: No output variable or maybe-output variable of A is an output variable or maybe-output variable of B.

Only these three conditions need be satisfied in order to move two strings of SREs around.

[12] There are special conditions which must be satisfied in order to perform such an interchange, but they will be covered in Section 3.7.7.

Below are a few examples to demonstrate the three conditions.

Example 1. The following cannot be interchanged, because "a" is an output variable of P1 and an input variable of P2. (Condition 1 fails.)

    a ← 1
    b ← a + 2
    SRE    P1 1: a; P2 a 2: b;

Example 2. The following cannot be interchanged. (Condition 3 fails.)

    a ← 12
    a ← 4
    SRE    P1 12: a; P2 4: a;

Example 3 (All three conditions pass.)
    read a
    b ← ...
    i ← 0
    read d
    print d

    SRE with some P's, with X and Y as shown:
        inp : a; P1 ...: b; P2 0: i;    inp : d; out d: ;
        (X is the first three items; Y is the last two)

This interchange is possible since X and Y satisfy all three conditions.

    SRE after interchange:
        inp : d; out d: ;    inp : a; P1 ...: b; P2 0: i;
        (Y now precedes X)

3.7.8 The TRANSLATOR B of PASF

Any backtrack request is handled by TRANSLATOR B (cf. Fig. 3-14). If the MATCHOR, REDUCOR, or TRANSLATOR R "fails" in its task, it forces a backtrack and passes control to TRANSLATOR B. TRANSLATOR B pops the stack, manipulates the student's SRE to try the next possible alternative, and passes control to the SELECTOR, which generates a new Semantic Model for the REDUCOR to check. On an attempt to pop an empty stack, TRANSLATOR B halts the deduction scheme and outputs the message to the student that his/her flowchart is either not correct or not close enough to correct.

In the process of generating the Semantic Model, the SELECTOR searches for all possible alternatives of GSE candidates. When the SELECTOR finds an alternative, it pushes into the stack the current state of the student's SRE, the current state of the instructor's SRE, a few pointers, and the reason why there is an alternative. When TRANSLATOR B pops the stack, enough information is in the stack to restart the SELECTOR at the new state. This new state, which is a partial solution, may have a mixture of statement types, e.g., "add 1 a: a;," and "P's." In fact, the partial solution may be close to being finished, e.g., with a DOWHILE loop reduced to a single P.

TRANSLATOR B, before passing control to the SELECTOR, must do two things. First, TRANSLATOR B makes a modification to the student's SRE which corresponds to the new alternative. For example, a simple case arises when the SELECTOR finds an "add" of the form "add b c: d;." An alternative is "add c b: d;," since addition is commutative. This fact is pushed into the stack.
After popping the stack, TRANSLATOR B alters the student's "add" and the deduction scheme tries this new SRE.

The second task of the TRANSLATOR B is to set a flag used by the SELECTOR or MATCHOR to guarantee that the deduction scheme does not attempt an alternative tried previously. Without this feature, the deduction scheme would loop forever, pushing and popping on the same alternative.

A common alternative pushed by the SELECTOR is multiple statement types. The example below contains multiple "int"s ("a ← 1" form).

    Student's SRE       int 1: zap; P1 P2 int 1: cow;
    Instructor's SRE    int 1: a; P2 P1 int 1: b;

There is no way to tell whether "int 1: zap;" is GSE to "int 1: a;" or to "int 1: b;." Both ways must be tried. The original way is tried first, and then the second way is pushed into the stack. After popping the stack, TRANSLATOR B tries to interchange "int 1: zap;" with "int 1: cow;." In general, there will be statement types or P's in between. Section 3.7.7 gives three conditions necessary to interchange AB to BA, but in this case XZY to YZX is needed. The three conditions for interchanging AB to BA can be used for XZY to YZX, but they must be applied twice.

    First application of interchange rules:     XZY to ZYX    (A = X, B = ZY)
    Second application of interchange rules:    ZYX to YZX    (A' = Z, B' = Y)

TRANSLATOR B checks these six conditions before attempting the move. The checks may fail for several reasons: Z and Y may be dependent and cannot be interchanged, X and Z may be dependent and cannot be interchanged, or other reasons. If Z and Y are dependent, the TRANSLATOR B tries to move XZY to ZYX. (Placing the X after the Y forms a different alternative as the SELECTOR scans from left to right.) If X and Z are dependent, the TRANSLATOR B tries to move XZY to YXZ. If all three attempts to interchange fail, TRANSLATOR B forces a backtrack and calls itself to pop another item from the stack.
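The three interchange conditions of Section 3.7.7 and their double application to XZY can be sketched as follows. This is a minimal illustration rather than PASF's TUTOR code: the names `Stmt`, `can_interchange`, and `reorder_xzy` are hypothetical, a string of P's is modelled as a Python list, and the order in which the two fallback moves are attempted is one reading of the paragraph above.

```python
from dataclasses import dataclass, field


@dataclass
class Stmt:
    """A statement type or P, reduced to the names it reads and writes."""
    text: str
    inputs: set
    outputs: set = field(default_factory=set)
    maybes: set = field(default_factory=set)


def _read(segment):
    """All input names (variables and constants) used in a string of items."""
    return set().union(*(s.inputs for s in segment))


def _written(segment):
    """All output and maybe-output variables of a string of items."""
    return set().union(*(s.outputs | s.maybes for s in segment))


def can_interchange(a, b):
    """True when AB may become BA (Conditions 1, 2, and 3)."""
    return (not (_written(a) & _read(b))            # Condition 1
            and not (_written(b) & _read(a))        # Condition 2
            and not (_written(a) & _written(b)))    # Condition 3


def reorder_xzy(x, z, y):
    """Try XZY -> YZX, then the fallbacks ZYX and YXZ; None means backtrack."""
    if can_interchange(x, z + y):        # first application: A = X, B = ZY
        if can_interchange(z, y):        # second application: A' = Z, B' = Y
            return y + z + x
        return z + y + x                 # Z and Y dependent: settle for ZYX
    if can_interchange(x + z, y):        # X and Z dependent: try YXZ
        return y + x + z
    return None
```

Because constants can never be written, keeping them in the input sets is harmless: they can only appear on the "read" side of each intersection.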
The TRANSLATOR B is called when the deduction scheme backtracks to the last partial solution to try an alternative branch. After popping the stack, it modifies the student's SRE to allow the SELECTOR to continue. If the stack is empty, the deduction scheme halts, since all possible solutions are exhausted.

3.7.9 Error Detection in the Deduction Scheme

Below is the diagram of the deduction scheme (cf. Fig. 3-14) with the error detection portions added.

Fig. 3-38. Block Diagram of Deduction Scheme with Error Detection Portions

Situations can arise which the SELECTOR cannot resolve. The SELECTOR halts the deduction scheme and outputs an error message to the student. (Finding a loop in the student's flowchart when the instructor has an IFTHENELSE is an example of one of these situations.)

The MATCHOR can detect gross errors or inconsistencies between the student's and the instructor's flowcharts. In such cases, the MATCHOR halts the deduction scheme and outputs an error message to the student. (The unmatched conditions in a diamond may cause this, e.g., "a > 1" in the student's diamond and "b = 100" in the instructor's diamond.)

Whenever a backtrack occurs, the reason why it occurred and other pertinent facts are pushed into an Error Stack. When the TRANSLATOR B attempts to pop an empty Main Stack, the deduction scheme is halted and the Error Stack is analyzed to give the student feedback on possible errors. For example, if a backtrack is forced many times by a statement in which the MATCHOR cannot match up a constant, a possible error is that the constant has the wrong value.

3.7.10 Summary of the Deduction Scheme of PASF

The deduction scheme attempts to show that the student's flowchart is GSE to the instructor's flowchart. The deduction scheme can say that the student's flowchart is correct if it is reasonably close to the instructor's flowchart.
If the deduction scheme cannot say that the student's flowchart is correct, it attempts to tell the student in intelligible language what is wrong.

The deduction scheme uses a minimum of storage. All changes by the SELECTOR, MATCHOR, REDUCOR, TRANSLATOR R, and TRANSLATOR B are done to a single copy of the student's and the instructor's SREs.

An important result for the stack manager is that an original SRE can never increase in length during the whole deduction scheme. A quick measure of the length of the original SRE determines the size necessary for an item of the stack. A smaller flowchart can have a much larger stack depth for the same physical amount of storage. The maximum storage allowed for an SRE is 15 words. Since there are two SREs and two status words, a 32-word item may have to be pushed. Typically, an SRE is 3 words, which means that a stack item is about 8 words. The current stack size is 100 words (expandable to 1000 words). The error stack uses only one word per item; its current size is 30, which allows 30 backtracks.

The total storage for data per student required for the deduction scheme is 150 words of addressable storage for work variables (TUTOR student variables) and 130 words of unaddressable storage (TUTOR STORAGE) for the two stacks. (The unaddressable storage can be transferred in blocks to the work variables.)

As noted before, the deduction scheme requires a fair amount of computing. Even though PASF is efficiently coded, a deduction may take one CPU second of computing. Depending upon the load of PLATO, one CPU second may be sixty seconds of real time. To avoid the situation in which the student becomes overly frustrated by long waits, PASF checks a real time clock at strategic places in the deduction scheme (e.g., at a backtrack), suspends operation if real time has exceeded 30 seconds, and asks the student whether he/she wishes PASF to continue.
If the student types, "No," PASF returns him/her to the input routine; otherwise, PASF continues the deduction scheme for another 30 seconds of real time.

3.8 Summary of PASF

PASF allows a student to "draw" a flowchart on the PLATO screen and to have it graded. Part of the grading consists of checking to see whether it is a flowchart and, if it is, then to see whether it is a smooth flowchart. If it is a smooth flowchart, heuristic checks are performed to catch common student errors. Before the time-consuming deduction scheme is attempted, algorithm-specific checks are performed to assure that the flowchart is close to the right answer. After passing these checks, the student's flowchart is compared with a correct flowchart inputted by the instructor. The deduction scheme attempts to show that the student's flowchart is correct by proving it is GSE to the instructor's flowchart. At all levels of PASF, feedback on errors is given to the student.

PASF is a viable program to grade student-generated flowcharts in the PLATO computer-based education environment.

Chapter Four

THE ROLE OF THE INSTRUCTOR IN PASF

The Program to Analyze Smooth Flowcharts (PASF) is used by three classes of people: students, researchers, and instructors. The student's use of PASF is discussed in Chapter Three. The chief researcher is the author. The subject of this chapter is the role of an instructor [13] using PASF to design exercises and to check student comprehension of step-wise refinement and/or flowcharting techniques.

4.1 Design of Exercises

The design of the exercises by an instructor is an integral part of PASF. The author feels that if an educational tool such as PASF ever achieves success by being used by instructors in computer science courses, it must allow the instructors to design their own exercises. Since each instructor runs his/her courses slightly differently, he/she will want to incorporate different exercises.
For example, if a "canned" set of exercises were designed by the author, then they would be used only by the author. Furthermore, allowing the instructor to change the exercises often, e.g., each term, will discourage the students from cheating. PASF has been designed to handle, within a certain programming domain, any exercise dreamed up by an instructor.

¹³ The instructor's tasks are fully described in "The Instructor's Manual for PASF" [Hyde, 1975].

4.1.1 Importance of an Instructor's Awareness of the Scope of the Programming Domain Permitted by PASF

The instructor, e.g., professor or teaching assistant in a CS100-level course, must be conscious of the limitations of PASF. Obviously, PASF will not handle a large program, e.g., a compiler. The instructor must realize that PASF is meant to be an educational tool to teach introductory nonmajor programmers in their frustrating first weeks of a programming course. The instructor must be aware that the exercises must be designed within the programming domain of PASF, i.e., the simple language SIM in up to twenty-one flowchart boxes.

4.1.2 Design of an Exercise Off-Line

The design of an exercise by an instructor involves both off-line and on-line tasks. The off-line task is writing the word problem. The instructor must be aware of several limitations in writing the word problem. First, the wording must suggest to the student a smooth flowchart. Secondly, the word problem should clearly specify the flowchart to be drawn by the student. The instructor should not leave pieces of the flowchart, e.g., the input and output, unspecified. Thirdly, the word problem must suggest only one algorithm, not a class of algorithms (e.g., bubble sort vs. sort) to the student. These last two restrictions arise because PASF assumes that the instructor's flowchart is Global Semantic Equivalent (GSE) (cf. Section 2.4) to the word problem.
If the word problem is vague and unspecified, the instructor's flowchart cannot be GSE to it. These limitations on the instructor are not serious, but do require that he/she design the exercises carefully.

4.1.3 Design of an Exercise On-Line

After designing the word problem off-line, the instructor sits at a PLATO IV terminal to finish the design on-line. By his/her sign-on name, PASF knows that the person sitting at the terminal is an instructor. PASF gives him/her special instructions and allows options not available to a student. These instructions relate the four steps necessary to design an algorithm representing a word problem:

1. Design the acceptable English phrases;
2. Type in the name of the exercise;
3. "Draw" the correct smooth flowchart; and
4. Answer a series of questions.

Each of these four steps will be covered individually in the following sections.

4.1.4 English Phrases in Flowchart Boxes

If the instructor wants students to type English phrases in flowchart boxes, he/she must design the acceptable English phrases. For each English phrase, the instructor must insert a TUTOR "ANSWER" command or groups of ANSWER commands separated by "OR" commands at the appropriate place in PASF's code.¹⁴ The special instructions given by PASF tell the instructor where to insert the ANSWER commands. Since the maximum number of different English phrases per exercise is four, the instructor inserts one, two, three, or four ANSWER commands, depending on the number of different English phrases he/she wants the student to input. The ANSWER command below

    Command    Tag
    ANSWER     dishes (done, finished, completed)

matches any of the following student responses:

    "dishes done"
    "all the dishes are finished?"
    "are all the dishes completed?"

For those unfamiliar with TUTOR, the ANSWER command matches the words in its "tag" to the words in a student response. The words in the tag inside parentheses are synonyms; words inside of "<" and ">" are possible extra words.
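The matching behavior described here can be pictured with a small model. The sketch below is written in Python, not TUTOR, and is only an approximation of the ANSWER command, not its actual judging algorithm. It assumes that extra words may appear anywhere in the response, that punctuation is ignored, and that the remaining words must fill the tag's required words and synonym groups in tag order. The function name `matches_tag` and the set of extra words (all, the, are) are hypothetical, chosen so that the three sample responses match.

```python
import re

def matches_tag(slots, extras, response):
    """Simplified model of ANSWER matching (an assumption, not TUTOR itself).

    slots  : list of sets; each set is one required word or one synonym group
    extras : set of words that may appear anywhere and are ignored
    """
    # Lowercase the response and keep only alphabetic words (drops "?", etc.).
    words = re.findall(r"[a-z]+", response.lower())
    # Discard the optional extra words.
    significant = [w for w in words if w not in extras]
    # Every remaining word must fill exactly one slot, in tag order.
    if len(significant) != len(slots):
        return False
    return all(word in slot for word, slot in zip(significant, slots))

# Tag modeled on the example: "dishes" followed by one of three synonyms.
SLOTS = [{"dishes"}, {"done", "finished", "completed"}]
EXTRAS = {"all", "the", "are"}  # hypothetical extra words

for reply in ("dishes done",
              "all the dishes are finished?",
              "are all the dishes completed?",
              "done dishes"):
    print(reply, "->", matches_tag(SLOTS, EXTRAS, reply))
```

In this model the first three replies are accepted and "done dishes" is rejected because the required words appear out of order.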
All the other words in the tag are required for a match of a student response. The same order of synonyms and important words is also required.

¹⁴ "The Instructor's Manual for PASF" [Hyde, 1975] describes the use of English phrases more thoroughly.

The insertion of ANSWER commands in PASF's code causes several logistic problems. First, the instructor must exit from PASF and edit the code. Second, the code of PASF must be recondensed (recompiled) before the instructor can reenter PASF. The first logistic problem limits the people who may be an instructor. He/she must be able to edit a TUTOR file and insert the ANSWER commands. Since a recondense of a program will "kick out" (i.e., remove from the program) any students currently running in that program, the second logistic problem requires that the instructor do his/her adding of exercises at hours when students are not using PASF. Together the two problems mentioned above force the stipulation that only one instructor can input an exercise at any one time. These logistic problems are not too serious, for it is expected that exercises will be inputted every week, not every hour. The insertion of ANSWER commands in PASF to handle English phrases is not a perfect solution, but it is acceptable.

4.1.5 The Name of the Exercise

The instructor types the name (up to ten characters) of each exercise. Since this name is used by PASF to distinguish the different exercises, the same name must appear on the paper handout given to the student.

4.1.6 The Correct Smooth Flowchart

The instructor inputs a correct solution in the form of a smooth flowchart. After inserting ANSWER commands (if there are English phrases) and giving a name to the exercise, the instructor "draws," in a way similar to the student (cf. Section 3.5), the correct flowchart. When the instructor finishes drawing his/her flowchart, he/she asks for PASF to check it.
Like the student's flowchart, the instructor's flowchart must be a legal flowchart, must be a smooth flowchart, and must pass all the heuristic checks.

Since much of PASF's success in analyzing the student's flowcharts is dependent upon the instructor's flowchart, the instructor must be conscious of several facts before "drawing" his/her flowchart. First, the instructor's flowchart must correctly reflect the word problem. Second, it should be the typical solution expected from all the students. An esoteric solution from an instructor may mean the failure of PASF to handle the students' correct solutions. The instructor must realize that it is his/her responsibility to "draw" a flowchart which increases the success of PASF in deducing the students' correct solutions.

4.1.7 Answers to a Series of Questions

For the algorithm-specific checks and the deduction scheme performed on the student's flowchart (cf. Sections 3.6.7 and 3.7), the instructor answers about a dozen questions, which are quickly and easily answered, about his/her flowchart on the screen. Examples of the questions used by the algorithm-specific checks are, "What is the minimum number of 'read's for this exercise?"; "What is the minimum number of loops for this exercise?"; and "Is a 'read' inside a loop?"

Other answers inputted by the instructor are used by the deduction scheme. An example of a question the answer to which is used by the deduction scheme is, "Is the order of 'read's important?" If the instructor demands that the "read"s ("print"s and English phrases are similar) must be in order, the deduction scheme is altered when a student's flowchart is checked. The alteration includes a special output variable added to each "read" in the Standard Regular Expression (SRE) representations. This new output variable (different from any variable a student could use) changes the performance of TRANSLATOR B and TRANSLATOR R.
The two translators will not be allowed to interchange two "read"s because of the same output variable (cf. Condition 3 of Section 3.7.7).

After the instructor answers the questions, PASF checks his/her flowchart for consistency with his/her answers to the questions. If there is an inconsistency, e.g., a minimum of two "print"s required but only one "print" in the instructor's inputted flowchart, PASF requires him/her to reanswer the questions. If there are no inconsistencies, PASF stores the instructor's exercise and the instructor is asked whether he/she would like to input another exercise.

4.1.8 Storage of Instructor's Flowchart and Data

As soon as the instructor's answer to the last question is accepted by PASF, the flowchart and other data are stored and ready for use by the students. Thirty words of storage are needed to store each of the instructor's exercises. Up to thirty exercises are stored and accessible to all the students as read-only memory (TUTOR COMMON). Very little storage (thirty words) is required to store everything needed for an exercise.

4.1.9 Summary of Inputting an Exercise

To input an exercise, the instructor must insert the ANSWER commands, type the name, "draw" a correct smooth flowchart, and answer about a dozen questions. After a little practice, an instructor sitting at the terminal can input an exercise in less than ten minutes. Writing up the word problem will take the instructor a little longer, but since a typical exercise is less than a full typed page, the time is not excessive. The instructor does not need to know the authoring language TUTOR or the PLATO system very well. If an instructor does not use English phrases, then he/she needs no knowledge of TUTOR. PASF has been written to allow easy and quick inputting of exercises without hindering the creativity of the instructor.
4.2 Data, Collected by PASF, Allowing Instructor to Check Student Comprehension

PASF collects data which allows an instructor to check the comprehension and progress of his/her students. The data includes flowcharts "drawn" by the students, the time the student takes to complete a section, and unanticipated student responses.

PASF collects data which is sufficient to recreate a student's flowchart. A copy of every flowchart that a student checks is placed into a special file (PLATO data file). PASF transfers sixty-six words per copy of flowchart. (This allows up to 65 copies of flowcharts in a PLATO single part data file.) At his/her leisure, e.g., a week later, the instructor can retrieve the data and recreate the student's screen at the state when the student asked for his/her flowchart to be checked by PASF. For the instructor's convenience, the name of the student and the number of the flowchart in the data file are displayed along with the student's copy of the flowchart. In Fig. 3-6 of Chapter Three is a copy of "maryjane's" attempt at algorithm "intdivide," displayed several days after she drew the flowchart. The "3" in the upper right hand corner in Fig. 3-6 signifies that this is the third flowchart in the data file. It should be noted that the array of boxes and arrows in Fig. 3-6 is identical to the flowchart maryjane saw, and not a topological transformation of her flowchart. Not only can the instructor see a copy of the student's flowchart, but he/she can also "check" the flowchart and observe the error message, if any, PASF gave to the student, and can modify the copy. These capabilities of seeing, checking, and modifying a copy of a student's flowchart allow the instructor to judge the abilities of his/her students.

In the data file PASF collects unanticipated student responses which were not matched by PASF. Of interest to an instructor are the unmatched student attempts at the English phrases.
After the first few students have done an exercise, the instructor reads the unmatched student attempts in the data file concerning his/her English phrases and revises his/her ANSWER commands (cf. Section 4.1.4). This refinement resulting from unaccepted student responses is important in any computer-based educational program.

It is clear that the data collected by PASF allows an instructor to see, check, and modify a student's flowchart. These features, along with the record of the times for the student to finish each section, allow an instructor to monitor the performance and progress of his/her students. Data is collected to allow an instructor to increase the range of accepted student versions of English phrases. The data collection features of PASF are an important consideration in the acceptance of PASF as a viable computer-based educational program.

Chapter Five

DETAILS OF FIVE ALGORITHMS IN PASF

5.1 Introduction

This chapter is a collection of PASF's algorithms, the details of which are not covered elsewhere in the thesis. To allow the discussions in Chapter Three to be readable and not burdened by details, these algorithms have been assembled in this chapter. Chapter Five discusses five algorithms:

1. The Connector Algorithm, which draws the lines between boxes,
2. The algorithm to check whether a student's array of boxes and arrows is a legal flowchart,
3. The algorithm to check if a flowchart is smooth,
4. The algorithm which generates the Regular Expression (RE) representation of a flowchart, and
5. The algorithm to check if two strings in the Standard Regular Expression (SRE) representation can be interchanged.

5.2 Connector Algorithm

When a student wants an arrow drawn between two boxes on the screen, he/she touches the "from" and "to" boxes and PASF automatically computes the route and draws the line.
Since there may be twenty-one boxes on the screen and the line can go from any one box to any other box, the routing algorithm is much more involved to write than the author originally thought. Certain cases fail for most hastily designed approaches. The difficulty of the problem can be comprehended if one considers the boxes as a twenty-one node graph. Fortunately, this graph is not a complete graph, since only a maximum of two arrows may leave a box (diamond). The worst case consists of nineteen diamonds and two ovals, in which forty lines must be routed (19 × 2 + 2 = 40). Considering the physical limitation of the screen (512 dots by 512 dots and 8.5" by 8.5") and the twenty-one boxes (a box is 32 dots by 128 dots) already taking a large portion of the screen, one finds it difficult when drawing lines not to draw one line on top of another. The example below shows one line on top of another.

Fig. 5-1. The Problem of Two Lines on Top of Each Other

A further problem of the designer is to make the connectors appear natural. Not only must all the lines be distinguishable (e.g., not on top of each other), but they must also be close to the way a human programmer would draw them. In Fig. 5-2, the example on the left (labeled "Natural") more nearly illustrates this way than the example on the right (labeled "Unnatural").

Fig. 5-2. Two Correct Ways to Draw a Line between Two Boxes

Since a student may draw the boxes in any order, the Connector Algorithm must allow for boxes to be added later. Therefore, the Connector Algorithm never draws a line through a box or through a place where a box may appear.

Fig. 5-3. Lines Must Never Travel Through a Box

The Connector Algorithm has been rewritten several times in order to improve the distinguishability and naturalness. The current version of the algorithm follows. The "from" box is designated by coordinates (x, y). The "to" box is designated by coordinates (w, z).
Since there are 21 specific positions arranged in three columns of seven, x and w range from 1 to 3, and y and z range from 1 to 7. Below is shown the coordinate system for the Connector Algorithm.

Fig. 5-4. X, Y Coordinate System for the Connector Algorithm
(the figure shows a "from" box at (x, y) and a "to" box at (w, z); a box is 32 dots by 128 dots)

The positions of the two boxes fall into five cases. Case one is tested first. If case one fails, case two is tested, etc. If a case applies, the line is drawn as shown. In Figs. 5-5 through 5-9, some lines are labeled with their lengths in dots and some lines are functions of "y" or "w" dots.

Case 1: y = z and x + 1 = w
Fig. 5-5. Case 1 of Connector Algorithm

Case 2: x = w and y = z + 1
Fig. 5-6. Case 2 of Connector Algorithm

Case 3: abs(x - w) = 1 and y = z + 1
Fig. 5-7. Case 3 of Connector Algorithm
(figure label: 12 dots)

Case 4: y ≥ z
Fig. 5-8. Case 4 of Connector Algorithm
(figure labels: 2 · y dots; 6 dots)

Case 5: y < z
Fig. 5-9. Case 5 of Connector Algorithm
(figure labels: 3 + 3 · w dots; 2 · y dots; 6 dots)

It should be noted that cases 1, 2, and 3 follow the flowchart convention that boxes should tend to flow from the top to the bottom and from the left to the right. The case in Fig. 5-10 is not allowed, since the situation in Fig. 5-11, with two arrows lying on top of each other, arises.

Fig. 5-10. Not Allowed in PASF

Fig. 5-11. An Undesirable Situation of Two Lines on Top of Each Other in a DOWHILE

The undesirable example above would be drawn by PASF as follows:

Fig. 5-12. The Example of Fig. 5-11 Drawn by PASF

The five cases illustrated above are not sufficient to enable PASF to draw the lines. The diamonds must be handled in a special way. Since two arrows are drawn from a diamond, PASF draws the first by the above five cases and draws the second line from the right side (Case 6) if the first line is drawn from the bottom. The "T" and the "F" are plotted near the appropriate line.
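The ordered testing of the five cases can be sketched as a small classifier. The Python below is an illustration only, not PASF's TUTOR code. It assumes that Case 1 means the two boxes share a row (y = z) with the "to" box one column to the right, and that Case 4 covers every remaining pair with y ≥ z, so that Case 5 (y < z) is the only case left. The function name `connector_case` is hypothetical.

```python
def connector_case(x, y, w, z):
    """Return the case number (1 to 5) for a line from box (x, y) to box (w, z).

    x, w are columns (1..3); y, z are rows (1..7), with larger values higher
    on the screen. Cases are tested in order, as the text describes.
    """
    if y == z and x + 1 == w:            # Case 1: same row, one column right
        return 1
    if x == w and y == z + 1:            # Case 2: same column, one row down
        return 2
    if abs(x - w) == 1 and y == z + 1:   # Case 3: adjacent column, one row down
        return 3
    if y >= z:                           # Case 4 (assumed): other level or downward lines
        return 4
    return 5                             # Case 5: "to" box is above "from" box


# A few sample positions and the case each one falls into.
for src, dst in [((1, 3), (2, 3)), ((2, 5), (2, 4)), ((1, 5), (2, 4)),
                 ((1, 7), (3, 1)), ((2, 2), (2, 6))]:
    print(src, "->", dst, "case", connector_case(*src, *dst))
```

Because the tests are ordered, a pair that satisfies an early case never reaches the broader conditions of Cases 4 and 5.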
Case 6: the second line of a diamond
Fig. 5-13. The Second Arrow of the Diamond is Drawn to the Right as Shown if First Arrow is Drawn from Bottom
(figure labels: first line, marked "T"; second line, marked "F"; 3 + 2 · y dots; 3 + 3 · w dots)

The lines drawn by the Connector Algorithm are shifted a few dots to allow them to be distinguishable. There are 31 dots between adjacent horizontal or vertical boxes.

Fig. 5-14. Dots between Adjacent Boxes
(figure labels: 31 dots; 31 dots)

The vertical case in which horizontal lines must be drawn is divided into two regions:

Fig. 5-15. Regions for Horizontal Lines between Two Adjacent Boxes
(figure labels: 1 dot space; 14 dots; 11 dots; 4 dots for arrowhead; 1 dot space; regions: leaving arrows, entering arrows)

There are only two ways for "leaving arrows":

1. Case 3, which uses 12 dots from the bottom of the box, and
2. Cases 4 and 5, which use 6 dots from the bottom of the box.

The three different "entering arrows" of the horizontal line are displaced by a function of the X coordinate of the "to" box (in Cases 4, 5, and 6). If the X coordinate is 1, 2, or 3, the "entering arrow" horizontal line is 6, 9, or 12 dots above the "to" box, respectively. To displace the horizontal lines of the entering arrow, the lines are a function (3 + 3 · X) of the X coordinate of the entering box. The diagram below demonstrates this displacement.

Fig. 5-16. Example to Demonstrate Displacement of "Entering Arrows"
(figure labels: boxes at 3,4; 1,3; 1,1; and 3,1; 6 dots; 12 dots)

The regions for vertical lines are divided into arrows used by Case 5 and arrows used by Case 4, as shown below.

Fig. 5-17. The Regions of Vertical Lines
(figure labels: 1 dot space; 14 dots for Case 4 arrows; 1 dot space; 14 dots for Case 5 arrows; 1 dot space)

The function 2 · y, where y = 1, 2, 3, 4, 5, 6, or 7, displaces the seven possible Case 4 arrows for any column. This allows one space between each of the seven lines. The six possible vertical Case 5 arrows (for any column) are handled by the function 2 · y in a way similar to the way Case 4 arrows are handled.
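The displacement functions quoted in this section (3 + 3 · X for entering arrows, 2 · y for Case 4 arrows, and 3 + 2 · y for the Case 6 arrows of Fig. 5-13) can be tabulated to see why the lines stay apart. The Python below is a small arithmetic check, not part of PASF; the function names are hypothetical.

```python
def entering_offset(x_to):
    """Dots above the "to" box for an entering horizontal line: 3 + 3 * X."""
    return 3 + 3 * x_to

def case4_offset(y):
    """Displacement in dots of a Case 4 vertical line leaving row y: 2 * y."""
    return 2 * y

def case6_offset(y):
    """Displacement in dots of a Case 6 vertical line leaving row y: 3 + 2 * y."""
    return 3 + 2 * y

entering = [entering_offset(x) for x in (1, 2, 3)]
case4 = [case4_offset(y) for y in range(1, 8)]
case6 = [case6_offset(y) for y in range(1, 8)]

print("entering:", entering)   # one offset per column of the "to" box
print("case 4:  ", case4)      # even offsets, two dots apart
print("case 6:  ", case6)      # odd offsets, so never equal to a Case 4 offset

# Case 4 offsets are always even and Case 6 offsets always odd,
# so the two families of vertical lines can never coincide.
assert set(case4).isdisjoint(case6)
```

The entering offsets come out as 6, 9, and 12 dots, matching the values stated for X = 1, 2, and 3, and the even/odd split shows why a Case 6 line can at worst lie adjacent to a Case 4 line.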
The Case 6 vertical arrows are inserted between the Case 4 arrows by the function 3 + 2 · y.

The above routing scheme guarantees that no line lies on top of another line. The worst case is a vertical line lying adjacent to another vertical line (e.g., Case 4 arrow from rectangle at (7, 1) and Case 6 arrow from diamond at (6, 1)).

Fig. 5-18. Example to Show Displacement of Case 4 Vertical Arrows
(figure labels: 14 dots; 12 dots; 10 dots; 8 dots)

The Connector Algorithm is a feat of engineering requiring careful routing of the lines to keep them distinguishable from each other. The Connector Algorithm sacrifices slightly on naturalness, but is a viable method of drawing the arrows between the flowchart boxes.

5.3 Algorithm to Check the Legality of a Flowchart

This section discusses the algorithm which will check to see whether a student's array of arrows and boxes is a legal flowchart. A legal flowchart is defined as one (a) which has one "start," (b) which has at least one "stop," and (c) in which every box can be reached by traversing the array of boxes and arrows beginning at "start." A further qualification of a legal flowchart is that (d) text must be inside every box (cf. Section 3.6.1).

The first part of the algorithm checks to see whether the student's flowchart has one and only one "start" and at least one "stop." A "start" is recognized as a node of "oval" type which has a link leaving, no links entering, and the text "start" inside. A "stop" is recognized as a node of "oval" type which has no links leaving and the text "stop" inside. The next portion of the algorithm checks that all nodes of "rectangle" type have a right link (RL) and that all nodes of "diamond" type have both a right link (RL) and a left link (LL). With the above two portions completed, PASF begins the portion of the algorithm which traverses the flowchart.

The main portion of the algorithm traverses the list structure representation, marking every node visited.
It takes the "T" side of the diamonds first and pushes into the stack any diamond found. When the algorithm hits a marked node or a "stop," the algorithm pops the stack and then takes the "F" side of the popped diamond. A one-bit flag in each node is used as the mark. If a box can be reached from "start," the traverse will mark the node. This portion of the algorithm is well-known and easily derived from Knuth [Knuth, 1973]. Figure 5-19 presents the flowchart of the traversal portion of the algorithm.

Fig. 5-19. Traversal Portion of Flowchart Checker ($ Signifies a Comment)
(flowchart boxes: zero all marks, zero stack; I ← start node, mark I; I ← RL(I) $next node or T side of diamond; mark I; push I; P ← pop; I ← LL(P) $F side of diamond)

After the traversal portion is completed, all used nodes must be marked; otherwise the array of arrows and boxes is not a flowchart. A quick check reveals any unmarked used nodes and an appropriate error message is given to the student.

The last portion of the algorithm tests for text inside every used node. If the student's array of arrows and boxes fails in any portion of the legal flowchart checker, the student is given an appropriate message and required to fix his/her flowchart.

5.4 Algorithm to Check Smoothness of a Flowchart

5.4.1 Introduction

The Smooth Checker is a recursive algorithm based on the web grammar W₂ (cf. Section 2.2.4) which checks whether a flowchart is a smooth flowchart or not. For the reader's reference, the six rewrite rules for the web grammar W₂ are repeated below. P, b, d, and j are abbreviations for Process, box (rectangle), diamond, and join (meeting of two arrows), respectively. Rewrite rules R1 and R2 allow a series of rectangles in a smooth flowchart; R3 and R4 allow IFTHENELSEs; R5 and R6 allow the DOWHILE.

Fig. 5-20. Six Rewrite Rules of Web Grammar W₂
(figure contents: initial web I = {s, P, e}; V_N = {P}; V_T = {s, e, b, d, j}; rules R1 through R6 each rewrite P, with R2: P := b)

The main thought underlying the algorithm is to attempt to find the processes (P's) by looking for structures in the list structure representation which correspond to the six rewrite rules. Finding a P for a series of rectangles (R1 and R2) is fairly easy. Finding a P for an IFTHENELSE requires starting at a diamond (d); finding a P on the "T" side of IFTHENELSE upon finding a j; finding another P on the "F" side of IFTHENELSE upon finding the same j. Since a DOWHILE can loop on either "T" or "F," there are two possible cases for a P for a DOWHILE. Finding a P for a DOWHILE requires starting at a diamond (d), finding a P on the looping side ("T" or "F"), and finding a j which is directly in "front" of the diamond (d). The subsections of Section 5.4 will discuss the algorithm which recursively searches for concatenations of processes, IFTHENELSEs, and DOWHILEs in the student-generated flowchart.

5.4.2 Problems Due to Differences in the Web Grammar and List Structure Representations of Flowcharts

Since there are differences between the web grammar and the list structure representations of a flowchart, an algorithm which operates on the list structure should not simulate exactly the six rewrite rules of the web grammar. The differences between the web grammar and the list structure representations are two:

1. A null rectangle can exist in the list structure representation; and
2. Adjacent j's in the web grammar exist as only one j in the list structure.

As shown below, the first difference of a null rectangle can exist in a smooth flowchart as well.

Fig. 5-21. A Null Rectangle on the F-Side of an IFTHENELSE

The occurrence of null rectangles causes no difficulty with the Smooth Checker. Figure 5-23 shows a flowchart with its web grammar representation and list structure representation to demonstrate the possible difference in the number of j's. In the list structure, a j is any node with more than one link pointing to it.
A one-bit field is set in the node to designate that the node is a j. (In the list structure below and on the following pages, a node in the list structure has the fields given in Fig. 5-22, where the possible values of the fields are shown.)

Fig. 5-22. The Fields in a Node in the List Structure
(fields: TYPE, with r = rectangle, d = diamond, o = oval; case, with values 0, 1, 2; mark; LL, the link of "F" of diamond; RL, the link of oval, rectangle, or "T" of diamond; Λ is the null link)

Fig. 5-23. Smooth Flowchart A with Embedded IFTHENELSE and Its Web Grammar and List Structure Representations
(panels: Web Grammar of Flowchart A; List Structure of Flowchart A)

Since all four links point to the "oval" node, the three j's in the above web grammar are no longer distinct in the list structure. The j's do cause complications in the Smooth Checker. The list structure always has a j where it is necessary, but the j may be required to be used many times.

5.4.3 Marking the J's in the List Structure

A j in the web grammar is associated with a one-bit field in some node of the list structure. If any node has more than one link pointing to it, the node is a j and the j field is set.

The setting of the j field is accomplished with very little extra computation during the execution of the Flowchart Checker (cf. Section 5.3). The Flowchart Checker, which checks whether the student's array of arrows and boxes is a flowchart, sets the j field. A portion of the Flowchart Checker traverses the list structure, marking each node it visits. If it visits a node already marked, the Flowchart Checker pops a stack and continues on a different link (LL) of a diamond node. Every node that is visited and already marked must be a j, since the Flowchart Checker reached that node by more than one different path. The Flowchart Checker sets the j flag whenever it visits a node that is already marked.
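The traversal of Section 5.3, together with the j marking described here, can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the TUTOR code of PASF: the `Node` class, the dictionary of nodes keyed by box number, and the function name `traverse` are all hypothetical, and `None` stands in for the null link.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class Node:
    kind: str                  # "oval", "rectangle", or "diamond"
    rl: Optional[int] = None   # right link: exit of oval/rectangle, "T" exit of diamond
    ll: Optional[int] = None   # left link: "F" exit of a diamond
    mark: bool = False         # set when the traversal reaches the node
    j: bool = False            # set when the node is reached a second time

def traverse(nodes: Dict[int, Node], start: int) -> None:
    """Mark every node reachable from start; flag as j any node reached again."""
    stack: List[int] = []
    i: Optional[int] = start
    while True:
        if i is not None and not nodes[i].mark:
            node = nodes[i]
            node.mark = True
            if node.kind == "diamond":
                stack.append(i)        # remember it so the "F" side is taken later
            i = node.rl                # next node, or "T" side of a diamond
            continue
        if i is not None:
            nodes[i].j = True          # already marked: reached by a second path
        if not stack:
            return                     # nothing left to explore
        i = nodes[stack.pop()].ll      # take the "F" side of the popped diamond

# Example: start -> diamond; T -> box 3, F -> box 4; both -> stop. Box 6 is stray.
flow = {
    1: Node("oval", rl=2),
    2: Node("diamond", rl=3, ll=4),
    3: Node("rectangle", rl=5),
    4: Node("rectangle", rl=5),
    5: Node("oval"),
    6: Node("rectangle", rl=5),
}
traverse(flow, 1)
unreached = [k for k, n in flow.items() if not n.mark]
joins = [k for k, n in flow.items() if n.j]
print("unreached:", unreached, "joins:", joins)
```

In the example, box 6 is never marked, so the array would be rejected as not a legal flowchart, and the "stop" oval is flagged as a j because both sides of the diamond lead to it.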
5.4.4 Brief Description of the Smooth Checker

Before the Smooth Checker begins, it is known that the list structure is a legal flowchart and the j fields have been set. The Smooth Checker is divided into two routines, SRI and SR. SRI is the main routine which calls SR every time a diamond node at the zero level (an unembedded IFTHENELSE or DOWHILE) is found. Routine SR calls itself recursively after finding an embedded diamond node. SR first tries to find whether a diamond is part of an IFTHENELSE. If the diamond cannot be an IFTHENELSE, SR tries to show that the diamond is part of a DOWHILE which loops on "T." If the diamond is not a DOWHILE which loops on "T," SR tries to show that the diamond is a DOWHILE which loops on "F." If this previous case fails, the flowchart cannot be smooth and an appropriate message is given to the student.

The algorithm for the Smooth Checker will work on an arbitrary flowchart with arbitrary embedding of DOWHILEs and IFTHENELSEs. PASF's implementation is limited to a depth of three embedded DOWHILEs and IFTHENELSEs, e.g., a DOWHILE inside an IFTHENELSE inside a DOWHILE inside an IFTHENELSE. PASF's limitation of a depth of three embedded constructs is because of a stack memory limit. This limitation is not restrictive, for PASF only allows twenty-one boxes.

When SR is trying to show that a diamond is an IFTHENELSE, SR traverses the "T" side and finds a j. SR places a pointer at the j and traverses the "F" side of the diamond until SR finds a j and sets a pointer. If the first j is the same node as the second j, SR has found an IFTHENELSE and returns to where it was called. Either j, because of embedded IFTHENELSEs and DOWHILEs, may not be the correct j for the diamond. SR systematically moves across the embedded diamonds the two pointers pointing at the two j's until the two pointers point at the same node.
If all the possible ways for the two j's are tried and have all failed, SR tries to show that the diamond is part of a DOWHILE.

For a DOWHILE, SR traverses the side which loops (first the "T" side, then the "F" side) and finds a j. If this j is the same node as the diamond, then SR has found a DOWHILE and returns to where it was called. If the j is not the same node as the diamond, the j is assumed to be a part of an embedded DOWHILE or IFTHENELSE. SR continues on to the next j and checks to see whether the node is the diamond. If the j fails to be the diamond, SR tries the next case (e.g., DOWHILE which loops on "F"). If the DOWHILE which loops on "F" fails, SR declares the flowchart unsmooth and gives the student the appropriate message.

Every time SR encounters an embedded diamond, the appropriate pointers and lists are pushed into the stack and SR is called recursively. Upon finding that the embedded diamond is smooth, SR returns the call, pops the stack, and continues.

The next section will cover the details of the main routine SRI. The details of the recursive routine SR follow in Section 5.4.6. A discussion of the problems involved in always having the Smooth Checker terminate appears in the last section of 5.4.

5.4.5 Details of SRI Routine

Figures 5-24 and 5-25 show the flowchart for the routine SRI. The discussion in this section as well as in the next section will require the reader to glance often at these flowcharts.

Fig. 5-24. Part I of Flowchart for Routine SRI
(flowchart boxes: zero stack, N ← 0, zero case and mark of all diamond nodes; I ← start node; I ← RL(I); JI ← 0, DI ← I $must be a diamond; case (I) ← 1 $must be a loop; call SR; $smooth)

Fig. 5-25. Part II of Flowchart of Routine SRI
(flowchart boxes: I ← JI $IFTHENELSE found; I ← LL(I) $T DOWHILE found; I ← RL(I) $F DOWHILE found)

In the routine SRI (cf. Fig. 5-24), the stack and N are zeroed and the "case" and "mark" fields in every diamond node are zeroed.
To be smooth, a diamond must be part of an IFTHENELSE, a DOWHILE which loops on "T," or a DOWHILE which loops on "F." Routine SR, which is called by SRI, assumes a diamond is part of an IFTHENELSE (case = 0) first. If SR shows that this diamond cannot be part of an IFTHENELSE, then SR tries a DOWHILE which loops on "T" (case = 1). If case = 1 fails, then a DOWHILE which loops on "F" (case = 2) is tried.

Routine SRI (cf. Fig. 5-24) begins by setting I (pointer of the current node with a value from 1 to 21) to the "start" node and visits the next node (I ← RL(I)). If the next node is a rectangle, a special check (to be discussed later) is performed to catch a class of errors, and the next node is visited. If the next node is not a rectangle and is an oval, the SRI is finished and the flowchart is smooth. Assuming the next node is not a rectangle or an oval, it must be a diamond. JI (pointer to a node which is a j) is set to zero and DI (a pointer which points at the current diamond node) is set to I. If I is a j and I ≠ N, then this diamond cannot be an IFTHENELSE (the reason for this will be discussed later) and case (I) is set to one. SRI calls SR.

If SR returns to SRI, SR has succeeded in showing that the diamond DI is a DOWHILE or an IFTHENELSE and has set JI and the "case" of DI. Below are the three flowchart segments on which SR will succeed (assuming "a," "b," "c," and "e" are smooth).

Fig. 5-26. Three Cases in which SR Will Succeed
(panels: case = 0; case = 1; case = 2)

Below are the list structure representations of these three cases.

Fig. 5-27. List Structure Representations for the Cases of Fig. 5-26
(panels: case = 0; case = 1; case = 2; pointers DI and JI are shown)

The fields in a node are again shown below.

Fig. 5-28. Fields in a Node
(fields: TYPE, J FLAG, CASE, MARK, LL, RL)

After the return by SR, routine SRI marks DI "finished" and continues by visiting the next node. If case = 0, then the next node is the one pointed to by JI.
If case = 1, then the next node is the one pointed to by the left link (LL) of the diamond. If case = 2, then the next node is the one pointed to by the right link (RL) of the diamond. Routine SRI continues until it finds an oval node.

The two discussions deferred above involve the pointer N. N is equal to the last JI. In the first check, node I is a rectangle. If I is a j and I = N, then the j is the result of an IFTHENELSE SR found directly "before" the rectangle and terminated on the rectangle (cf. Fig. 5-29).

[Fig. 5-29. A Rectangle is a J Because of the IFTHENELSE "Before"]

The above situation is smooth and no error occurs. If I is a j and I ≠ N, then the j must be the result of an arrow of a bad loop. The arrow cannot be from an IFTHENELSE, for then I would equal N.

[Fig. 5-30. A Bad Loop, Since the Rectangle is a J and I ≠ N]

The flowchart segment above cannot be a DOWHILE, since the j corresponding to a DOWHILE is always a diamond node. Therefore, it must be a bad loop. A message is given to the student saying the flowchart is unsmooth because the arrow pointing to the rectangle causes a bad loop.

The second check involves a diamond which is a j. As with the first check, the j might be a result of an IFTHENELSE SR found directly "before." If an IFTHENELSE is "before," then I = N. If I ≠ N, then an IFTHENELSE is not "before," and the diamond must be part of a DOWHILE loop or part of something unsmooth. To communicate to SR the message not to try the case of an IFTHENELSE, SRI sets the "case" of DI equal to one.

[Fig. 5-31. A Diamond which is a J Might be an IFTHENELSE if I = N]

Routine SRI handles all the level-0 rectangles and calls SR for every level-0 diamond. If SRI finds an oval, SRI halts and tells the student his/her flowchart is smooth.

5.4.6 Details of the Routine SR

Routine SR, which is called by SRI and itself, tries to show that a diamond is part of an IFTHENELSE, a DOWHILE which loops on "T," or a DOWHILE which loops on "F."
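Before going into SR itself, the level-0 walk that SRI performs around each call to SR (Section 5.4.5) can be summarized in a short sketch. This is only an illustration in Python, not the PASF code: the Node record, the Unsmooth exception, and the convention that the sr argument sets the "case" of the diamond and returns JI are all assumptions made for the example. Following the text, RL is taken as the successor link and the "T" side of a diamond, and LL as the "F" side.

```python
from dataclasses import dataclass
from typing import Callable, Dict


class Unsmooth(Exception):
    """Raised when the level-0 walk finds a bad loop (hypothetical name)."""


@dataclass
class Node:
    type: str            # "start", "rect", "diamond", or "oval"
    rl: int = 0          # right link: successor, or the "T" side of a diamond
    ll: int = 0          # left link: the "F" side of a diamond
    is_j: bool = False   # J FLAG
    case: int = 0        # 0 IFTHENELSE, 1 DOWHILE on "T", 2 DOWHILE on "F"
    mark: bool = False   # set once the diamond is "finished"


def sri(nodes: Dict[int, Node], start: int,
        sr: Callable[[Dict[int, Node], int], int]) -> bool:
    """Walk the level-0 nodes. sr(nodes, di) is assumed to set
    nodes[di].case and to return JI, or to raise Unsmooth."""
    for nd in nodes.values():        # zero "case" and "mark" everywhere
        nd.case, nd.mark = 0, False
    n = 0                            # N: the last JI found
    i = nodes[start].rl
    while True:
        node = nodes[i]
        if node.type == "oval":
            return True              # reached "stop": the flowchart is smooth
        if node.type == "rect":
            if node.is_j and i != n:
                raise Unsmooth(f"arrow into rectangle {i} causes a bad loop")
            i = node.rl
            continue
        # Otherwise the node is a diamond.
        if node.is_j and i != n:
            node.case = 1            # tells SR not to try the IFTHENELSE case
        ji = sr(nodes, i)
        node.mark = True
        n = ji
        if node.case == 0:
            i = ji                   # continue at the j of the IFTHENELSE
        elif node.case == 1:
            i = node.ll              # leave a "T" loop by the "F" side
        else:
            i = node.rl              # leave an "F" loop by the "T" side
```

The sketch leaves SR as a parameter so that the bookkeeping with N, JI, and the "case" field can be read in isolation.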
These three cases are tried in the above order. If a case fails, the next case is tried. If none of the three cases succeed, SR halts and tells the student that his/her flowchart is unsmooth and points out the offending diamond. In certain situations, SR will determine that the flowchart is unsmooth and will not try the next case.

The routine SR is shown as a flowchart spread over five figures (Fig. 5-32 through Fig. 5-36). Since all three cases are shown together, the routine is complex. Part of Fig. 5-32 shows an outer loop which increments case (DI) by one and tries the complete routine again. If case (DI) = 3, then all three cases have been attempted and the student's flowchart is unsmooth. If a new case is tried, JI is set to zero, the diamonds in TLIST and FLIST (marked while trying the old case) are unmarked, TLIST and FLIST are zeroed, LASTTJ and LASTFJ are set to zero, and a new I is selected, depending on case (DI).

DI points to the diamond node that SR is trying to show as smooth or not smooth. I points to the current node. JI points to a j. TLIST and FLIST are two lists of all the diamonds that are marked while attempting to show that DI is smooth. For an IFTHENELSE, TLIST has all the diamonds marked on the "T" side of the IFTHENELSE, and FLIST has the diamonds marked on the "F" side. For DOWHILEs, all the diamonds marked are in TLIST. For DOWHILEs FLIST is not used. LASTTJ points to the node of the j of the last embedded diamond on the "T" side of an IFTHENELSE. LASTFJ points to the node of the j of the last embedded diamond on the "F" side of an IFTHENELSE. For DOWHILEs, LASTTJ points to the j of the last embedded diamond. For DOWHILEs, LASTFJ is not used. The above pointers DI, JI, LASTTJ, LASTFJ, and the two lists TLIST and FLIST are what must be pushed into the stack whenever SR finds a new diamond.

[Fig. 5-32. Part I of Flowchart of Routine SR]

[Fig. 5-33. Part II of the Flowchart of Routine SR]

[Fig. 5-34. Part III of the Flowchart of Routine SR]

[Fig. 5-35. Part IV of the Flowchart of the Routine SR]

[Fig. 5-36. Part V of the Flowchart of the Routine SR]

To continue the discussion of Fig. 5-32: routine SR checks whether I is a j. If I is a j, then, depending on the case of DI, SR goes to (3) of Fig. 5-33 or to the corresponding entry of Fig. 5-34 to see whether SR has found an IFTHENELSE or a DOWHILE respectively. SR's goal is to find a j to finish an IFTHENELSE or a DOWHILE. SR may be detoured by finding a j that is not the correct one for DI.

If the node I (cf. Fig. 5-32) is not a j, SR checks whether I is a rectangle. If I is a rectangle, then SR continues with the next node.
If I is not a rectangle, SR checks to see whether I is an oval. The discussion of I as an oval is deferred to later. If I is not an oval, it must be a diamond. If case (DI) = 0 and DI = I, then SR has found a loop. The IFTHENELSE case fails and case (DI) is incremented by one. If the check is false and I is a diamond, SR goes to (5) of Fig. 5-36, pushing the current DI into the stack and calling SR to check whether the new diamond is smooth.

IFTHENELSE (case = 0)

In Fig. 5-33, SR assumes that the diamond DI and the found j are part of an IFTHENELSE. Below is shown a flowchart segment with an IFTHENELSE with no embedded diamonds and the list structure representation of the flowchart segment.

[Fig. 5-37. Flowchart Segment of an IFTHENELSE and Its List Structure Representation]

With JI initially zero, SR takes the "T" side of the diamond (RL(DI)) and traverses the list structure, moving I until SR finds a j. When I points to j, JI is set to I, the "F" side of the diamond is taken (I ← LL(DI)), and SR tests whether the new node is a j ((1) of Fig. 5-32). When another j is hit and case = 0, SR is again at (3) of Fig. 5-33. Since JI has been set to the j found by traversing the "T" side of the diamond DI, JI is not equal to zero. If I = JI, i.e., the newly found j of the "F" side is the same node as the j of the "T" side, and I ≠ DI, an IFTHENELSE has been found. Routine SR returns to where it was called. DI points to the diamond of the IFTHENELSE, case (DI) is zero, and JI points to the j of the IFTHENELSE.

If I = JI and I = DI, the following situation is true:

[Fig. 5-38. Unsmooth Flowchart Segment When I = JI and I = DI]

The flowchart segment above can never be smooth. In the web grammar, one and only one j is allowed for every DOWHILE. Since there are three arrows entering DI, there must be two j's in the web grammar. Therefore, to be smooth, the above must have at least two arrows from an IFTHENELSE.
Since two arrows are branching back, they cannot be from an IFTHENELSE. The contradiction proves the above is unsmooth. The student is given a message saying that there is a bad loop with both the "T" side and the "F" side eventually returning to the diamond.

If both (I = JI and I ≠ DI) and (I = JI and I = DI) are false, then this logically implies I ≠ JI. If I does not equal JI, this means that the j on the "T" side and the j on the "F" side are not the same node. Several situations may arise in which the diamond can still be smooth. If I = DI or JI = DI, then SR may have been trying to find an IFTHENELSE which is really a DOWHILE, as shown below.

[Fig. 5-39. Possible DOWHILEs if I = DI or JI = DI]

In the above situations, SR (cf. (7) of Fig. 5-32) tries the next case by incrementing the case of DI by one.

If there are embedded DOWHILEs or IFTHENELSEs inside an IFTHENELSE, the condition JI ≠ I could mean that SR found the wrong j for DI. Below is shown a flowchart segment with its list structure which has embedded diamonds in an IFTHENELSE. The nodes have been numbered for reference in the text. It should be noted that there are four diamonds and only two j's in the list structure, and that the flowchart is smooth.

[Fig. 5-40. A Flowchart Segment with IFTHENELSE with Embedded Diamonds]

The analysis is started at the top of Fig. 5-32. The next node is the "T" side of DI (I ← RL(DI)). This I (node numbered 2) is neither a j nor a rectangle nor an oval. I is a diamond and SR goes to (5) of Fig. 5-36. The current DI, etc., are pushed and SR is called. For the moment, it will be assumed that SR finds the smaller IFTHENELSE (nodes 2, 4, 5, and 8). I points to node 8 (cf. Fig. 5-36). The stack is popped. DI is again node 1. LASTTJ points to the JI of the smaller IFTHENELSE (node 8). At (1) of Fig. 5-32, I is a j and case (DI) is zero. At (3) of Fig.
5-33, since JI = 0, JI is set to I (node 8) and the "F" side of DI is the next node. This node I (node 3) is neither a j nor a rectangle nor an oval. I is a diamond and SR goes to (5) of Fig. 5-36. Again, the current DI, etc., are pushed and SR is called. For the moment, it will be assumed that SR finds the smaller IFTHENELSE (nodes 3, 6, 7, and 10). At the bottom of Fig. 5-36, I points to the JI of the smaller IFTHENELSE (node 10). The stack is popped. DI is again node 1. Since JI ≠ 0, LASTFJ points to the JI (node 10) of the smaller IFTHENELSE. SR goes to (1) of Fig. 5-32. I (node 10) is a j. SR goes to (3) of Fig. 5-33, since case (DI) = 0. JI ≠ 0 and JI ≠ I. (JI points to node 8.) Since I = LASTFJ, SR goes to (2) of Fig. 5-32 and asks whether I is a rectangle. After many computational steps, SR has discovered that the j on the "F" side of the IFTHENELSE is not the same node as the j on the "T" side. The j found by SR on the "F" side had already been used by the smaller IFTHENELSE (I = LASTFJ condition was true). Therefore, SR continues searching for the next j on the "F" side.

SR will continue looking for a j to match JI until SR hits the "stop" oval. In Fig. 5-32, I is tested to see whether it is an oval. If I is an oval, JI = LASTTJ, and case (DI) = 0, and JI ≠ DI, then I takes on the value of JI, JI is zeroed, all the diamonds in FLIST are unmarked, and FLIST is zeroed. SR goes to (2) of Fig. 5-32, skipping the test "Is I a j?". SR continues down the "T" side of the IFTHENELSE, searching for the next j. The other situation in which an oval may be hit is one in which a j, found on the "T" side of an IFTHENELSE that is a diamond, is not the j of the IFTHENELSE. JI points to this node. While SR is traversing the "F" side of the IFTHENELSE, SR will never find a j equal to JI. Eventually, SR will hit the "stop" oval. In Fig.
5-32, if I is an oval, case (DI) = 0, JI ≠ 0, JI ≠ DI, and JI is a diamond, then I takes on the value of JI, JI is zeroed, all diamonds in FLIST are unmarked, FLIST is zeroed, and SR goes to (6) in Fig. 5-35 to handle the loop.

This concludes the details of the way that SR handles an IFTHENELSE. The diagram below summarizes the IFTHENELSE. SR searches the "T" side of DI until it finds a j. SR then searches the "F" side of DI until it finds a j. If these two j's are the same node, SR has found an IFTHENELSE. If these two j's are not the same node, then the next j on the "F" side is found. If the two j's are not the same, either the next j on the "F" side is found, or SR hits the "stop" oval. If SR hits the "stop" oval, SR searches for the next j on the "T" side and begins searching for the first j on the "F" side. Eventually, the correct j for the IFTHENELSE is found, or all the possibilities are exhausted. If it is the latter, SR tries a DOWHILE by incrementing the case of DI by one.

[Fig. 5-41. Summary of SR's Search for J's in an IFTHENELSE]

Any time SR finds a diamond, it must push the stack and try to show that the new diamond is part of an IFTHENELSE or a DOWHILE. After succeeding, SR pops the stack and continues. Since every j of an IFTHENELSE may be a j for a later DOWHILE or IFTHENELSE, care must be taken in continuing after an IFTHENELSE.

DOWHILE which Loops on "T" (case = 1)

After SR fails to find an IFTHENELSE, SR tries to find a DOWHILE which loops on "T" (case = 1). SR finds the first j. If this j is DI (cf. Fig. 5-34), SR has found a DOWHILE (case = 1). If it is not the correct j, the j must have been the result of finding a DOWHILE or an IFTHENELSE inside the loop. If the j is not part of an IFTHENELSE (I = LASTTJ is not true) and I is a diamond (cf. Fig. 5-34), then the j may be part of a DOWHILE. If the j is not part of a DOWHILE or an IFTHENELSE, then the diamond DI cannot be part of a DOWHILE which loops on "T."
The next and last case is tried (case = 2). Below is a flowchart segment with a DOWHILE which loops on "T." The "a" is assumed to be smooth.

[Fig. 5-42. A Flowchart Segment with a DOWHILE which Loops on "T"]

If SR hits an oval, then the diamond cannot be part of a DOWHILE which loops on "T," and SR tries the next case.

DOWHILE which Loops on "F" (case = 2)

If the diamond is not part of an IFTHENELSE or a DOWHILE which loops on "T," the diamond must be a DOWHILE which loops on "F," or the flowchart is unsmooth. For case (DI) = 2 (cf. Fig. 5-32), SR takes the "F" side of the diamond (I ← LL(DI)) and finds the first j. If this j is equal to DI, SR has found a DOWHILE which loops on "F." Below is a DOWHILE which loops on "F" and its list structure. The "b" is assumed to be smooth.

[Fig. 5-43. A Flowchart Segment with a DOWHILE which Loops on "F"]

If SR hits a j which is not the correct j for DI, the j must correspond to an IFTHENELSE (I = LASTTJ in Fig. 5-34) or a DOWHILE ("Is I a diamond" in Fig. 5-34) inside the loop. If the j does not correspond to part of an IFTHENELSE or a DOWHILE, SR halts and tells the student that the flowchart is unsmooth and why it is unsmooth. If SR hits an oval during case = 2, the flowchart is also unsmooth.

Embedded Diamonds

In the process of determining the case of a diamond, SR will find other diamonds. Two situations arise: either SR knows that this new diamond must be a loop (cf. Fig. 5-35), or SR has no knowledge of the new diamond and must assume it can be any of the three cases (cf. Fig. 5-36). The only difference between the flowchart segment in Fig. 5-35 and the flowchart segment in Fig. 5-36 is that Fig. 5-35 assumes that the diamond is case 1 or 2 and Fig. 5-36 assumes the diamond can be case 0, 1, or 2. Since Fig. 5-35 and Fig. 5-36 are very similar, only Fig. 5-36 will be discussed.

In Fig. 5-36 if a diamond (I) has been found, SR checks to see whether I has already been marked.
If I is marked, then the flowchart is unsmooth. The discussion of why the flowchart is unsmooth will appear in Section 5.4.7. Assuming I is not marked, SR pushes the old DI, JI, LASTTJ, LASTFJ, TLIST, and FLIST into the stack. The newly found diamond I is checked to make sure it is not already in the stack. If I is in the stack, the flowchart is unsmooth. The discussion of why the flowchart is unsmooth will appear in Section 5.4.7, the "Termination of the Smooth Checker." Assuming that I is not in the stack, SR sets the needed initializing conditions (JI ← 0, case (I) ← 0, and DI ← I) and calls SR. If SR is successful in showing that the diamond is smooth, SR returns, I is set to DI, and the diamond DI is marked "finished." A temporary pointer QX points to the diamond that is DI before the pop (QX ← I). QX is added after the pop to TLIST or FLIST, depending on whether JI equals 0. A temporary pointer P points to the j of the diamond that SR found to be smooth. After the pop and depending on whether JI equals 0, LASTTJ or LASTFJ is set to P. If the case of the diamond SR just found to be smooth is zero, I takes on the value of JI. If case (I) = 1, SR takes the nonlooping side ("F" side) of DI. If case (I) = 2, SR takes the nonlooping side ("T" side) of DI. With QX, P, and I set, SR pops DI, JI, LASTTJ, LASTFJ, TLIST, and FLIST off the stack. If JI is not equal to zero, the DI just popped is an IFTHENELSE and SR is on the "F" side of the IFTHENELSE. If JI is not equal to zero, then LASTFJ and FLIST are updated, since SR just found a diamond on the "F" side; otherwise, LASTTJ and TLIST are updated. SR continues at (1) in Fig. 5-32.

5.4.7 Termination of the Smooth Checker

The Smooth Checker must always terminate, whether the student's flowchart is smooth or not. If the Smooth Checker did not terminate, the student would wait forever for PASF to decide whether his/her flowchart is smooth or not.
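The bookkeeping described above for Fig. 5-36 (the mark check, the push, the stack check, and the update of LASTTJ/TLIST or LASTFJ/FLIST after the pop) is what the termination argument rests on. A minimal sketch follows, assuming a Python rendering in which the pushed values are grouped in a hypothetical Frame record and the marked diamonds are kept in a set; the names descend and ascend are illustrative and do not appear in PASF.

```python
from dataclasses import dataclass, field
from typing import List, Set


class Unsmooth(Exception):
    """Raised when one of the two checks fails (hypothetical name)."""


@dataclass
class Frame:
    """What SR pushes for one diamond: DI, JI, LASTTJ, LASTFJ, TLIST, FLIST."""
    di: int
    ji: int = 0
    lasttj: int = 0
    lastfj: int = 0
    tlist: List[int] = field(default_factory=list)
    flist: List[int] = field(default_factory=list)


def descend(i: int, cur: Frame, marked: Set[int], stack: List[Frame]) -> Frame:
    """SR has met diamond i while working on cur.di."""
    if i in marked:                        # mark check
        raise Unsmooth(f"diamond {i} is already finished")
    stack.append(cur)                      # push the old DI, JI, ...
    if any(f.di == i for f in stack):      # stack check
        raise Unsmooth(f"diamond {i} is already in the stack")
    return Frame(di=i)                     # JI <- 0, DI <- I for the new call


def ascend(qx: int, p: int, marked: Set[int], stack: List[Frame]) -> Frame:
    """Diamond qx was shown smooth with j equal to p; resume the outer diamond."""
    marked.add(qx)                         # mark the diamond "finished"
    outer = stack.pop()
    if outer.ji != 0:                      # outer is on its "F" side
        outer.lastfj = p
        outer.flist.append(qx)
    else:                                  # outer is still on its "T" side
        outer.lasttj = p
        outer.tlist.append(qx)
    return outer
```

Keeping the two checks in one place makes it easier to see that every descent either fails at once or works on a diamond that is neither finished nor pending.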
If the student's flowchart is smooth, termination occurs when SRI finds the "stop" oval. If the student's flowchart is not smooth, SR could search forever if it were not for two checks (cf. Fig. 5-35). Every time SR finds a new diamond, SR checks first that the diamond has not been marked finished and, secondly, that the diamond is not already in the stack. These two checks, which are discussed in detail in this section, guarantee that the Smooth Checker will always terminate.

In Fig. 5-36, whenever SR finds the j for a diamond and returns, SR marks the diamond "finished." If SR finds a diamond that has been already marked "finished," SR halts and gives the message to the student that his/her flowchart is unsmooth. Below is an example in which this check is needed.

[Fig. 5-44. A Flowchart in which the Mark Check is Needed]

Without marking each "finished" diamond and checking for the mark every time a diamond is hit, the example above would be an infinite sequence of IFTHENELSEs (bcdcdcd...). SR takes the "T" side of the diamond "a" and finds a diamond "b." SR finds successfully that "b" is an IFTHENELSE and marks "b" finished. Since "c" is a j, a diamond, and the j could result from the IFTHENELSE of "b," SR assumes "c" can be part of an IFTHENELSE. SR finds successfully that "c" is an IFTHENELSE and marks "c" finished. SR finds "d" is an IFTHENELSE and marks "d" finished. SR hits "c" and finds that "c" is a j and a diamond and the j could result from the IFTHENELSE of "d." SR would again assume "c" is a part of an IFTHENELSE, except that "c" has been marked finished (cf. Fig. 5-36). SR halts and outputs a message to the student saying that his/her flowchart is unsmooth because of a bad loop in a branch of the diamond "a."

In certain situations, SR must unmark some of the diamonds it has marked. Every time SR matches a j for a diamond, SR marks the diamond finished.
If SR fails in its attempt to show that a diamond is an IFTHENELSE, SR must unmark all the diamonds it marked trying to show the diamond was part of an IFTHENELSE. For each DI there are two lists kept of the diamonds that have been marked. One list (TLIST) contains all the diamonds marked on the "T" side of an IFTHENELSE. The other list (FLIST) contains all the diamonds marked on the "F" side of an IFTHENELSE. These two lists, together with the corresponding DI, are pushed and popped in the stack. If the j has to be moved on the "T" side, SR restarts the search for the j on the "F" side. When this occurs, all the diamonds in FLIST are unmarked and FLIST is zeroed. When SR is trying to find a DOWHILE, the marked diamonds are placed in the TLIST. Each time SR increments the case, all the diamonds in the TLIST and FLIST are unmarked and the two lists are zeroed. The two lists TLIST and FLIST allow SR to unmark the diamonds necessary when SR backs up in looking for another j in the IFTHENELSE, and when SR tries a new case.

SR progresses through the list structure, systematically marking each diamond node it finishes. Since there are a finite number of diamond nodes, SR will eventually exhaust all the diamond nodes and terminate, unless SR tries to redo an unfinished diamond node. The check of the stack (to be discussed next) guarantees that an unfinished diamond node is not redone.

In Fig. 5-36, if a diamond has been found, the old DI, etc., are pushed into the stack. SR checks to make sure that the newly found diamond (I) is not a DI in the stack. If I is in the stack, the flowchart is unsmooth. If I is in the stack, then SR must have been working on, for example, diamond "a," found another diamond, "b," and pushed "a" into the stack. Before the diamond "b" is shown to be smooth and popped from the stack, SR hits "a" again. Therefore, the flowchart must be unsmooth, because "a" forms a loop which is not a DOWHILE, since everything inside is not smooth.

[Fig. 5-45. The Situation if I is in the Stack]

Below is an example in which this stack check is required.

[Fig. 5-46. Example in which Stack Check is Needed]

Without the stack check, SR would find that the above flowchart is an infinite number of loops (h, g, h, g, h, ...). Since "h" is a j, a diamond, and there is no IFTHENELSE in front of "h" to cause a j, SR assumes "h" must be part of a DOWHILE. SR continues on the "T" side of "h" and finds "g," which is also a j and a diamond. SR pushes "h" into the stack. Again, since there is no IFTHENELSE in front of "g" to cause a j, SR assumes "g" must be part of a loop. SR continues on the "T" side of "g" and finds that "h" is a j and a diamond. Again, since there is no IFTHENELSE in front of "h" to cause a j, SR would assume that "h" must be part of a loop, except that "h" is already in the stack. Therefore, the flowchart is unsmooth.

The stack check guarantees that any unfinished diamond is not redone. The stack check, together with the marking of all the finished diamonds, guarantees that the Smooth Checker always terminates.

5.5 Regular Expression Representation Generator

The routine which generates the Regular Expression (RE) representation of the student's smooth flowchart (cf. Section 3.6.4) is a modified version of the Smooth Checker of Section 5.4. The RE generator is simpler, since it does not need to traverse unsmooth flowcharts. The Smooth Checker has set the proper "case" for each diamond (case = 0 is an IFTHENELSE; case = 1 is a DOWHILE which loops on "T"; and case = 2 is a DOWHILE which loops on "F"). The RE generator traverses the smooth flowchart with a recursive algorithm tailored after the smooth checking algorithm. There is no theoretical reason why the RE representation could not be generated during the smooth checking. The division into two tasks was done for the practical reasons of efficiency and easy coding.
As routine SR2 (cf. Fig. 5-47 and Fig. 5-48) traverses the list structure, the contents of each rectangle visited are output to an array. If a diamond is found, a "(" is output and the routine SRA is called. When SRA returns, SR2 outputs a ")" if the diamond is an IFTHENELSE (case = 0), or ")*[", the contents inside the diamond, and "]" if it is a DOWHILE (cases 1 and 2). If SR2 hits the "stop" oval, the routine is done.

The flowchart of routine SR2 is shown in Fig. 5-47 and Fig. 5-48. Routine SRA is shown in Fig. 5-49 through Fig. 5-52 with approximately the same format as Fig. 5-32 through Fig. 5-36 for the SR routine of the Smooth Checker. Several portions of SR are missing in SRA: the error portions, the part to try the next case, and the two lists TLIST and FLIST. No marking of the diamonds or checking the stack is necessary for SRA, since the flowchart is known to be smooth. Since the case of DI is known, Fig. 5-35 and Fig. 5-36 of SR have been combined into Fig. 5-52 of routine SRA.

SRA has extra portions to output the RE representation into the array. To output the RE representation in the correct order, SRA places the partial strings into a BUFFER, then moves the strings in the BUFFER to the array at the appropriate moment. If the diamond is a DOWHILE at level 0, SRA does not need the BUFFER and outputs the symbols directly to the array. For other instances, i.e., case = 0 or level greater than 0, the symbols are placed into the BUFFER.

[Fig. 5-47. Part I of Flowchart of Routine SR2]

[Fig. 5-48. Part II of Flowchart for Routine SR2]

[Fig. 5-49. Part I of Flowchart of Routine SRA]

[Fig. 5-50. Part II of Flowchart of Routine SRA]

[Fig. 5-51. Part III of Flowchart of Routine SRA]

[Fig. 5-52. Part IV of Flowchart of Routine SRA]

If SRA finds a diamond, SRA outputs a "(", pushes DI, JI, LASTTJ, LASTFJ, and BUFFER into the stack, and calls SRA. Upon returning, SRA outputs a ")". If the diamond is a DOWHILE, SRA outputs "*[", the contents of the diamond, and "]". SRA pops the stack and continues.

If a diamond is an IFTHENELSE (case = 0), SRA (cf. Fig. 5-50) finds the first j on the "T" side of the diamond DI. At this point, SRA places "+[", the contents of the diamond, and "]" in the BUFFER. SRA continues taking the "F" side of the diamond DI (I ← LL(DI)). If SRA finds an oval or a j on the "F" side of DI which is not the same node as JI, SRA must erase from the BUFFER the "+" of the DI and everything after the "+." This allows SRA to back up and try the next j on the "T" side of DI. SRA does not remove "+," etc., from BUFFER if the j on the "F" side could be the j of an IFTHENELSE just "before" (I = LASTFJ in Fig. 5-50) or the j of a DOWHILE (I is a diamond).
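The shape of the string that SR2 and SRA build can be illustrated independently of the list structure traversal. The sketch below is a simplification and not the PASF routine: it assumes the flowchart has already been turned into a nested Python structure (a hypothetical encoding using plain strings for rectangle contents and tuples tagged "if" and "while" for diamonds), and it only reproduces the symbol order described above, "(" T-side "+[" condition "]" F-side ")" for an IFTHENELSE and "(" body ")*[" condition "]" for a DOWHILE. How rectangle contents are delimited in the real array is not stated here, so they are simply concatenated.

```python
from typing import List, Tuple, Union

# Hypothetical encoding of a smooth flowchart:
#   "text"                          contents of a rectangle
#   ("if", cond, t_side, f_side)    IFTHENELSE (case = 0)
#   ("while", cond, body)           DOWHILE (case = 1 or case = 2)
Item = Union[str, Tuple]


def to_re(seq: List[Item]) -> str:
    """Build the RE string for a sequence of rectangles and diamonds."""
    out = []
    for item in seq:
        if isinstance(item, str):
            out.append(item)
        elif item[0] == "if":
            _, cond, t_side, f_side = item
            out.append("(" + to_re(t_side) + "+[" + cond + "]" + to_re(f_side) + ")")
        elif item[0] == "while":
            _, cond, body = item
            out.append("(" + to_re(body) + ")*[" + cond + "]")
        else:
            raise ValueError(f"unknown item: {item!r}")
    return "".join(out)


example = ["a;", ("if", "p", ["b;"], ["c;"]), ("while", "q", ["d;"])]
print(to_re(example))
```

Because the nesting is explicit in this encoding, the recursion needs no BUFFER; the BUFFER in SRA exists because the j that closes an IFTHENELSE is only discovered while walking the list structure.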
When SRA finds the j of an IFTHENELSE, SRA either concatenates the contents of the BUFFER to the end of the array or concatenates the contents of the BUFFER (level N) to the end of the BUFFER at the top of the stack (level N-1). SRA does the former concatenation if level 0, or if level 1 and the level-0 diamond is not an IFTHENELSE.

If SRA hits a diamond which is a DOWHILE (case = 1 or case = 2), SRA searches for the j for the DOWHILE. If SRA finds a DOWHILE, SRA either concatenates the contents of the BUFFER to the end of the array or concatenates the contents of the BUFFER (level N) to the end of the BUFFER at the top of the stack (level N-1). SRA does the former concatenation if level 1 and the level-0 diamond is not an IFTHENELSE.

When routine SR2 finds the "stop" oval, the RE representation is in the array and complete except for one minor detail. For all the diamonds that are a DOWHILE which loops on "F" (case = 2), the condition inside the diamond is negated in the array (e.g., "a > 10" would become "a ≤ 10").

5.6 Algorithm for Translator Check

As part of the deduction scheme (cf. Section 3.7), the two translators TRANSLATOR B and TRANSLATOR R interchange statement types (e.g., add Oab:c;), P's, or strings of statement types and P's (cf. Section 3.7.7). This interchange is only possible if the three conditions of Section 3.7.7 are satisfied. This section will discuss the algorithm which efficiently implements these three conditions in PASF.

The general case is AB interchanged to BA where A and B are strings of "items." An item is either

1. A statement type with its sequence number, list of input variables and constants, list of output variables, and list of maybe-output variables, or

2. A P with its sequence number, list of input variables and constants, list of output variables, and list of maybe-output variables.

Below are examples of items.

P3 elf:c;d sub mn:z;

Fig. 5-53. Examples of Items to be Interchanged

The three conditions of Section 3.7.7 need four entities:

1. The input variables of A,

2. The output and maybe-output variables of A,

3. The input variables of B, and

4. The output and maybe-output variables of B.

Since the number of variables that can be used by the student is less than 60, a computer word (60 bits long) is used for each of these four entities. These four computer words are called IV(A), OV(A), IV(B), and OV(B) respectively. When an input variable (e.g., its name is ninth in the symbol table) is found in A, the corresponding bit (ninth bit) is set to one in IV(A). The three other computer words are similarly used. On one pass of the string AB, the four words can be completely set. To test the three conditions (cf. Section 3.7.7), the following IF statement is performed.

IF (IV(A) MASK OV(B) = 0) AND (IV(B) MASK OV(A) = 0) AND (OV(A) MASK OV(B) = 0) THEN OK; ELSE FAIL;

The MASK function is a bit-wise "and" of the two computer words. If computer word x and computer word y have no bits in corresponding positions set to one, then x MASK y is equal to 0. The above test is performed very quickly by the machine.

The test to allow the interchange AB to BA is quickly performed. The test requires one pass of the string AB and a single computation of a boolean expression.

Chapter Six

CONCLUSIONS AND FUTURE RESEARCH

6.1 Conclusions

The feasibility of the Semantic Formulation Method (SFM) to grade students' programs by machine has been demonstrated in this thesis. SFM is an efficient approach which shows that a student's structured program (i.e., a limited control structure with DOWHILEs and IFTHENELSEs) is correct. To show that the student's program is correct, SFM demonstrates that the student's program is Global Semantic Equivalent (GSE) (cf. Section 2.4) to the instructor's correct answer.
Global Semantic Equivalence is com- putational equivalence with local syntactic transformations which are semantically equal, and with interchanging of independent proc- esses. The feasibility of SFM is proven in the previously described program PASF. PASF, a working program on the PLATO IV system that implements in a rudimentary way the notions of SFM, has been de- signed to operate in the computer-based educational environment of PLATO IV. SFM is especially well suited for the computer-based educational environment because of its efficiency in the use of storage and computation and its ability to interact with students concerning their programming errors. SFM has been shown to be a viable method of grading student generated structured programs in the PLATO IV computer-based education environment. 176 PASF's overall goal is to teach the step-wise refinement technique of structured programming to students in an introductory computer science course. The teaching of this technique is im- proved if the programming language used by the students allows only a limited control structure which involves only DOWHILEs and IFTHENELSEs. Therefore, PASF is an ideal application for SFM which requires structured programs using DOWHILEs and IFTHENELSEs. Since PASF teaches introductory nonmajor students in the first or second week of a TOO-level course, it utilizes flowcharts. The constraints of using structured programs and using flowcharts evolved a class of flowcharts called smooth flowcharts. In the process of teaching the step-wise refinement technique, PASF grades smooth flowcharts by the efficient SFM method. The value of SFM lies in its ability to represent a student's structured program as a string (Regular Expression repre- sentation, cf. Section 3.6.4). Representing a program as a string allows magnitudes of speed-up in the required computation, as com- pared with other representations, e.g., a list structure representa- tion. 
Representing a problem as a string to obtain efficiency, while possibly destroying its structural properties, was a common pitfall of Artificial Intelligence (AI) programs of the early Sixties. McCarty [McCarty, 1971] stated that representations that eliminated the inherent structure of the problem were the chief reason for the failure of GPS [Ernst and Newell, 1969] and other AI programs. In SFM's case, however, the string representation has not destroyed the inherent structure of the student's program. Enough information is in the string (RE representation) to reconstruct the student's flowchart if necessary, since no structural information is lost. Further, the string representation is efficiently stored. Much of the efficiency of SFM in both storage and speed of computation is attributable to the fact that the student's program is represented as a string.

Not impressed by the results of the "syntax-semantic" paradigm of the Sixties, the author decided to incorporate the "description, representation, and deduction" paradigm [Minsky, 1972] in this thesis. Surely, any program which deals with programming languages must worry about syntax and semantics. The author does not advocate the elimination of the use of syntax or semantics in an AI program such as PASF; he only suggests that the program not be designed around the "do all the syntax"-then-"do all the semantics" model. The design of PASF follows the "description, representation, and deduction" paradigm currently popular in the AI field.

Using this paradigm as a framework for PASF, the description of the programming domain in PASF consists of the process boxes with simple assignment, input, and output statements and the control structure of smooth flowcharts to connect the process boxes.
A large class of programs can be written in this programming domain. A large portion of the possible description of a programming domain has been purposely neglected: missing from the description are different data types and data structures. Since PASF deals with students in an introductory computer science course, different data types and data structures were eliminated for pedagogic reasons. The description of possible programs in PASF consists of process boxes in a smooth flowchart.

Each of the three representations of a program utilized in the three different portions of PASF is well suited for the necessary manipulations in its portion of PASF. No one representation for all three portions of PASF would be as efficient as each representation is for its own portion. The list structure representation is well tailored to the adding and deleting of boxes and arrows in the input section. The heuristic and algorithm-specific checks are quickly performed on the RE representation. The deduction scheme runs efficiently on the Standard Regular Expression (SRE) representation. Converting from one representation to another might be expensive in terms of computation. In PASF, converting costs little, since only two conversions are needed each time a student checks his/her flowchart. In the time (less than two seconds) the student waits for PASF to check whether his/her flowchart is smooth, PASF also generates the RE representation from the list structure representation. [15] The other conversion, which is done at the beginning of the deduction scheme, is quick, since the two representations (RE and SRE) are relatively similar. The flexibility of three different representations of a student's program allows PASF to perform a diverse set of tasks quickly and efficiently.

[15] This demonstrates a design philosophy used throughout PASF of distributing all possible computations between student interactions.
A representation should be as formal and as structured as possible to allow deductions to be performed efficiently, as well as be general enough to cover a useful description. This is the AI researcher's dilemma, restated by Minsky, "We want . . . for the smallest initial structure, the greatest complexity." [16] A broad description will require a complex representation; in that case, the deduction may be formidable or impossible. The trend in AI is to narrow the problem domain by limiting the description (e.g., Winograd's block world [Winograd, 1971]). The programming domain of PASF is vastly limited (e.g., no procedure calls or pointer variables) for this reason. Limiting the problem domain is not enough to guarantee that a deduction scheme will work; an AI program requires "good" representations. In this context, the only known way to define a "good" representation is to say that it is "one that works." The search for "good" representations is a major research effort in the AI field today (especially for "good" representations of knowledge [Winograd, 1974]). For PASF's deduction scheme, the SRE representation (cf. Section 3.7) has been shown to be "good" in the sense discussed above.

[16] Minsky, Marvin, editor, Semantic Information Processing, The MIT Press, Cambridge, Mass., 1968, p. 12.

The deduction scheme of PASF is specifically tailored to deduce that two programs are equivalent. For reasons of efficiency, the deduction scheme is not a general scheme (e.g., a scheme over first order predicate calculus), but a special purpose scheme to handle only pieces of a structured program. The design of PASF's deduction scheme is modeled after the general theorem provers based on the resolution principle [Robinson, 1965; Henschen, 1971]. Many portions of PASF's deduction scheme are analogous to portions of a general theorem prover. PASF's deduction scheme consists of a SELECTOR, MATCHOR, REDUCOR, and others (cf. Fig. 3-11).
The SELECTOR together with the MATCHOR builds a Semantic Model. The Semantic Model consists of pieces (P's) of the student's program which are individually semantically equivalent to pieces (P's) of the instructor's program. The set of a P in the student's program and the corresponding semantically equivalent P in the instructor's program is analogous to an axiom in the general theorem prover.

{P_k of student is semantically equivalent to P_k of instructor} -> T
{Axiom A_k} -> T

Fig. 6-1. Analogy of P's of PASF Deduction Scheme and Axioms of Theorem Prover

Each set in the above figure implies Truth. The REDUCOR searches for inconsistencies in the Semantic Model. The REDUCOR reduces two P's in the student's program and two P's in the instructor's program to one P in each. To continue the analogy: a reduction step is analogous to resolving two axioms. Similar to the theorem prover, which can resolve two resolved axioms (clauses), the REDUCOR can reduce two reduced P's. In a theorem prover, selecting the next two clauses to resolve requires strategies, e.g., unit preference. Likewise, the REDUCOR has different possible ways to reduce two P's and needs a strategy to choose which two P's to reduce next. If a theorem prover can deduce an empty clause, the theorem has been proved. Similarly, if the REDUCOR can reduce the two SREs to a single P in each, then the student's program has been shown to be correct. The analogy, although not perfect overall and weak in many places, is a useful tool for pointing out areas of future research on PASF-like systems.

The deduction scheme of PASF deduces that two programs are equivalent by forming a Semantic Model and reducing the Semantic Model to two single P's. In the process of reducing to a single P, the deduction scheme checks for consistency between pieces of the student's and the instructor's programs.
As the REDUCOR reduces two P's, the variables of the student and the instructor are dynamically bound and checked for use before being given a value. In case of an inconsistency, the REDUCOR tries to move processes in the student's program around to achieve a closer fit to the instructor's program. The REDUCOR via TRANSLATOR R may succeed in moving any independent process of arbitrary size, e.g., two loops.

The whole deduction scheme of PASF can be viewed as a heuristic search through a state space (for an excellent discussion of heuristic searches through state spaces, see [Nilsson, 1971]). The SELECTOR chooses a path through this state space, forming a Semantic Model. When the REDUCOR shows that the Semantic Model has an inconsistency, the deduction scheme backtracks and the SELECTOR chooses a different branch in the state space. When the SELECTOR exhausts all paths through the state space, the deduction scheme halts, giving error diagnostics to the student. In any AI program, the size of this state space is critical: if it is too large, combinatorial explosion forces the search of the space to go on for hours; if it is too small, the program is not very intelligent. To rid the state space of unprofitable branches (a process called tree pruning), PASF executes the heuristic checks and the algorithm-specific checks. By this pruning PASF is assured that the student's program is reasonably close to the instructor's program. By tuning the SELECTOR, the state space can be enlarged if it is too small. Since searching this state space requires a large share of the CPU time, the size of the state space is closely related to the length of time that the student must wait for PASF to grade his/her flowchart.

The SFM method as demonstrated by PASF is a useful technique for grading students' structured programs.
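The select, check, and backtrack pattern described above can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not PASF's deduction scheme: statements are hypothetical `(operation, inputs, outputs)` tuples, `bind` and `search` are invented names, and the sketch ignores statement order and control structure entirely, so it shows only dynamic variable binding with backtracking and is not a sound equivalence test.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# (operation, input names, output names); constants are written as digit strings.
Stmt = Tuple[str, Tuple[str, ...], Tuple[str, ...]]
Binding = Dict[str, str]          # student variable -> instructor variable


def bind(student: Stmt, instructor: Stmt, env: Binding) -> Optional[Binding]:
    """Extend env so that `student` matches `instructor`, or return None on a conflict."""
    s_op, s_in, s_out = student
    i_op, i_in, i_out = instructor
    if s_op != i_op or len(s_in) != len(i_in) or len(s_out) != len(i_out):
        return None
    new_env = dict(env)
    for s, i in zip(s_in + s_out, i_in + i_out):
        if s.isdigit() or i.isdigit():
            if s != i:                      # constants must agree literally
                return None
        elif s in new_env:
            if new_env[s] != i:             # student name already bound differently
                return None
        elif i in new_env.values():         # instructor name already claimed
            return None
        else:
            new_env[s] = i
    return new_env


def search(student: List[Stmt], instructor: List[Stmt],
           env: Optional[Binding] = None,
           used: FrozenSet[int] = frozenset()) -> Optional[Binding]:
    """Depth-first search with backtracking for a consistent pairing of statements."""
    env = {} if env is None else env
    if len(used) == len(instructor):
        return env if len(student) == len(instructor) else None
    target = instructor[len(used)]
    for k, candidate in enumerate(student):
        if k in used:
            continue
        new_env = bind(candidate, target, env)
        if new_env is not None:
            result = search(student, instructor, new_env, used | {k})
            if result is not None:
                return result
    return None                             # every branch failed: backtrack


instructor = [("int", ("0",), ("a",)), ("inp", (), ("b",)), ("inp", (), ("c",)),
              ("sub", ("b", "c"), ("b",)), ("out", ("a",), ())]
student = [("int", ("0",), ("quot",)), ("inp", (), ("x2",)), ("inp", (), ("x1",)),
           ("sub", ("x1", "x2"), ("x1",)), ("out", ("quot",), ())]
print(search(student, instructor))   # {'quot': 'a', 'x1': 'b', 'x2': 'c'}
```

In this run the first pairing of the two input statements binds x2 to b, which later conflicts at the subtraction; the search then backs up and tries the other pairing. A student statement such as `("sub", ("x2", "x1"), ("x1",))` admits no consistent binding, so `search` returns `None` after exhausting every branch.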
The SFM method should not be used for program verification, which is somewhat related to grading but different from it: grading involves comparing the student's program with the correct answer and giving feedback on the errors; program verification involves proving that a program is correct with respect to its specifications. Although grading and program verification both want the program to be equivalent to the correct entity, the equivalences are of a different type. Grading results in Global Semantic Equivalence (close to computational equivalence), while program verification results in input/output equivalence. PASF-like graders which can say that a student's program is correct will become more important in the future as more and more students are enrolled in introductory computer science courses.

6.2 Future Research

As noted above, there is an increasing need for PASF-like graders. This need is especially acute in the teaching of programming in the computer-based education environment. Naively, one may think that using a computer-based education system to teach programming should be easy, i.e., "a computer to simulate a computer." In reality, teaching programming via computer-based education is as hard as or harder than teaching other subjects, e.g., French or history. "A computer to simulate a computer programmer" is a better metaphor (informally proposed by Dr. Donald Gillies at the University of Illinois). Since one activity of many computer programmers is grading student programs, an area of research that needs further investigation is describing and modeling the processes involved in a human grader.

PASF has been attempting to incorporate at a gross level some of the processes of one human grader. The heuristic checks and algorithm-specific checks were derived from the author's experience in grading student programs.
PASF is only a start in the right direction: it can be extended and improved within the framework of the SFM method described above. An easy improvement is to include more powerful heuristic checks or algorithm-specific checks, e.g., running the program with test data or finding and comparing all the traces through the two programs. Data types and data structures, e.g., arrays, should be investigated. The SELECTOR can be made more intelligent. Currently, the SELECTOR is statement- or box-oriented; it could instead be oriented toward groups of statements. The handling of DOWHILEs and IFTHENELSEs by the deduction scheme could be made more general. Many features of PASF can be extended, but most require more storage and computational time. PASF is currently near the saturation point in storage and computation time for the PLATO IV system.

Moving the SFM method out of the computer-based education environment into a purely research environment would open the doors to the investigation of many research problems. For example, one research problem is implementing the SFM method or a descendant version of SFM in a larger programming domain such as a subset of PASCAL.

BIBLIOGRAPHY

Adams, J. M., "Teaching Declarative Programming," SIGCSE Bulletin, Feb. 1975, Vol. 7, No. 1, pp. 83-85.
Aho, A. V., "Design of Efficient Algorithms," Department of Computer Science Colloquium, University of Illinois, Nov. 21, 1974.
Aho, A. V. and Ullman, J. D., "Transformations on Straight Line Programs," Conf. Record Second Annual ACM Symposium on Theory of Computing, pp. 136-148 (May, 1970).
Allen, F. E., "Program Optimization," Annual Review in Automatic Programming, Vol. 5 (1969).
Alpert, D. and Bitzer, D. L., "Advances in Computer-based Education," Science, 167 (1970), pp. 1582-1590.
Arnborg, Stefan, "A Note on the Assignment of Measurement Points for Frequency Counts in Structured Programs," BIT, 14 (1974), pp. 273-278.
Barta, Ben Zion and Nievergelt, Jurg, "An Interactive System for Automatic Examination of Programming Skills (ISAEP)," to be published 1975.
Bernstein, A. J., "Analysis of Programs for Parallel Programming," IEEE Transactions on Computers, Vol. EC-15, No. 5 (Oct., 1966), pp. 757-763.
Biss, K., Chien, R. and Stahl, F., "R2-A Natural Language Question-answering System," AFIPS Conference Proceedings, Spring Joint Computer Conference, 1971, AFIPS Press, Montvale, N. J.
Bohm, Corrado and Jacopini, Giuseppe, "Flow Diagrams, Turing Machines and Languages with Only Two Formation Rules," Comm. of ACM, May, 1966, Vol. 9, No. 5.
Carbonell, Jaime R., "AI in CAI: An Artificial Intelligence Approach to Computer-Assisted Instruction," IEEE Transactions on Man-Machine Systems, Dec., 1970.
Charniak, Eugene, "Toward a Model of Children's Story Comprehension," AI TR-266, MIT Artificial Intelligence Laboratory, Cambridge, Mass., 1972.
Cooper, D. C., "Bohm and Jacopini's Reduction of Flowcharts," Comm. of ACM, Vol. 10, No. 8 (August, 1967).
Dahl, O. J., Dijkstra, E. W. and Hoare, C.A.R., Structured Programming, Academic Press, New York, 1972.
Danielson, Ronald L., Nievergelt, Jurg, "An Automatic Tutor for Introductory Programming Students," SIGCSE Bulletin, Feb. 1975, Vol. 7, No. 1, pp. 47-50.
Denning, Peter J., "Guest Editor's Overview," Computing Surveys, Vol. 6, No. 4, Dec. 1974, pp. 209-211.
Derksen, J. A., Rulifson, J. F., Waldinger, R. J., "The QA4 Language Applied to Robot Planning," AFIPS Conference Proceedings, Vol. 41, Part 2, FJCC, 1972.
Dijkstra, Edsger W., "GOTO Statement Considered Harmful," Comm. of ACM, Vol. 11, No. 3, March 1968.
Dijkstra, Edsger W., "Notes on Structured Programming," T. H. Report 70 WSK-03, 2nd Edition, Technological University, Eindhoven, Netherlands, April 1970.
Dijkstra, Edsger W., "The Humble Programmer," Comm. of ACM, Vol. 15, No. 10, Oct. 1972.
Ernst, G. W. and Newell, A., GPS: A Case Study in Generality and Problem Solving, Academic Press, New York, 1969.
Floyd, Robert W., "Assigning Meaning to Programs," Proc. Symp. Appl. Math. 19, in J. T. Schwartz (ed.), Mathematical Aspects of Computer Science, American Mathematical Society, Providence, R. I., 1967.
Forsythe, G. E. and Wirth, N., "Automatic Grading of Programs," Comm. of ACM, 8 (1965), pp. 275-278.
Foulk, Clinton R., "Smooth Programs," a talk at Computer Science Conference 1973, Columbus, Ohio, Feb. 21, 1973.
Gerhart, Susan L., "Methods for Teaching Program Verification," SIGCSE Bulletin, Vol. 7, No. 1 (1975), pp. 172-176.
Gries, David, "On Structured Programming - A Reply to Smoliar," Comm. of ACM, Vol. 17, No. 11, Nov. 1974, pp. 655-657.
Gries, David, "Research in Programming and Programming Languages," talk at ACM Computer Science Conference '75, Washington, D. C., Feb. 18, 1975.
Henschen, Lawrence J., "A Resolution Style Proof Procedure for Higher-Order Logic," DCL Report #452, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, 1971.
Hewitt, Carl, "Procedural Embedding of Knowledge in PLANNER," Proceedings of the Second International Joint Conference on Artificial Intelligence, London, 1971.
Hoare, C.A.R., "Proof of a Program: FIND," Comm. of ACM, 14, 1, Jan. 1971, pp. 39-45.
Hopcroft, John E., Ullman, Jeffrey D., Formal Languages and their Relation to Automata, Addison-Wesley Publishing Company, Reading, Mass., 1969.
Hyde, Daniel C., "Instructor's Manual for PASF," DCL Report UIUCDCS-R-75-744, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, 1975.
Igarashi, S., London, R. L. and Luckham, D. C., "Automatic Program Verification I: Logical Basis and Its Implementation," Computer Science Report 365, Stanford University, Stanford, Cal., May, 1973.
Kasai, Takumi, "Translatability of Flowcharts into WHILE Programs," Journal of Computer and System Science, 9 (1974), pp. 177-195.
Katz, S. M., and Manna, Z., "A Heuristic Approach to Program Verification," Third International Joint Conference on Artificial Intelligence, Stanford, Cal. (August, 1973), pp. 500-512.
Keller, Robert M., "A Solvable Program-Schema Equivalence Problem," Fifth Annual Princeton Conference on Information Science and Systems, March, 1971.
Knuth, Donald E., "Structured Programming with GOTO Statements," Computing Surveys, Vol. 6, No. 4, Dec. 1974, pp. 261-301.
Knuth, Donald E., The Art of Computer Programming, Vol. 1, 2nd Edition, Addison-Wesley Publishing Company, Reading, Mass., 1973.
Knuth, Donald E. and Floyd, Robert W., "Notes on Avoiding 'GOTO' Statements," Information Processing Letters, Vol. 1, No. 1 (Feb., 1971), pp. 23-31, 77.
Krause, K. W., Smith, R. W. and Goodwin, M. A., "Optimal Software Test Planning Through Automated Network Analysis," Record of 1973 IEEE Symposium on Computer Software Reliability, May, 1973, pp. 18-22.
Lee, John A. N., Computer Semantics: Studies of Algorithms, Processors and Languages, Van Nostrand Reinhold Co., N. Y., 1972.
London, Ralph L., "The Current State of Proving Programs Correct," National Proceedings of ACM, August, 1972, Vol. I, pp. 39-46.
Manna, Zohar, Mathematical Theory of Computation, McGraw-Hill Book Company, New York, 1974.
Manna, Z., Ness, S., and Vuillemin, J., "Inductive Methods for Proving Properties of Programs," Proc. of an ACM Conference on Proving Assertions about Programs, SIGPLAN Notices, Vol. 7, No. 1, Jan. 1972.
Mateti, Prabhaker, "A Sorting Program Verifier: A Tutoring System for Sorting Programs," Ph.D. Thesis Proposal, April 1974, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois.
McCarty, John, Turing Award Lecture, ACM National Conference, Chicago, Illinois, August, 1971.
McCormick, B. H., Ray, S. R., Smith, K. C. and Yamada, S., "Illiac III: A Processor of Visual Information," DCL Report #183, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, June 1965.
Michie, Donald (editor), Machine Intelligence I, American Elsevier, New York, 1967.
Mills, Harlan D., "The New Math of Computer Programming," Comm. of the ACM, Jan. 1975, Vol. 18, No. 1.
Minsky, Marvin, Editor, Semantic Information Processing, MIT Press, Cambridge, Mass., 1968.
Minsky, Marvin and Papert, Seymour, "Research at the Laboratory in Vision, Language, and Other Problems of Intelligence," AI Memo No. 252, MIT, Jan. 1972.
Montanari, Ugo G., "Separable Graphs, Planar Graphs and Web Grammars," Information and Control, 16, #3 (May, 1970), pp. 243-267.
Nassi, I. and Shneiderman, Ben, "Flowchart Techniques for Structured Programming," SIGPLAN Notices, Vol. 8, #8, August 1973.
Naur, P., "Automatic Grading of Student's Algol Programming," BIT, 4 (1964), pp. 177-188.
Newell, Allen, Simon, Herbert A., Human Problem Solving, Prentice Hall, Englewood Cliffs, N. J., 1972.
Nievergelt, J., Reingold, E. M., and Wilcox, T. R., "The Automation of Introductory Computer Science Courses," in A. Gunther, et al. (Editors), International Computing Symposium 1973, North-Holland Publishing Co., 1974.
Nilsson, Nils J., Problem-Solving Methods in Artificial Intelligence, McGraw-Hill, N. Y., 1971.
Osterweil, Leon J. and Fosdick, Lloyd D., "Data Flow Analysis as an Aid in Documentation, Assertion Generation, Validation and Error Detection," Report #CU-CS-055-74, Department of Computer Science, University of Colorado, Boulder, Colorado, 1974.
Pfaltz, John L., and Rosenfeld, Azriel, "Web Grammars," Proceedings International Joint Conference on Artificial Intelligence, Washington, D. C., May 7-9, 1969.
Ray, S. R., and Preparata, F., "An Approach to Artificial Nonsymbolic Cognition," International Journal of Information Sciences, 4, 1:65-86, 1972.
Robinson, J. A., "A Machine Oriented Logic Based on the Resolution Principle," Journal of the ACM, Vol. 12 (1965), pp. 23-41.
Ruth, Gregory R., "Intelligent Program Analysis," unpublished paper, MIT, Cambridge, Mass., Feb. 4, 1974.
Stifle, Jack, "The Plato IV Student Terminal," CERL X-15, University of Illinois at Urbana-Champaign, Urbana, Illinois, March 1970.
Sussman, Gerald Jay, "Teaching of Procedures," MIT AI Lab., Memo 270, Cambridge, Mass., Oct. 1972.
Sussman, G. J., Winograd, T., Charniak, E., "Micro-PLANNER Reference Manual," MIT AI Lab., Memo #203A, Dec. 1971.
Turing, Alan M., "On Computable Numbers with an Application to the Entscheidungsproblem," Proc. London Math. Soc., 2-42, pp. 230-265, 1936.
Waldinger, R. J. and Levitt, K. N., "Reasoning About Programs," Artificial Intelligence, 5 (1974), pp. 235-316.
Wegbreit, B., "Heuristic Method for Mechanically Deriving Inductive Assertions," Third International Joint Conference on Artificial Intelligence, Stanford, Cal., pp. 524-536 (August, 1973).
Winograd, Terry, "Five Lectures on Artificial Intelligence," AI Lab., Memo AIM-246, Stanford University, Stanford, Cal., Sept. 1974.
Winograd, Terry, "Procedures as a Representation for Data in a Computer Program for Understanding Natural Language," Project Mac, MIT, MAC TR-84, MIT, Cambridge, Mass., 1971.
Wirth, N., "Program Development by Step-Wise Refinement," Comm. of ACM, Vol. 14, No. 4 (April, 1971), pp. 221-227.
Wirth, N., "The Remaining Trouble Spots in PASCAL," Department of Computer Science Colloquium, University of Illinois at Urbana-Champaign, April 7, 1975.

APPENDIX

STEPS IN THE DEDUCTION SCHEME FOR THREE EXAMPLES

The Appendix includes three examples of student flowcharts to demonstrate the steps in the deduction scheme (cf. Section 3.7) of the Program to Analyze Smooth Flowcharts (PASF).
The three examples are attempts by students to draw a smooth flowchart for the algorithm "intdivide" (integer divide by the method of successive subtraction). The first flowchart (Fig. A-1) shows the smooth flowchart inputted by the instructor as the "correct" answer to "intdivide." The other three flowcharts (Figs. A-2, A-5, and A-11) are attempts by students "maryjane" and "dan" to do the algorithm "intdivide."

At the bottom of each flowchart is displayed the Regular Expression (RE) representation (cf. Section 3.6.4) of that flowchart. This is normally not displayed to students.

On the pages following each student's flowchart are displayed the steps in the deduction scheme (cf. Fig. 3-14) in which PASF tries to show that the instructor's flowchart and the student's flowchart are equivalent. The student's Standard Regular Expression (SRE) (cf. Section 3.7.1) is denoted by "S." The instructor's SRE is denoted by "I." To facilitate reading, the portion of each SRE which was changed from the line above is underlined. These displays of the steps in the deduction scheme were computer generated by the program specifically for this report and are not shown to the student.

[Flowchart: student-instructor, Algorithm-intdivide]
RE = a←0 read b read c (a←a+1 b←b-c)*[b≥c] print a print b

Fig. A-1. Instructor's "correct" flowchart to algorithm "intdivide." At the bottom the Regular Expression (RE) representation of the flowchart is displayed.

[Flowchart: student-maryjane, Algorithm-intdivide]
RE = quot←0 read x1 read x2 (quot←quot+1 x1←x1-x2)*[x1≥x2] print quot print x1

Fig. A-2. Student "maryjane's" attempt at algorithm "intdivide." The flowchart is shown to be correct by the following two pages. At the bottom the Regular Expression (RE) representation of the flowchart is displayed.
Entering SELECTOR

S= int0:quot; inp:x1; inp:x2; (addquot1:quot; subx1x2:x1;)*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (adda1:a; subbc:b;)*[b≥c] outa:; outb:;

S= int0:quot; inp:x1; inp:x2; (P1 quot1:quot; subx1x2:x1;)*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P1 a1:a; subbc:b;)*[b≥c] outa:; outb:;

S= int0:quot; inp:x1; inp:x2; (P1 quot1:quot; P2 x1x2:x1;)*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P1 a1:a; P2 bc:b;)*[b≥c] outa:; outb:;

Entering REDUCOR

S= int0:quot; inp:x1; inp:x2; (P1 quot1:quot; P2 x1x2:x1;)*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P1 a1:a; P2 bc:b;)*[b≥c] outa:; outb:;

S= int0:quot; inp:x1; inp:x2; (P3 quot1x1x2:quotx1;)*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P3 a1bc:ab;)*[b≥c] outa:; outb:;

Entering SELECTOR

S= int0:quot; inp:x1; inp:x2; P4 x1x2quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; inp:b; inp:c; P4 bca1bc:;ab outa:; outb:;

S= int0:quot; P5 :x1; inp:x2; P4 x1x2quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; P5 :b; inp:c; P4 bca1bc:;ab outa:; outb:;

S= int0:quot; P5 :x1; P6 :x2; P4 x1x2quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; P5 :b; P6 :c; P4 bca1bc:;ab outa:; outb:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot1x1x2:;quotx1 outquot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab outa:; outb:;

Fig. A-3. Steps in the deduction scheme for the flowchart of Fig. A-2.
S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot1x1x2:;quotx1 P8 quot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; outb:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot1x1x2:;quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; P9 b:;

Entering REDUCOR

S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot1x1x2:;quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; P9 b:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot1x1x2:;quotx1 P10 quotx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P10 ab:;

S= P11 0:quotx1; P6 :x2; P4 x1x2quot1x1x2:;quotx1 P10 quotx1:;
I= P11 0:ab; P6 :c; P4 bca1bc:;ab P10 ab:;

S= P12 0:quotx1x2; P4 x1x2quot1x1x2:;quotx1 P10 quotx1:;
I= P12 0:abc; P4 bca1bc:;ab P10 ab:;

S= P13 01:quotx1x2; P10 quotx1:;
I= P13 01:abc; P10 ab:;

S= P14 01:quotx1x2;
I= P14 01:abc;

Your flowchart is correct for algorithm intdivide.
Press -NEXT- to do another algorithm.

Fig. A-4. Continuation in the steps in the deduction scheme for flowchart of Fig. A-2.

[Flowchart: student-dan, Algorithm-intdivide]
RE = quot←0 read x1 read x2 (x1←x1-x2 quot←quot+1)*[x2≥x1] print quot print x1

Fig. A-5. Student "dan's" attempt at algorithm "intdivide." The flowchart is shown to be not correct by the following five pages (diamond should have x1 ≥ x2). At the bottom the Regular Expression (RE) representation of the flowchart is shown.
Entering SELECTOR

S= int0:quot; inp:x1; inp:x2; (subx1x2:x1; addquot1:quot;)*[x2≥x1] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (adda1:a; subbc:b;)*[b≥c] outa:; outb:;

S= int0:quot; inp:x1; inp:x2; (subx1x2:x1; P1 quot1:quot;)*[x2≥x1] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P1 a1:a; subbc:b;)*[b≥c] outa:; outb:;

S= int0:quot; inp:x1; inp:x2; (P2 x1x2:x1; P1 quot1:quot;)*[x2≥x1] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P1 a1:a; P2 bc:b;)*[b≥c] outa:; outb:;

Entering REDUCOR

S= int0:quot; inp:x1; inp:x2; (P2 x1x2:x1; P1 quot1:quot;)*[x2≥x1] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P1 a1:a; P2 bc:b;)*[b≥c] outa:; outb:;

Entering TRANSLATOR B

Entering REDUCOR after interchange in TRANSLATOR B

S= int0:quot; inp:x1; inp:x2; (P1 quot1:quot; P2 x1x2:x1;)*[x2≥x1] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P1 a1:a; P2 bc:b;)*[b≥c] outa:; outb:;

S= int0:quot; inp:x1; inp:x2; (P3 quot1x1x2:quotx1;)*[x2≥x1] outquot:; outx1:;
I= int0:a; inp:b; inp:c; (P3 a1bc:ab;)*[b≥c] outa:; outb:;

Entering SELECTOR

S= int0:quot; inp:x1; inp:x2; P4 x2x1quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; inp:b; inp:c; P4 bca1bc:;ab outa:; outb:;

S= int0:quot; P5 :x1; inp:x2; P4 x2x1quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; P5 :b; inp:c; P4 bca1bc:;ab outa:; outb:;

Fig. A-6. Steps in the deduction scheme for the flowchart of Fig. A-5.
S= int0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; P5 :b; P6 :c; P4 bca1bc:;ab outa:; outb:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 outquot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab outa:; outb:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P8 quot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; outb:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; P9 b:;

Entering REDUCOR

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; P9 b:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P10 quotx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P10 ab:;

S= P11 0:quotx1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P10 quotx1:;
I= P11 0:ab; P6 :c; P4 bca1bc:;ab P10 ab:;

S= P12 0:quotx1x2; P4 x2x1quot1x1x2:;quotx1 P10 quotx1:;
I= P12 0:abc; P4 bca1bc:;ab P10 ab:;

Backtrack has taken place; stack popped

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 outquot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab outa:; outb:;

Fig. A-7. Continuation in the steps in the deduction scheme for flowchart in Fig. A-5.
Entering SELECTOR

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 outx1:; outquot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab outa:; outb:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P8 x1:; outquot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; outb:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P8 x1:; P9 quot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; P9 b:;

Entering REDUCOR

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P8 x1:; P9 quot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P8 a:; P9 b:;

S= P7 0:quot; P5 :x1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P10 x1quot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:;ab P10 ab:;

S= P11 0:quotx1; P6 :x2; P4 x2x1quot1x1x2:;quotx1 P10 x1quot:;
I= P11 0:ab; P6 :c; P4 bca1bc:;ab P10 ab:;

S= P12 0:quotx1x2; P4 x2x1quot1x1x2:;quotx1 P10 x1quot:;
I= P12 0:abc; P4 bca1bc:;ab P10 ab:;

Backtrack has taken place; stack popped

S= int0:quot; inp:x1; inp:x2; P4 x2x1quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; inp:b; inp:c; P4 bca1bc:;ab outa:; outb:;

Entering SELECTOR

S= int0:quot; inp:x2; inp:x1; P4 x2x1quot1x1x2:;quotx1 outquot:; outx1:;
I= int0:a; inp:b; inp:c; P4 bca1bc:;ab outa:; outb:;

Fig. A-8. Continuation in the steps in the deduction scheme for flowchart in Fig. A-5.
S= int0:quot; P5 :x2; inp:x1; P4 x2x1quot 1x1x2:; quotx1 outquot:; outx1:;
I= int0:a; P5 :b; inp:c; P4 bca1bc:; ab outa:; outb:;
S= int0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 outquot:; outx1:;
I= int0:a; P5 :b; P6 :c; P4 bca1bc:; ab outa:; outb:;
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 outquot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab outa:; outb:;
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P8 quot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; outb:;
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; P9 b:;
Entering REDUCOR
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; P9 b:;
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P10 quotx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P10 ab:;
S= P11 0:quotx2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P10 quotx1:;
I= P11 0:ab; P6 :c; P4 bca1bc:; ab P10 ab:;
S= P12 0:quotx2x1; P4 x2x1quot 1x1x2:; quotx1 P10 quotx1:;
I= P12 0:abc; P4 bca1bc:; ab P10 ab:;
Backtrack has taken place; stack popped
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 outquot:; outx1:;

Fig. A-9. Continuation of the steps in the deduction scheme for flowchart in Fig. A-5.

I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab outa:; outb:;
Entering SELECTOR
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 outx1:; outquot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab outa:; outb:;
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P8 x1:; outquot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; outb:;
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P8 x1:; P9 quot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; P9 b:;
Entering REDUCOR
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P8 x1:; P9 quot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; P9 b:;
S= P7 0:quot; P5 :x2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P10 x1quot:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P10 ab:;
S= P11 0:quotx2; P6 :x1; P4 x2x1quot 1x1x2:; quotx1 P10 x1quot:;
I= P11 0:ab; P6 :c; P4 bca1bc:; ab P10 ab:;
S= P12 0:quotx2x1; P4 x2x1quot 1x1x2:; quotx1 P10 x1quot:;
I= P12 0:abc; P4 bca1bc:; ab P10 ab:;
Your flowchart is NOT correct for algorithm intdivide.
Press -NEXT- for a list of POSSIBLE errors

Fig. A-10. Continuation of the steps in the deduction scheme for flowchart in Fig. A-5.
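The traces make one difference between the rejected and the accepted flowchart easy to see: the P4 entry reads x2 x1 quot 1 x1 x2 in Figs. A-7 to A-10 and x1 x2 quot 1 x1 x2 in Figs. A-12 and A-13, while the instructor's entry is b c a 1 b c in both. The following Python sketch is only an illustration of that observation and is not the deduction scheme of PASF. It assumes that each entry can be split into separate symbols as shown, and the function name `consistent_renaming` is hypothetical. It checks whether one symbol sequence can be turned into the other by a consistent one-to-one renaming.

```python
def consistent_renaming(student, instructor):
    """Return a one-to-one symbol map student -> instructor, or None if none exists."""
    if len(student) != len(instructor):
        return None
    forward, backward = {}, {}
    for s, i in zip(student, instructor):
        # setdefault records the first pairing; later occurrences must agree with it.
        if forward.setdefault(s, i) != i or backward.setdefault(i, s) != s:
            return None
    return forward


# Symbol sequences read off the P4 entries (the tokenization is an assumption).
instructor = ["b", "c", "a", "1", "b", "c"]
rejected = ["x2", "x1", "quot", "1", "x1", "x2"]  # Figs. A-7 to A-10
accepted = ["x1", "x2", "quot", "1", "x1", "x2"]  # Figs. A-12 and A-13

print(consistent_renaming(rejected, instructor))
print(consistent_renaming(accepted, instructor))
```

For the rejected sequence no renaming exists, because x1 would have to stand for both c and b. For the accepted sequence the check pairs x1 with b, x2 with c, and quot with a, which agrees with the outcome reported in Fig. A-13.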
[Fig. A-11. Flowchart entered by student dan for algorithm intdivide. Legible boxes: start; quot ← 0; read x1; read x2. The remainder of the figure and its caption are not recoverable.]

[The opening lines of the trace in Fig. A-12 are not recoverable.]
Entering REDUCOR
S= int0:quot; inp:x1; inp:x2; ( P2 x1x2:x1; P1 quot1:quot; )*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; ( P1 a1:a; P2 bc:b; )*[b≥c] outa:; outb:;
Entering TRANSLATOR B
Entering REDUCOR after interchange in TRANSLATOR B
S= int0:quot; inp:x1; inp:x2; ( P1 quot1:quot; P2 x1x2:x1; )*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; ( P1 a1:a; P2 bc:b; )*[b≥c] outa:; outb:;
S= int0:quot; inp:x1; inp:x2; ( P3 quot1x1x2:quotx1; )*[x1≥x2] outquot:; outx1:;
I= int0:a; inp:b; inp:c; ( P3 a1bc:ab; )*[b≥c] outa:; outb:;
Entering SELECTOR
S= int0:quot; inp:x1; inp:x2; P4 x1x2quot 1x1x2:; quotx1 outquot:; outx1:;
I= int0:a; inp:b; inp:c; P4 bca1bc:; ab outa:; outb:;
S= int0:quot; P5 :x1; inp:x2; P4 x1x2quot 1x1x2:; quotx1 outquot:; outx1:;
I= int0:a; P5 :b; inp:c; P4 bca1bc:; ab outa:; outb:;

Fig. A-12. Steps in the deduction scheme for the flowchart in Fig. A-11.
S= int0:quot; P5 :x1; P6 :x2; P4 x1x2quot 1x1x2:; quotx1 outquot:; outx1:;
I= int0:a; P5 :b; P6 :c; P4 bca1bc:; ab outa:; outb:;
S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot 1x1x2:; quotx1 outquot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab outa:; outb:;
S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot 1x1x2:; quotx1 P8 quot:; outx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; outb:;
S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot 1x1x2:; quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; P9 b:;
Entering REDUCOR
S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot 1x1x2:; quotx1 P8 quot:; P9 x1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P8 a:; P9 b:;
S= P7 0:quot; P5 :x1; P6 :x2; P4 x1x2quot 1x1x2:; quotx1 P10 quotx1:;
I= P7 0:a; P5 :b; P6 :c; P4 bca1bc:; ab P10 ab:;
S= P11 0:quotx1; P6 :x2; P4 x1x2quot 1x1x2:; quotx1 P10 quotx1:;
I= P11 0:ab; P6 :c; P4 bca1bc:; ab P10 ab:;
S= P12 0:quotx1x2; P4 x1x2quot 1x1x2:; quotx1 P10 quotx1:;
I= P12 0:abc; P4 bca1bc:; ab P10 ab:;
S= P13 01:quotx1x2; P10 quotx1:;
I= P13 01:abc; P10 ab:;
S= P14 01:quotx1x2;
I= P14 01:abc;
Your flowchart is correct for algorithm intdivide.
Press -NEXT- to do another algorithm.

Fig. A-13. Continuation of the steps in the deduction scheme for flowchart in Fig. A-11.

VITA

Daniel Clair Hyde was born in LeRoy, New York on March 18, 1946. He received his Bachelor of Science degree cum laude in Electrical Engineering from Northeastern University, Boston, Massachusetts in 1969. As a participant in the cooperative work program at Northeastern University, he was employed at Taylor Instrument Companies, Rochester, New York, and the Office of Academic Research, Northeastern University. He joined the University of Illinois at Urbana-Champaign as a graduate teaching assistant in the Department of Computer Science in 1969.
While a teaching assistant, he taught courses in introductory programming, introduction to theory of digital machines, design of digital switching circuits, and computer aided instruction. In 1971, he was awarded an NDEA Title IV Fellowship for two years. He conducted research in the areas of computer-based education, artificial intelligence, and software engineering under the guidance of Professor Sylvian R. Ray. From 1973 to 1975 he was employed by the Computer-based Education Research Laboratory to conduct research on the PLATO IV Project in the area of military training. He is the coauthor with Professor Bruce L. Hicks of a paper entitled "Teaching about CAI," which was published in Journal of Teacher Education, Summer of 1973. He has been honored by membership in Eta Kappa Nu and Tau Beta Pi and by associate membership in Sigma Xi honor societies. He is a member of Institute of Electrical and Electronics Engineers and Association for Computing Machinery.

BIBLIOGRAPHIC DATA SHEET
1. Report No.: UIUCDCS-R-75-743
3. Recipient's Accession No.:
4. Title and Subtitle: ANALYZING SMOOTH FLOWCHARTS: TEACHING STRUCTURED PROGRAMMING IN A COMPUTER-BASED EDUCATION ENVIRONMENT
5. Report Date: June, 1975
7. Author(s): Daniel Clair Hyde
8. Performing Organization Rept. No.:
9. Performing Organization Name and Address: Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61820
10. l'ro|