 
 Report No. UIUCDCS-R-76-832 
 
 
 AN AUTOMATIC VERIFIER FOR A CLASS OF SORTING PROGRAMS 
 
 by 
 PRABHAKER MATETI 
 
 October 1976 
 
 DEPARTMENT OF COMPUTER SCIENCE 
 UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN 
 
 URBANA, ILLINOIS 
 
 
 
 This work was supported in part by the National Science Foundation under 
 Grant No. NSF EC 41511 and was submitted in partial fulfillment of the 
 requirements for the degree of Doctor of Philosophy in Computer Science, 
 October 1976. 
 
 
 ACKNOWLEDGMENTS 
 
 I am indebted to Jurg Nievergelt, my thesis advisor, for his 
 encouragement, interest, help, advice and patience. 
 
I am grateful to Jayadev Misra, who has significantly influenced
my thinking about programming; to Dave Plaisted for his help during the
development of the theorem prover; to Dave Eland, who always helped me
out of TUTOR troubles; and to Ron Danielson, without whose critical
reading this thesis would have been even poorer in style.

I am thankful to Bert Speelpenning for a number of useful
discussions, and to Wilfred Hansen for his encouragement and criticism.
 
 
Table of Contents

1. INTRODUCTION
   1.1 Automatic Program Verification
   1.2 Limited Domain Program Verifiers
   1.3 Program Verification in Teaching Programming
   1.4 The Sorting Program Verifier

2. VERIFIER
   2.1 Inductive Assertion Method of Verification
   2.2 The Programming and Assertion Languages
   2.3 Verification Condition Generator

3. THEOREM PROVER
   3.1 Basic Theorem Prover
   3.2 Basic Theorem Prover is a Decision Procedure
   3.3 Evaluation of Backward Functions
   3.4 Extended Theorem Prover is a Decision Procedure
   3.5 Counterexample Generation

4. GENERALITY
   4.1 Constraints of the Present Verification System
   4.2 Partitioning
   4.3 Closure and Local Implication
   4.4 Examples
   4.5 On the Applicability of Partitioning

5. SORTLAB
   5.1 PLATO
   5.2 ACSES
   5.3 SORTLAB--A Programming Laboratory

6. DISCUSSION
   6.1 A Critique of Program Verifiers
   6.2 Previous Work Related to This Thesis
   6.3 Salient Features of the Sorting Program Verifier
   6.4 Conclusion

REFERENCES

APPENDIX

VITA

 
1. INTRODUCTION
 
This thesis discusses primarily the theoretical basis of a
verifier for sorting programs designed for use in an automatic tutor
for computer programming. A system called SORTLAB has been built that
embeds the sorting program verifier. SORTLAB allows a student to write
programs for sorting an array, and decides whether these programs are
correct; if they are not, it generates counterexamples. SORTLAB has
been implemented on the PLATO system for computer-aided instruction
[Alpert and Bitzer 1970], as a part of an Automated Computer Science
Education System, ACSES [Nievergelt 1975].

The development of SORTLAB required several different components,
in particular:

- a programming language convenient enough to write programs
  for sorting an array and not necessarily other programs,

- an assertion language with just the required expressive power
  to assert the state of an array with respect to the order
  of its elements,

- special purpose techniques for verifying these programs
  with assertions, including a theorem prover for the class of
  lemmas generated by the verifier.
 

1.1 Automatic Program Verification

With the increased concern for program reliability, the verification
of programs is receiving greater attention than ever before. The
verification process consists of checking whether the program meets its
specifications, namely, that it always terminates, and when it does,
certain variables have a desired property provided the input given to
the program meets the input specification.
 
The inductive assertion method of proving programs considers
these two problems separately: that the program meets its input/output
specifications is proven separately from its termination.
The method also requires that an invariant property of the program
variables be given for every loop. Given these specifications and loop
assertions, a set of mathematical lemmas is generated, which depends on
the assertions given and on the semantics of the programming language
used. If these lemmas are true, the correctness of the program is
guaranteed; thus proving that a program meets its specification is
equivalent to proving a certain set of lemmas. This is the crux of the
problem.
 
Much of the verification process is of a very mechanical
nature, and unless a large part of the process is carried out by the
computer, few programmers would be willing to hand-verify their programs.
A number of program verifiers have been constructed (see the survey by
London [1972]), requiring varying degrees of human intervention. However,
these are far from being helpful to a programmer, for several reasons.
Typically, human aid is required in pruning the proof-trees. An ordinary
programmer is not trained in theorem-proving, and is usually not interested
in how these lemmas are proved. In addition, if the program is incorrect,
the verifiers cannot provide assistance either by generating a
counterexample, or by pointing out where the error lies. Finally, the
verifiers are slow in operation, even for small programs. These conditions
combine to tempt a programmer to test-run his programs rather than submit
them to an automatic verifier!
 
1.2 Limited Domain Program Verifiers
 
The failure in constructing verifiers that are mechanical aids
to program writing can be attributed largely to the ambitious approach
taken in building these verifiers. Except for the earliest of the
verifiers [King 1969], the others have been increasingly ambitious in the
variety of programs they intended to verify. The wide scope of programs
being proven requires that the programs be written using elementary but
powerful operations. Further, the assertions, and hence the lemmas
generated, have to be formulated in first-order predicate calculus (or
the equivalent), which is theoretically undecidable. By increasing the
power of the theorem provers, we not only make them nondecision procedures,
but they also lose a sense of direction toward their goal. A large number
of useless inferences are then generated. Even among the decidable
domains of problems, the theorem provers must be carefully designed in
order to yield a decision procedure that works in practice. A "good"
theorem prover should prove very quickly a large class of theorems that
are often encountered, while it may take a while to decide about
others.
 
A verifier and its theorem prover can become simple if they
incorporate certain aspects of the semantics of the problem domain. For
example, most well-written nonnumeric programs manipulate their data
structures in a "disciplined and uniform" way, which is as yet not
formally characterizable. The verification lemmas arising from such
programs seem to be of a different nature from those that may arise in
ordinary mathematics, say, number theory. If this is indeed the case,
the underlying formal system may be decidable. If so, an incorrect
program may be proven to be incorrect, counterexample generation may be
feasible, and fast theorem-proving procedures may exist for the specific
class of lemmas.
 
Strictly speaking, every program verifier constructed so far
is a limited domain verifier. For example, the programs being verified
are often limited to those that operate on integer-valued variables. But
we mean to limit the domain even further. Some examples of such domains
are programs operating on linear arrays with no arithmetic, those using
lists, binary trees, etc.
 
It is doubtful whether it would ever be possible to construct
successful general purpose program verifiers. On the other hand, practical
verifiers dealing with programs from a limited domain of discourse can be
designed. This thesis provides one such example, namely, a verifier for
in-place sorting programs which is being used in an automatic tutor of
computer programming.
 
1.3 Program Verification in Teaching Programming
 
It is important that a student programmer realize the need for
program reliability. A concern for the correctness of programs at an
early stage in one's education has great impact on one's attitudes
toward programming in later years. As exemplified by Dijkstra and others,
a systematic method of designing abstract programs depends heavily on
the correctness proofs of programs. The "elegance" of a program is
usually directly proportional to the ease with which it can be proven
correct. There can be no question that one's understanding of one's own
program is increased greatly after inventing the loop assertions for the
program. Quite often one discovers better ways of writing the program.
 
In teaching programming, one would like to supervise the program
design process by the student, as well as examine thoroughly the
finished product. Both these aspects are amenable to computerization,
particularly if an interactive computer system is available. The teacher
program supervising the program design process should be an expert in
the programming problem domain, must have an "opinion" about various
design methodologies, and, perhaps more importantly, be able to converse
with the student in a reasonable language. If the problem domain is
sufficiently simple, such teaching programs can indeed be designed. For
an example of such a system, see [Danielson 1975].
 
On the other hand, a teacher program examining the student's
finished program will not, and should not, consider the design process.
Regardless of how the program was constructed, judging its correctness
and elegance should be the teacher program's concern. The teacher program
may, at one extreme, simply test run the student program, or, at the other
extreme, attempt to formally verify the student program. Such a teacher
program should contain at least a program editor, a run-time system, a
program verifier, and a counterexample generator.
 
Apart from these technical qualifications required of the
teacher programs, they should be fast enough to give interactive response
to the student. These considerations lead us to write a specialized teacher
program with built-in knowledge of a programming domain, resulting in an
interactive programming laboratory, SORTLAB, wherein the student can
prepare a sorting program, and use the program verifier iteratively until
a correct program is obtained.
 
1.4 The Sorting Program Verifier
 
We have chosen in-place sorting as the limited domain of
discourse in SORTLAB for two main reasons. First, every program
verifier constructed so far has verified several sorting programs; their
authors quote, quite often exclusively, these examples. This gives us
a basis for comparison. Secondly, sorting programs are perhaps the most
used examples in introductory programming courses.
 
The verifier can actually prove any program, sorting or not,
written in our mini-programming language and whose behavior can be
asserted in the assertion language. (See Sections 2.2 and 3.1 for a
description of these languages.) If the program is not proven correct,
then there must be mathematical "lemma"(s) generated from the program
and its assertions which are false. The verifier can generate
counterexamples to these lemmas.
 
 
 2. VERIFIER 
 
Every program operates on a certain set of data objects and
aims to produce an output set of data objects with desired properties.
A subset of these data objects, the input, is given to the program, and
the remaining data objects are the result of program execution. Quite
often, the input changes in its structure, data objects get created or
destroyed, their structure and relationships change. The program is
expected to realize a desired property on the output only if the input
meets certain requirements. To this end, the programmer asserts what
relationships are to hold on the input data objects, and what holds on
the output.
 
 2.1 Inductive Assertion Method of Verification 
 
Given the input and output assertions, say $\phi$ and $\psi$, we are
interested in verifying that the program P behaves properly, i.e., whenever P
is given input satisfying $\phi$, the output satisfies $\psi$, if and when P
terminates. Notationally, following [Manna and Pnueli 1974], let us express
this statement by:

    $\{\phi \mid P \mid \psi\}$                                        (2.1)

The program P is said to be partially correct with respect to $\phi$ and
$\psi$ if (2.1) is true. (Occasionally we refer to $\phi$ and $\psi$ as the
entry and exit assertions of P when P is a program segment.) P is said to be
totally correct if, in addition to (2.1) being true, P always terminates. In
this thesis we will be dealing with partial correctness only, and
henceforth refer to this simply as correctness. We shall use "prove
a program segment" as an abbreviation of "prove a program segment correct
with respect to its entry and exit assertions."
 
The proof of (2.1) is trivial, in theory, if the set of data
objects satisfying the entry assertion $\phi$ is finite, since the program
can be checked separately on each of these data objects. However, in
general this set is infinite, and even if it is finite, it is usually
such a large set that separate treatment of each data object is not
practical.
 
One of the most widely used verification techniques, the
inductive assertion method [Floyd 1967, Naur 1966], divides the verification
process into two phases. First, a set of mathematical lemmas
("verification conditions") is generated, which, if proven, is sufficient to
imply the correctness of the program. The second phase is the proof of the
lemmas thus generated.
 
The generation of the verification conditions is intimately
linked to the semantics of the programming language being used, and to
the structure of the program at hand. The program being proven is
decomposed into loop-free segments such that each segment has an entry
assertion and an exit assertion. Any computation performed by the program is
then a concatenation of executions of some selected segments. Thus, if
each segment is correct with respect to its entry and exit assertions, using
induction on the number of loop iterations, it can be proved that every
computation with input satisfying the input assertion of the program
yields output satisfying the output assertion when the program terminates.
Once the validity of this inductive proof is established, to prove a
given program, only the decomposed loop-free segments need be proven.
Requiring each loop to include an invariant assertion guarantees the
decomposition of the program into loop-free segments each with entry
and exit assertions.
 
Section 2.3 describes the generation of these lemmas. The
forward substitution method is touched upon only briefly as we shall be
using the backward substitution method. The programming and assertion
languages used are described informally in Section 2.2; their formal
specification appears in Chapter 5. Figures 2.1 and 5.2 give examples of
programs written in our languages. A decision procedure for the lemmas
is presented in Chapter 3.
 
 2.2 The Programming and Assertion Languages 
 
In the interest of developing a fast and small verifier, we
limit the domain of programs. But the variety of programs cannot be
limited by a programming language alone. As is well known [McCarthy
1960], a programming language rich enough to include a successor function,
a conditional, and recursion is universal in the sense that any recursive
function can be programmed in this language. A programming language, by
imposing constraints and providing certain kinds of primitive operations
while eliminating others, can only make it very inconvenient, but not
impossible, to write certain programs.
 
On the other hand, an appropriately chosen assertion language
can limit the kind of programs that can be asserted in that language. A
more general assertion language will use elementary, but powerful,
atomic predicates. This use of elementary predicates makes it difficult
to lump together all related predicates. The loss of power of expression
in a limited assertion language is compensated for by the large
and recognizable chunks of properties in the assertion. Further, while
it has been advocated that theorem provers make large inferences, it
appears necessary that related information should be recognizable as
such before large inferences can be made.

     1  procedure sort (n)
     *    true
     2    scan down with i from n to 2
     3      scan up with j from 1 to i-1
     4        if xj > xj+1 then
     5          exchange xj with xj+1
     6        else
     7        end if
     *        1 ≤ J < I ≤ N & A(1;J) ≤ XJ+1 & A(1;I) ≤ S(I+1;N)
     8      endscan
     *      1 < I ≤ N & A(1;I-1) ≤ S(I;N)
     9    endscan
     *    S(1;N)
    10  endproc

    Abbreviations
      A(                          for  array(
      S(                          for  sorted(
      array(s,t) ≤ sorted(u,v)    for  array(s,t) ≤ array(u,v) and
                                       sorted(u,v)

Figure 2.1 A Bubble Sort Program with Assertions
 
These considerations led us to design a mini-programming
language and an assertion language which are specific to the sorting of
arrays. Formal specification of the language is given in Chapter 5.
Below we touch upon only the salient features.
 
 2.2.1 Programming Language 
 
 Operations on Keys 
 
It is a well-recognized principle in program design that basic
procedures, specific to the particular problem and the data structures
being used, should be developed and used so that data integrity may be
preserved [Dahl et al. 1972]. Sorting programs must conserve the keys
they are sorting. Hence, we provide two basic operations, exchange and
insertion of keys, and forbid value assignments to the keys of the array.
This guarantees that the elements of the array are conserved throughout
the program. Therefore, our verifier need only prove that the array is
sorted.
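
The conservation argument can be illustrated with a small Python sketch.
The names `exchange` and `insert_below`, the 0-based list representation,
and in particular the reading of insert as "remove the key at one position
and put it back at another, shifting the keys in between" are assumptions
made for illustration only; the formal semantics of the two operations are
those of the language specification, not of this sketch.

```python
from collections import Counter

def exchange(x, a, b):
    """Swap the keys at (0-based) positions a and b in place."""
    x[a], x[b] = x[b], x[a]

def insert_below(x, a, b):
    """Assumed semantics: move the key at position a so that it ends up
    at position b, shifting the keys in between by one place."""
    key = x.pop(a)
    x.insert(b, key)

keys = [5, 3, 8, 3, 1]
before = Counter(keys)        # multiset of keys before any operation

exchange(keys, 0, 4)          # keys is now [1, 3, 8, 3, 5]
insert_below(keys, 3, 1)      # keys is now [1, 3, 3, 8, 5]

# Neither operation creates or destroys a key, so the multiset is unchanged.
conserved = Counter(keys) == before
```

Because every change to the array goes through operations of this kind,
the "same keys" half of a sorting specification holds by construction and
only orderedness is left to prove.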
 
 
 Operations on Array Indices 
 
Successor and predecessor functions on the indices ("ptrs")
of the array provide sequential access to the elements. A ptr variable
may be assigned the value of a ptr expression, which is of the form
<ptr variable> + <integer constant>.
 
 
 Procedures 
 
 In our verification system, we assume that the "intention" of 
 any procedure is to produce a certain permutation of the elements of the 
 array x, which is global to all procedures. Most procedures, however, 
 permute only the elements belonging to a certain contiguous segment, say 
 array (a,b); thus, we require that each procedure have exactly two input 
 (formal) ptr variables a, b. All elements not belonging to array (a,b) 
 are made "read only" to this procedure; no such element is permitted to 
 participate in any exchange or insert operation. "Guard expressions" 
 are provided for this (see also [Marmier 1975], p. 57). 
 
For simplicity, we insist that the formal ptr variables a, b
not be subject to assignment (i.e., they may not appear on the left hand
side of any assignment). The two variables must, of course, be distinct.
Optionally, a procedure may return ptr results to the calling procedure.
There are no global ptr variables. An entry and an exit assertion for
the body of the procedure must be given.
 
 Procedure Calls 
 
A procedure call must contain two ptr expressions as the actual
input parameters for the procedure called. If the called procedure has
output parameters, the call must receive these results in distinct
ptr variables. We further insist that these variables be distinct
from those appearing in the input ptr expressions; this is done for
the sake of simplicity.
 
For each call statement, we require that an entry assertion
to the call be given. Thus, the user must give not only the loop
invariants and the entry and exit assertions for procedures, but also an
entry assertion for each call (see Section 6.1.2 for a related discussion).
 
 Control Structures 
 
In addition to the familiar if and while statements, a scan
statement (similar to the for statement of other languages) is provided.
The loop variable of scan, however, is not considered "unmodifiable" by
its body.
 
 2.2.2 Assertion Language 
 
 
 Array Predicates 
 
When sorting arrays with sequential access, we generally need
atomic predicates to indicate that:

1. The array segment from index s to index t is sorted
   $\equiv$ if $s \le i \le j \le t$ then $x_i \le x_j$, and

2. Elements of the segment from s to t are all less than or equal to
   any element of the segment from u to v
   $\equiv$ if $s \le i \le t$ and $u \le j \le v$ then $x_i \le x_j$

where x is the name of the array, and an index is a ptr expression.
These array predicates are abbreviated as:

    sorted(s,t)
and
    array(s,t) ≤ array(u,v)

respectively. The segments are defined by their lower boundaries s, u
and the upper boundaries t, v.
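
As a rough illustration, the two array predicates can be written as
executable checks. This is a minimal sketch, assuming 1-based inclusive
segment bounds as in the text and reading the comparison as "less than or
equal"; the function names `sorted_seg` and `seg_le` are hypothetical and
are not part of the assertion language.

```python
def sorted_seg(x, s, t):
    """sorted(s,t): x_i <= x_j whenever s <= i <= j <= t (1-based, inclusive).
    Checking adjacent pairs is enough because <= is transitive."""
    return all(x[i - 1] <= x[i] for i in range(s, t))

def seg_le(x, s, t, u, v):
    """array(s,t) <= array(u,v): every key in s..t is <= every key in u..v."""
    return all(x[i - 1] <= x[j - 1]
               for i in range(s, t + 1)
               for j in range(u, v + 1))

x = [1, 2, 5, 7, 9]
y = [3, 1, 4]

whole_sorted = sorted_seg(x, 1, 5)        # whole array is sorted
split_ok     = seg_le(x, 1, 2, 3, 5)      # {1,2} lie below {5,7,9}
y_unsorted   = not sorted_seg(y, 1, 3)    # 3 > 1 breaks sortedness
y_tail       = sorted_seg(y, 2, 3)        # but the tail (1,4) is sorted
empty_seg    = sorted_seg(y, 3, 2)        # an empty segment holds vacuously
```

Note that both predicates are vacuously true on empty segments, which is
what makes boundary cases such as a segment from n+1 to n harmless.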
 
 Ptr Predicates 
 
Predicates relating the indices of the array will also be
needed:

    ptr i is at least c units below j  $\equiv$  $i + c \le j$

where i, j are ptr variables and c an integer constant, and i + c is a
ptr expression.
 
 Assertions 
 
An assertion, then, is a sentence formed of these basic
predicates and the logical connectives and and or. Notice the absence of
negation in this language, which makes it impossible to assert that an
array is NOT sorted. However, ptr predicates, e.g., $i + c \le j$, can be
negated as $j + (1-c) \le i$. Henceforth, the assertions will be written
informally; e.g., $i + c > j$ rather than $j + 1 - c \le i$, or $i = j$
rather than $i + 0 \le j$ and $j + 0 \le i$.
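
The claim that ptr predicates are closed under negation relies on the
indices being integers. A small exhaustive check over a bounded range
makes this concrete; the helper name `below` and the chosen range are
assumptions of this sketch, and a bounded check is of course an
illustration rather than a proof.

```python
def below(i, c, j):
    """The ptr predicate 'i is at least c units below j', i.e. i + c <= j."""
    return i + c <= j

R = range(-6, 7)

# not (i + c <= j)  is the same as  j + (1 - c) <= i  over the integers.
negation_ok = all((not below(i, c, j)) == below(j, 1 - c, i)
                  for i in R for j in R for c in R)

# i = j  is the conjunction  i + 0 <= j  and  j + 0 <= i.
equality_ok = all((i == j) == (below(i, 0, j) and below(j, 0, i))
                  for i in R for j in R)
```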
 
 2.3 Verification Condition Generator 
 
We first discuss the generation of verification conditions of
a simple loop program segment W with a loop-free body S.

    W: while B do
           S
       endwhile                                                    (2.2)

This is then generalized to cover arbitrary procedures. Two general
methods for the generation of verification conditions are forward
substitution and backward substitution [King 1969].

2.3.1 Forward Substitution
 
Let $\phi_W$ be the entry assertion of W, and $\phi_S$ be the entry
assertion of S. We then symbolically execute S on $\phi_S$ to obtain an
assertion Sf($\phi_S$). Then the exit assertion of W is generated as

    $\phi_W$ and not B  or  Sf($\phi_S$) and not B                   (2.3)

The lemmas to be proven are

    $\phi_W$ and B  logically implies  $\phi_S$                       (2.4)
    Sf($\phi_S$) and B  logically implies  $\phi_S$                   (2.5)
 
 
Proving (2.4) and (2.5) guarantees that the entry assertion $\phi_S$ of S
will be true each time S is entered.

The assertion Sf($\phi_S$) can be obtained by forward substitution as
follows: If S is empty then Sf($\phi_S$) is the same as $\phi_S$. Otherwise,
let S be a concatenation of S1 and S2, where S2 may be empty. Then we obtain
Sf($\phi_S$) by recursively applying rules F1 and F2 defined as follows:

Rule F1 (applicable iff S1 is an assignment statement)

    Let S1 be u ← t where t is an expression.
    Then Sf($\phi_S$) is

    S2f((subst $u'$ for u in $\phi_S$) and u = (subst $u'$ for u in t))   (2.6)

Rule F2 (applicable iff S1 is an if statement)

    Let S1 be
        if B1
        then S3
        else S4
        endif
    Then Sf($\phi_S$) is

    S2f(S3f($\phi_S$ and B1) or S4f($\phi_S$ and not B1))               (2.7)

where subst y for z in F stands for the expression obtained by
substituting y for all occurrences of z in the expression F. The variable
$u'$ refers to the previous value (before S1) of the variable u; thus (2.6)
asserts the existence of a value for $u'$ which satisfied $\phi_S$ prior to
S1 and which is to be used in the expression t. This introduction of
existential quantifiers causes certain technical difficulties to our theorem
prover (see Chapter 3). Hence we have chosen to abandon forward
substitution, even though it seems appealing due to its close association
with ordinary execution of programs, and to adopt the backward substitution
method, which does not introduce any quantifiers.
 
 2.3.2 Backward Substitution 
 
Without loss of generality, let the given loop invariant be the
exit assertion of the loop body S. Given the exit assertion $\psi_S$ of S,
we generate an entry assertion $\phi_S$ such that
$\{\phi_S \mid S \mid \psi_S\}$. It should be noted that several such
assertions $\phi_S$ exist, one of them being the trivial false. However, the
$\phi_S$ generated by backward substitution is such that, for any $\phi$, if
$\{\phi \mid S \mid \psi_S\}$ then $\phi$ logically implies $\phi_S$.

Now let $\psi_W$ be the exit assertion of W, and $\psi_S$ be the exit
assertion of the loop body S. We can symbolically "unwind" the execution
in the backward direction and obtain the entry assertion of S as
$\phi_S$ = Sb($\psi_S$). Then the entry assertion of W is

    $\phi_W$ = $\phi_S$ and B  or  $\psi_W$ and not B                 (2.8)

Proving the lemmas

    $\psi_S$ and B  logically implies  Sb($\psi_S$)                   (2.9)

and

    $\psi_S$ and not B  logically implies  $\psi_W$                   (2.10)

guarantees that the entry assertion $\phi_S$ = Sb($\psi_S$) of S will be
true each time S is entered, and that the exit assertion $\psi_W$ of W
holds when the while-loop is exited.
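
To make lemmas (2.9) and (2.10) concrete, the following sketch checks them
by brute force for a tiny loop. The loop `while j < i do j <- j + 1`, its
invariant, its exit assertion, and the bounded state space are all
hypothetical choices made for this illustration; the verifier described
here proves such lemmas with a theorem prover, not by enumeration.

```python
from itertools import product

def implies(p, q):
    """Material implication on Python booleans."""
    return (not p) or q

# Hypothetical loop  W: while j < i do  j <- j + 1  endwhile
B      = lambda i, j: j < i            # loop condition
inv    = lambda i, j: j <= i           # psi_S: invariant = exit assertion of body
psi_W  = lambda i, j: j == i           # exit assertion of the whole loop
Sb_inv = lambda i, j: inv(i, j + 1)    # Sb(psi_S) = subst j+1 for j in psi_S

states = list(product(range(-5, 6), repeat=2))   # small, bounded state space

# (2.9): invariant and B  implies  Sb(invariant)
lemma_2_9 = all(implies(inv(i, j) and B(i, j), Sb_inv(i, j)) for i, j in states)

# (2.10): invariant and not B  implies  exit assertion of W
lemma_2_10 = all(implies(inv(i, j) and not B(i, j), psi_W(i, j)) for i, j in states)

# A too-weak invariant (always true) passes (2.9) but fails (2.10).
weak_fails = not all(implies(True and not B(i, j), psi_W(i, j)) for i, j in states)
```

The last line shows why the choice of invariant matters: an invariant that
says nothing is trivially preserved by the body but cannot establish the
exit assertion.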
 
The assertion Sb($\psi_S$) is obtained by backward substitution as
follows: Let S be a concatenation of S1 and S2, S = S1; S2, where S2 may
be empty. We consider two cases.

S1 is not a call statement

We recursively apply rules B1 and B2 to obtain Sb($\psi_S$).

Rule B1 (applicable iff S1 is either a ptr-assignment, exchange, or insert
statement)

    B1.1: Let S1 be u ← t where t is a ptr expression.
          Then $\phi_S$ is

          Sb($\psi_S$) = subst t for u in S2b($\psi_S$)                (2.11)

    B1.2: Let S1 be exchange $x_a$ with $x_b$ where a, b are ptr
          expressions. Then,

          Sb($\psi_S$) = exchb $x_a$ with $x_b$ in S2b($\psi_S$)       (2.12)

    B1.3: Let S1 be insert $x_a$ below $x_b$ where a, b are ptr
          expressions. Then,

          Sb($\psi_S$) = nsrtb $x_a$ below $x_b$ in S2b($\psi_S$)      (2.13)

We postpone (to Chapter 3) an accurate description of these inverse
functions exchb ... and nsrtb ..., as their evaluation plays an important
role in our theorem proving. Intuitively, these functions reproduce the
situation(s) which must have existed prior to the exchange or insert.
 
Rule B2 (applicable iff S1 is an if-statement)

    Let S1 = if B1 then
                 S3
             else
                 S4
             endif

    Then

    Sb($\psi_S$) = S3b(S2b($\psi_S$)) and B1
                   or S4b(S2b($\psi_S$)) and not B1
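
Rules B1.1, B1.2 and B2 can be mimicked semantically in a few lines of
Python. This is only a sketch under stated assumptions: assertions are
modelled as Python functions from a state (a dict holding the array `x`
and the ptr variables) to a boolean, statements as tuples, and the effect
of exchb is modelled by evaluating the assertion on the swapped array. The
thesis evaluates exchb symbolically (Chapter 3); the names `sb`, `run`,
`key`, and the statement encoding are hypothetical.

```python
from itertools import permutations

def sb(stmts, psi):
    """Return the entry assertion Sb(psi) of the statement list `stmts`."""
    if not stmts:
        return psi
    s1, post = stmts[0], sb(stmts[1:], psi)       # post plays the role of S2b(psi)
    if s1[0] == "assign":                         # Rule B1.1: subst t for u
        _, u, t = s1
        return lambda st: post({**st, u: t(st)})
    if s1[0] == "exchange":                       # Rule B1.2, modelled semantically
        _, a, b = s1
        def pre(st):
            x = list(st["x"])
            i, j = a(st) - 1, b(st) - 1           # 1-based ptrs to 0-based indices
            x[i], x[j] = x[j], x[i]
            return post({**st, "x": tuple(x)})
        return pre
    if s1[0] == "if":                             # Rule B2
        _, b1, s3, s4 = s1
        then_pre, else_pre = sb(s3, post), sb(s4, post)
        return lambda st: (then_pre(st) and b1(st)) or (else_pre(st) and not b1(st))
    raise ValueError(s1[0])

def run(stmts, st):
    """Execute the same statements forwards (used only as a cross-check)."""
    for s in stmts:
        if s[0] == "assign":
            st = {**st, s[1]: s[2](st)}
        elif s[0] == "exchange":
            x = list(st["x"])
            i, j = s[1](st) - 1, s[2](st) - 1
            x[i], x[j] = x[j], x[i]
            st = {**st, "x": tuple(x)}
        elif s[0] == "if":
            st = run(s[2] if s[1](st) else s[3], st)
    return st

key = lambda p: (lambda st: st["x"][p(st) - 1])   # x_p for a ptr expression p
j, j1 = (lambda st: st["j"]), (lambda st: st["j"] + 1)

# Hypothetical loop body: compare-exchange x_j and x_{j+1}, then j <- j + 1.
body = [
    ("if", lambda st: key(j)(st) > key(j1)(st), [("exchange", j, j1)], []),
    ("assign", "j", j1),
]
psi = lambda st: st["x"][st["j"] - 2] <= st["x"][st["j"] - 1]   # x_{j-1} <= x_j
phi = sb(body, psi)                                             # generated entry assertion

states = [{"x": p, "j": k} for p in permutations((1, 2, 3)) for k in (1, 2)]
agrees = all(phi(st) == psi(run(body, st)) for st in states)    # Sb matches execution
always = all(phi(st) for st in states)                          # body always establishes psi
```

The cross-check against `run` expresses the defining property of backward
substitution: the generated entry assertion holds in a state exactly when
the exit assertion holds after executing the segment from that state.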
 
S1 is a call statement

Let the call statement and called procedure be as shown below:

        * $\alpha$
        call Q(s,t) : (u,v)
        * $\beta$

        procedure Q(a,b) : (c,d)
        * $\phi_Q$
          [procedure body]
        * $\psi_Q$
        endproc

where a and b are the two distinct input variables of procedure Q
receiving values from the ptr expressions s and t respectively; the lists
given after ":" are the output parameters; $\alpha$ is the given entry
assertion to the call statement, and $\beta$ is the exit assertion of the
call, generated by looking at the statements below the call; $\phi_Q$,
$\psi_Q$ are the given entry and exit assertions for the procedure Q.

We should prove the following two lemmas:

    $\alpha$  logically implies  $\phi'_Q$                            (2.16)

    unmodified parts of $\alpha$ wrt (s,t) and $\psi'_Q$
        logically implies  $\beta$                                    (2.17)

where $\phi'_Q$, $\psi'_Q$, and unmodified parts of $\alpha$ are obtained
from $\phi_Q$, $\psi_Q$, and $\alpha$ as described below.
 
The entry assertion $\phi_Q$ should not have any ptr variables other
than a or b because these are not defined on entry. We substitute in
$\phi_Q$ the expressions s and t for a and b, resulting in $\phi'_Q$.

The exit assertion $\psi_Q$ should not have ptr variables other than
a and b or those contained in ptr expressions c and d. We substitute
in $\psi_Q$ the expressions s, t, u and v respectively for a, b, c and d to
obtain $\psi'_Q$. Note that the ptr variables u and v are substituted for
expressions c and d. Also note that if c or d contains either a or b then
the ptr variables u or v will be equal to an expression involving s or t.
The substitution of s and t for a and b is valid because the variables a
and b are not subject to assignment in procedure Q.
 
Recall that procedure Q can permute elements belonging to the
segment array(a,b). Thus, those predicates of $\alpha$ not including
segments which are strict subsegments of array(s,t) will still be true upon
exit from Q. Such predicates are collected together in unmodified parts of
$\alpha$ wrt (s,t). We postpone the description of this function to
Section 3.3.2.
 
When the program consists of more than one procedure, we must
prove the lemmas (2.16) and (2.17) for each procedure call, and further
prove that the called procedures meet their specifications.

To generate the entry assertion for the body of a given
procedure, we begin at the bottommost and innermost loop and successively
generate the entry assertions of loops as described above. If $\phi_P$ is
the generated entry assertion of the procedure body P, and $\phi$ is the
given entry assertion, we should prove, in addition to the lemmas generated
for each loop as in (2.9) and (2.10), the lemma

    $\phi$  logically implies  $\phi_P$                               (2.15)

Clearly, the proof of all these lemmas guarantees that

    $\{\phi \mid P \mid \psi\}$                                        (2.1)

An example of lemma generation appears in Figures 2.2 and 2.3.
 
     1  procedure sort (n)
     *    TRUE
     2    i ← n
     3    while i ≥ 2 do
     4      j ← 1
     5      while j ≤ i-1 do
     6        if xj > xj+1 then
     7          exchange xj with xj+1
     8        else
     9        endif
     *        1 ≤ J < I ≤ N & A(1;J) ≤ XJ+1 & A(1;I) ≤ S(I+1;N)
    10      endwhile
     *      1 < I ≤ N & A(1;I-1) ≤ S(I;N)
    11    endwhile
     *    S(1;N)
    12  endproc

The assertions 10* and 12* of this program are related to 7* and 8*
of Figure 2.1 as follows:

    10* = subst j-1 for j in 7*
    12* = subst i+1 for i in 8*

Figure 2.2 Bubble Sort Program of Figure 2.1 Rewritten Using while-
Statements Instead of scans
 
The verification conditions for the program in Figure 2.2 are:

- for loop at 5:

    10* and j < i  logically implies  stmts[6...10]b(10*)             (V1)
    10* and j ≥ i  logically implies  subst i-1 for i in 12*          (V2)

  where
    stmts[6...10]b(10*) ≡ the generated entry assertion of body
                          of loop 5
      = x_j ≤ x_{j+1} and 9*
        or x_j > x_{j+1} and exchb x_j with x_{j+1} in 9*
  where
    9* = subst j+1 for j in 10*

- for loop at 3:

    12* and i > 1  logically implies  stmts[4...12]b(12*)             (V3)
    12* and i ≤ 1  logically implies  sorted(1,n)                     (V4)

  where
    stmts[4...12]b(12*) ≡ subst 1 for j in
          (subst i-1 for i in 12*) and j ≥ i
       or stmts[6...10]b(10*) and j < i

- and for proc body:

    true  logically implies  subst n for i in                         (V5)
          sorted(1,n) and i ≤ 1
       or stmts[4...12]b(12*) and i > 1

Figure 2.3 The Five Verification Conditions Generated for Bubble Sort
 
 
 3. THEOREM PROVER 
 
 In this chapter, we describe a procedure for proving or disproving
 a theorem whose premise and conclusion are augmented well-formed
 formulas. Well-formed formulas (wffs) are the sentences of the assertion
 language (see Chapter 5). Augmented wffs involve the functions subst,
 exchb and nsrtb which map a pair consisting of a wff and programming language
 statement into a wff. The details of these mappings will be given in
 Section 3.3. The proving or disproving of a theorem is done in two phases:
 in the first phase, the augmented wffs are converted into wffs; in the second
 phase, the actual proof begins. We discuss the second phase first, as the
 evaluation of the above functions involves the concept of partitioning which
 is an important part of the second phase.
 
 The basic theorem prover is described in Section 3.1. A proof 
 that this basic theorem prover is a decision procedure for theorems stated as 
 wffs is given in Section 3.2. The theorem prover is then extended (in Sec- 
 tion 3.3) to include evaluation of the aforementioned functions. The chap- 
 ter concludes with Section 3.5 where the generation of counterexamples is 
 discussed. 
 
 The treatment will be informal. The level of rigor in the proofs 
 is comparable to that generally found when discussing combinatorial algo- 
 rithms. Several "remarks" are made soon after describing a procedure. These 
 are meta-lemmas describing the properties of the verification system. We 
 omit the proofs of these remarks as they are neither illuminating nor in- 
 teresting. 
 
 
 3.1 Basic Theorem Prover 
 
 A theorem prover attempts to prove that a given conclusion $\omega$ follows
 from a certain hypothesis or premise $\Omega$. If $\omega$ does not follow from $\Omega$,
 a general theorem prover may not always halt and say so. However, if $\Omega$ and
 $\omega$ are sentences of a properly chosen assertion language, it is possible, as
 we demonstrate in this chapter for our assertion language, to give a decision
 procedure for the question: Does $\Omega$ logically imply $\omega$? We construct a
 "most general model" for $\Omega$, and then determine if $\omega$ is "true" in this model.
 If $\omega$ is indeed satisfied by this model, then $\Omega$ logically implies $\omega$;
 otherwise, we will be able to produce a counterexample which gives specific
 values to the variables making $\omega$ false and $\Omega$ true. To make the discussion
 more precise, we will need the following definitions.
 
 3.1.1 Definition 
 
 Def 1 An interpretation of a wff is a mapping of the set of all ptrs,
 constants and the elements of the array into the set of integers.
 The ptr constants 0, 1, 2, . . ., the function symbols +, -, and the
 relation symbols $<, \le, >, \ge, =, \ne$ are given the usual meaning.
 The key function x maps the ptr expressions into key elements.

 We have taken the set of integers as the universe for the keys of the array
 only for the sake of simplicity in the ensuing discussion. However, any
 set of keys on which a linear ordering is defined will do for this universe.
 The domain of ptr values can similarly be enlarged.
 
 
 
 
 Def 2 (Truth of Predicates): 
 
 The ptr predicates are interpreted as relations on integers 
 in the conventional way. 
 
 The array predicate array (s,t) $\le$ array (u,v) is true if either of
 the arrays is empty (i.e., s > t or u > v), or if no element of
 array (s,t) is greater than any element of array (u,v). (Similar
 meanings are given to $<$, $\ge$, and $>$ relations between array segments.)
 
 The array predicate sorted (s,t) is true either if the array seg- 
 ment array (s,t) is empty, or if the elements of the segment are 
 arranged in nondecreasing order from the lower boundary s to the 
 upper boundary t of the segment. 
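
 The truth conditions of Def 2 can be checked on a concrete interpretation.
 The following is a minimal sketch, assuming the array is held in a Python
 list read with 1-based indices; the function names segment, seg_le and
 is_sorted are illustrative and do not come from the thesis.

```python
def segment(x, s, t):
    """Elements x_s..x_t of a 1-indexed array; empty when s > t."""
    return x[s - 1:t] if s <= t else []


def seg_le(x, s, t, u, v):
    """array(s,t) <= array(u,v): true if either segment is empty, or if
    no element of the first is greater than any element of the second."""
    left, right = segment(x, s, t), segment(x, u, v)
    return not left or not right or max(left) <= min(right)


def is_sorted(x, s, t):
    """sorted(s,t): true if the segment is empty or nondecreasing."""
    seg = segment(x, s, t)
    return all(a <= b for a, b in zip(seg, seg[1:]))


keys = [1, 2, 5, 3, 7]          # one interpretation of x_1 .. x_5
print(seg_le(keys, 1, 2, 3, 5), is_sorted(keys, 1, 4))
```

 An empty segment satisfies both predicates, which is what lets the prover
 treat boundary cases such as array (1,0) uniformly.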
 
 A disjunct is of the form $p_1$ and $p_2$ and . . . where each $p_i$ is a predicate.
 A wff is of the form $d_1$ or $d_2$ or . . . where each $d_i$ is a disjunct. The
 logical connectives and and or are interpreted in the conventional way.
 
 Def 3 An interpretation M is said to satisfy a wff $\phi$, notationally $\models_M \phi$,
 if $\phi$ is true in M. The interpretation M is then a model for $\phi$.

 Def 4 A wff $\Omega$ logically implies a wff $\omega$, notationally $\Omega \models \omega$, if $\omega$ is
 satisfied by every model of $\Omega$.

 Def 5 A wff $\phi_1$ is equivalent to $\phi_2$, notationally $\phi_1 \mathrel{|{=}{=}|} \phi_2$,
 if $\phi_1 \models \phi_2$ and $\phi_2 \models \phi_1$.
 
 3.1.2 Outline of Theorem Prover 
 
 The general strategy of the proof procedure is as follows: The
 wffs $\Omega$ and $\omega$ are in disjunctive normal form. If $\Omega = \Omega_1$ or $\Omega_2$ or . . .
 then the proof of $\Omega \models \omega$ is a collection of proofs of $\Omega_i \models \omega$.

     $\Omega \models \omega$
     $\Omega_1 \models \omega$ and $\Omega_2 \models \omega$ and . . . $\Omega_i \models \omega$ . . .

 Given a disjunct $\Omega_1$, and the conclusion $\omega$ to be made from it, we construct
 $\Omega_1^*$ from $\Omega_1$ using certain "inference rules." (For our purposes, an inference
 rule is a procedure which transforms a given wff $\phi$, according to certain
 criteria, into a wff $\phi'$ which is in a more convenient form than $\phi$.) The wff
 $\Omega_1^*$ represents all that can be "deduced" out of the facts given by $\Omega_1$, and
 is equivalent to $\Omega_1$. However, $\Omega_1^*$ is not necessarily a single disjunct.
 Thus for each disjunct $\Omega_{1i}^*$ of $\Omega_1^*$, we "normalize" $\Omega_{1i}^*$ and $\omega$ so that both wffs
 use not only the same set of ptrs but also use the same set of array
 segments. This may require "partitioning" the array segments they originally
 referred to into smaller segments. The wff $\omega$ is rewritten as $\omega^{\#i}$ using these
 smaller segments. Thus,

     $\Omega_1 \models \omega$
     $\Omega_1^* \models \omega$
     $\Omega_{11}^* \models \omega^{\#1}$ and $\Omega_{12}^* \models \omega^{\#2}$ and . . .

 As we shall see later $\omega^{\#i}$ is equivalent to $\omega$ in the context of $\Omega_{1i}^*$.
 The $\Omega_{1i}^*$'s further have the property that if $\Omega_{1i}^*$ is satisfiable then
 
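 $$\Omega_{11}^* \models p \quad\text{iff}\quad P \models p \tag{3.1}$$

 where P is a particular predicate of $\Omega_{11}^*$ that "corresponds" to the predicate
 p of $\omega^{\#1}$. Thus, if $\omega^{\#1}$ is a disjunct,

     $\Omega_{11}^* \models \omega^{\#1}$
     iff $\Omega_{11}^* \models p_{11}$ and $\Omega_{11}^* \models p_{12}$ . . .
     iff $P_{11} \models p_{11}$ and $P_{12} \models p_{12}$ . . .

 where $\omega^{\#1} = p_{11}$ and $p_{12}$ and . . .

 If $\omega^{\#1}$ is not a single disjunct, let it be $\omega_1^{\#1}$ or $\omega_2^{\#1}$ where $\omega_1^{\#1}$
 is a single disjunct. Clearly, if $\Omega_{11}^* \models \omega_1^{\#1}$ then $\Omega_{11}^* \models \omega^{\#1}$. Otherwise, let
 $p_{1i}$ be a predicate of $\omega_1^{\#1}$ which is not implied by $\Omega_{11}^*$: $\Omega_{11}^* \not\models p_{1i}$. We then
 consider two cases of the premise:

     $\Omega_{11}^* \models \omega^{\#1}$
     iff $\Omega_{11}^*$ and $p_{1i} \models \omega_1^{\#1}$ or $\omega_2^{\#1}$, and $\Omega_{11}^*$ and not $p_{1i} \models \omega_2^{\#1}$

 We now take the transitive closure of the new premises $\Omega_{11}^*$ and $p_{1i}$ to insure
 that property (3.1) mentioned above holds, and repeat the whole process for
 each of the new premises.

 An example of $\Omega$, $\omega$ and $\Omega^*$ and $\omega^\#$ is given in Figure 3.1.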
 
 3.1.3 Inference Rules

 We now describe a procedure for generating the aforementioned $\Omega^*$
 and $\omega^\#$ from $\Omega$ and $\omega$. We will assume that $\Omega$ and $\omega$ are disjuncts. The more
 
 $\Omega$ = $2 \le i + 1 \le n$ and sorted(1,i) and sorted(i+1,n)

 $\omega$ = array(1,i) $\le$ array(i+1,n) or $x_i > x_{i+1}$

 array(1,n) is partitioned into
     array(1,i-1); array(i,i); array(i+1,i+1); array(i+2,n)

 $\Omega^*$ = $2 \le i$ and $i + 2 \le n$ and $4 \le n$ and
     sorted(1,i-1) $\le x_i$ and $x_{i+1} \le$ sorted(i+2,n)

 $\omega^{\#1}$ = $\omega_1^{\#1}$ or $\omega_2^{\#1}$ where

     $\omega_1^{\#1}$ = array(1,i-1) $\le$ array(i+2,n) and $x_i \le$ array(i+2,n) and
                array(1,i-1) $\le x_{i+1}$ and $x_i \le x_{i+1}$

     $\omega_2^{\#1}$ = $x_i > x_{i+1}$

 The predicate $x_i \le x_{i+1}$ of $\omega_1^{\#1}$ is not implied by $\Omega^*$.

 The two new premises are:

     $\Omega^*$ and $x_i \le x_{i+1}$          $\Omega^*$ and $x_i > x_{i+1}$

 The transitive closure of $\Omega^*$ and $x_i \le x_{i+1}$ is:

     $2 \le i$ and $i + 2 \le n$ and $4 \le n$ and
     sorted(1,i-1) $\le x_i$ and $x_{i+1} \le$ sorted(i+2,n) and
     sorted(1,i-1) $\le x_{i+1}$ and $x_i \le$ sorted(i+2,n) and
     sorted(1,i-1) $\le$ sorted(i+2,n) and $x_i \le x_{i+1}$

 The predicate sorted(1,i-1) $\le$ sorted(i+2,n) of above "corresponds" to
 the predicate array(1,i-1) $\le$ array(i+2,n) of $\omega_1^{\#1}$.

 Figure 3.1 An Example of $\Omega$, $\omega$, $\Omega^*$ and $\omega^\#$
 
 general case will be dealt with later (see Algorithms 1, 2, 3 and 4 of
 Section 3.2). The procedure consists of several subprocedures (inference
 rules) each performing a distinct transformation on a subset of
 predicates of $\Omega$ and $\omega$. We will find the following descriptions of the
 effect of applying the inference rules useful. We are interested only
 in "sound" inference rules:
 
 Def 6 An inference rule r is sound if it yields $\phi'$ when applied on
 $\phi$, notationally $\phi \vdash_r \phi'$, such that $\phi \models \phi'$.

 Def 7 An inference rule r is information preserving iff
 ($\phi \vdash_r \phi'$ implies $\phi' \models \phi$).

 Intuitively, no information carried by $\phi$ has been lost in the process of
 applying an information preserving rule.

 Not all inference rules are information preserving. For example,
 our rule of local implication (Section 3.1.3) lets us conclude that
 $(u + k_2 \le v)$ from $(u + k_1 \le v)$ whenever $k_2 \le k_1$. Clearly, this rule is
 not information preserving.

 Def 8 An inference rule r is an enriching rule if it yields $\phi'$ when
 applied on $\phi$ such that

 1. r is information preserving, and

 2. For every $a R' b$ of $\phi'$, consider the predicates $a R b$ of $\phi$
 on the same variables a and b. Then $a R' b \models a R b$ but not
 necessarily $a R b \models a R' b$.
 
 
 Note that an enriching rule does not actually create new information, 
 but rather makes whatever information was present more readily usable. 
 All our inference rules, except the abovementioned rule of local implica- 
 tion, are enriching rules. 
 
 It will be convenient to describe the rules on a directed
 graph representation of the wffs. The representation of a wff is the
 collection of the representations of its disjuncts. Each disjunct $\phi$ is
 represented as a pair of graphs: a ptr graph representing the conjunction
 of ptr predicates in the wff $\phi$, and a key graph representing
 the array predicates. There is a coupling between these two graphs,
 namely, the boundaries of the array segments of the key graph are defined
 by the pointers.

 The construction of the ptr graph $\pi$ of a disjunct $\phi$ is described
 below. A partitioned array (key) graph will be constructed later.
 
 3.1.3.1 Graph Construction 
 
 The ptr graph $\pi$ will have a vertex for each ptr variable referred
 to in $\phi$. For each ptr predicate $(u + k_1 \le v)$ in $\phi$, we put a
 directed edge from u to v and label it "$k_1$." Note that $k_1$ is a signed
 integer, and the relation is always $\le$. The graph $\pi$ may have more than one
 edge from a vertex u to a vertex v. An example of a disjunct $\phi$ and its
 pointer graph appears in Figure 3.2. In constructing the ptr graph
 the following equality axiom is embedded: for any ptr u, $u + 0 \le u$.
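
 The construction just described can be sketched as follows. This is an
 illustration under stated assumptions, not the thesis's implementation:
 predicates are given as triples (u, k, v) meaning u + k <= v, constants
 are measured from a vertex named "0", and a strict predicate such as
 j < i is encoded as j + 1 <= i because ptrs are integers.

```python
from collections import defaultdict


def build_ptr_graph(predicates):
    """predicates: iterable of (u, k, v) triples, each meaning u + k <= v.
    Returns a multigraph as a dict: (u, v) -> list of integer labels."""
    edges = defaultdict(list)
    vertices = set()
    for u, k, v in predicates:
        edges[(u, v)].append(k)
        vertices.update((u, v))
    for u in vertices:
        edges[(u, u)].append(0)      # embedded axiom: u + 0 <= u
    return dict(edges)


# Ptr part of the disjunct of Figure 3.2: 1 <= j <= i <= n and j < i.
phi = [("0", 1, "j"), ("j", 0, "i"), ("i", 0, "n"), ("j", 1, "i")]
graph = build_ptr_graph(phi)
print(graph[("j", "i")])
```

 The pair (j, i) carries two labels here, which is exactly the situation
 the subsumption rule of the next section removes.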
 
 3.1.3.2 Subsumption for Ptrs 
 
 
 
 The graph, as constructed in the preceding section, may have 
 

 
 
 $\phi$ = $1 \le j \le i \le n$ and array (1,j-1) $\le x_j$ and
     array (1,i) $\le$ sorted (i+1,n) and $j < i$
 ($\phi$ is the premise of the verification condition V1 of
 Figure 2.3).

 The ptr graph $\pi$ of $\phi$ is

 [Graph drawing not recoverable from the scan; its edges represent the
 ptr predicates $1 \le j$, $j \le i$, $i \le n$ and $j < i$.]

 Figure 3.2 An Example of a Disjunct and Its ptr Graph
 
 more than one directed edge from a vertex u to a vertex v. The rule of
 subsumption replaces all edges from vertex u to v by a single edge. The
 label on this single edge is a label on one of the previous edges (see
 Figure 3.3).

 Let $\{k_1, k_2, \ldots, k_i\}$ be the set of labels (integer constants)
 on edges from u to v. Then we delete all these i edges, and replace them
 by a single edge with the label $k_{max} = \max \{k_1, k_2, \ldots, k_i\}$.

 Remark 1 : The rule of subsumption for ptrs is an enriching rule.
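
 As a small illustration of the rule, the sketch below keeps only the
 largest label on each vertex pair. The dictionary layout ((u, v) mapped
 to a list of labels) is an assumption of this example.

```python
def subsume(multigraph):
    """For each pair (u, v) keep only the largest label k_max, since
    u + k_max <= v implies u + k <= v for every k <= k_max."""
    return {pair: max(labels) for pair, labels in multigraph.items()}


edges = {("j", "i"): [0, 1], ("i", "n"): [0]}
print(subsume(edges))
```

 Nothing is lost by the replacement: every deleted predicate follows from
 the one that is kept, which is why the rule is enriching.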

 3.1.3.3 Transitivity Rule for Ptrs

 For any pair of edges $(u + k_1 \le v)$ and $(v + k_2 \le w)$ of the
 ptr graph, add a new edge $(u + k_1 + k_2 \le w)$.
 
 By applying this rule to a ptr graph as long as it yields new edges, we 
 construct the transitive closure of this ptr graph. 
 
 Remark 2 : The transitivity rule is an enriching rule. 
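
 One way to compute this closure, combined with subsumption, is a
 Floyd-Warshall style pass that keeps the largest label per pair. The
 sketch below is an assumption about how the rule could be realized, not
 the procedure used in the thesis; a positive label on a self-loop after
 the pass signals an unsatisfiable graph.

```python
def transitive_closure(vertices, edges):
    """edges: (u, v) -> k, meaning u + k <= v (one label per pair).
    Returns the closed edge map, keeping the largest label per pair."""
    closed = dict(edges)
    for w in vertices:                       # intermediate vertex
        for u in vertices:
            if (u, w) not in closed:
                continue
            for v in vertices:
                if (w, v) not in closed:
                    continue
                k = closed[(u, w)] + closed[(w, v)]
                if k > closed.get((u, v), float("-inf")):
                    closed[(u, v)] = k
    return closed


def has_positive_self_loop(vertices, closed):
    """True if some derived predicate v + k <= v has k > 0."""
    return any(closed.get((v, v), 0) > 0 for v in vertices)


verts = ["0", "j", "i", "n"]
good = transitive_closure(verts, {("0", "j"): 1, ("j", "i"): 1, ("i", "n"): 0})
bad = transitive_closure(
    verts, {("0", "j"): 1, ("j", "i"): 1, ("i", "n"): 0, ("n", "0"): -1})
print(good[("0", "n")], has_positive_self_loop(verts, bad))
```

 In the second graph the added predicate n - 1 <= 0 contradicts the
 derived 0 + 2 <= n, and the contradiction surfaces as a positive self-loop.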
 
 3.1.3.4 Partitioning of the Array 
 
 We now come to a simple, but powerful, idea of the theorem
 prover: express both premise and conclusion in terms of a common set of
 array segments. To do this, we partition the array (1,n) into nonoverlapping
 segments so that an array segment referred to in either of the
 wffs $\Omega$ or $\omega$ is the union of a few contiguous segments in the partition.
 Using these partitioned segments, equivalent wffs $\Omega^*$ and $\omega^\#$ are obtained.
 
 [Graph drawing not recoverable from the scan.]

 Figure 3.3 Ptr Graph of Figure 3.2 after Subsumption

 [Graph drawing not recoverable from the scan.]

 Figure 3.4 Transitive Closure $\pi^*$ of the Ptr Graph $\pi$ of
 Figure 3.3
 
 Consider an array segment array (s,t). It partitions the
 entire array into three segments: $x_{-\infty} \ldots x_{s-1}$; $x_s \ldots x_t$; $x_{t+1} \ldots x_{+\infty}$,
 some of which may be empty. If we overlay one partition on another,
 we obtain their product. The array partition needed to express $\Omega$ and $\omega$ in
 terms of a common set of segments is obtained as the product of all
 individual partitions defined by the array segments occurring in either $\Omega$
 or $\omega$. Each linear ordering of the boundaries used in $\Omega$ or $\omega$ defines a
 partition of the array.

 The partitioning procedure collects the relevant boundaries,
 and produces all linear orderings of these boundaries in the context of
 the partial ordering specified by the ptr graph $\pi$ of $\Omega$.
 
 The set B of boundaries is constructed as follows:
     Initially, B ← {$-\infty$, $+\infty$}
     for each array segment array (s,t) referred to
             either in $\Omega$ or in $\omega$ do
         B ← B ∪ {s-1, s, t, t+1}
     endfor
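
 The loop above can be sketched in Python. The representation is an
 assumption of this example: a symbolic boundary is a pair (ptr, offset),
 with ptr set to None for a pure constant, and the two infinite
 boundaries are the strings "-inf" and "+inf".

```python
NEG_INF, POS_INF = "-inf", "+inf"


def shift(bound, delta):
    """bound = (ptr_or_None, offset); returns the boundary bound + delta."""
    base, offset = bound
    return (base, offset + delta)


def collect_boundaries(segments):
    """segments: iterable of (s, t) pairs of symbolic boundaries."""
    B = {NEG_INF, POS_INF}
    for s, t in segments:
        B |= {shift(s, -1), s, t, shift(t, +1)}
    return B


# Segments occurring in Figure 3.5: array(1,i-1), array(i,n),
# array(1,j-1), array(j,j), array(1,i) and array(i+1,n).
one = (None, 1)
segs = [(one, ("i", -1)), (("i", 0), ("n", 0)), (one, ("j", -1)),
        (("j", 0), ("j", 0)), (one, ("i", 0)), (("i", 1), ("n", 0))]
B = collect_boundaries(segs)
print(len(B))
```

 For these segments the set has twelve members, matching the boundary set
 listed in Figure 3.5.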
 Consider a maximal chain of the following kind, in the context of the
 given ptr graph $\pi$ of $\Omega$:

     $C : -\infty = b_0 \; b_1 \; b_2 \; \ldots \; b_{2q} \; b_{2q+1} = +\infty$

 where for $1 \le i \le q$

     $b_{2i}$ and $b_{2i+1}$ are boundaries in B, and

     $b_{2i} = 1 + b_{2i-1}$ and

     $b_{2i} \le b_{2i+1}$

 as implied by the ptr graph $\pi$ of $\Omega$.
 
 If every boundary b of B either appears in C, or is equal to a boundary
 appearing in C, then a composition (product) of the partitions is readily
 obtained:

     $b_0$ to $b_1$; $b_2$ to $b_3$; . . .; $b_{2q}$ to $b_{2q+1}$.
 
 However, if for some boundary pair b and 1 + b of B at least one of them
 is not in C, then we resort to case analysis. We find the largest j such
 that

     $b_j \le b$

 Clearly, it is not known if $b_{j+1} \le b$ or $b_{j+1} > b$. For otherwise, we
 either have a larger j, or a longer chain. The wff $\Omega$ is equivalent to
 the disjunction of $\Omega_1$, $\Omega_2$ where

     $\Omega_1 \equiv \Omega$ and $(b \le b_{j+1})$
     $\Omega_2 \equiv \Omega$ and $(b > b_{j+1})$

 Proving $\Omega \models \omega$ is equivalent to proving

     $\Omega_1 \models \omega$ and $\Omega_2 \models \omega$,

 and in each of $\Omega_1$ and $\Omega_2$ we can produce longer chains of boundaries
 than in $\Omega$. We apply the same procedure of obtaining maximal chains of
 boundaries on each of $\Omega_1 \models \omega$ and $\Omega_2 \models \omega$. Clearly, this process will
 terminate. Let $\Omega_1, \Omega_2, \Omega_3, \ldots, \Omega_c$ be the decompositions thus
 obtained; in each of $\Omega_i$, the boundaries can all be put into one chain (see
 Figure 3.5). The following lemma immediately follows:
 
 Lemma 1 Let $\Omega' = \Omega_1$ or $\Omega_2$ or . . . or $\Omega_c$ where $\Omega_i$'s are the result
 of decompositions of $\Omega$ made while ordering the boundaries. Then

     $\Omega \mathrel{|{=}{=}|} \Omega'$

 Note that the disjuncts $\Omega_1, \Omega_2, \ldots, \Omega_c$ differ only in the ptr
 predicates; they all have the same set of array predicates. The ptr graph of
 $\Omega$ defines a partial ordering on the set of boundaries B. From this partial
 ordering, say L, we are obtaining all linear orders $L_1, L_2, \ldots, L_c$.
 Hence

     $L \mathrel{|{=}{=}|} L_1$ or $L_2$ or . . . or $L_c$.

 If $\Omega = S_\pi$ and $S_\sigma$ where $S_\pi$ and $S_\sigma$ are respectively the set of ptr, and
 array predicates of the disjunct $\Omega$, then

     $\Omega_i = L_i$ and $S_\sigma$.

 Lemma 2 For $i \ne j$, ($\Omega$ and $\Omega_i$ and $\Omega_j$) is unsatisfiable.
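
 The step from a partial ordering L to its linear orders can be
 illustrated by brute force. The sketch below is a simplification: it
 assumes strict precedence constraints only and ignores the ties between
 equal boundaries that the thesis also distinguishes (such as i = n in
 Figure 3.5), and it is practical only for a handful of boundaries.

```python
from itertools import permutations


def linear_orderings(items, precedes):
    """All total orders of items consistent with the pairs (a, b) in
    precedes, each meaning a comes before b."""
    result = []
    for perm in permutations(items):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in precedes):
            result.append(perm)
    return result


orders = linear_orderings(["a", "b", "c"], [("a", "b")])
print(orders)
```

 Distinct orders returned by the function disagree on the relative
 position of at least one pair, which mirrors the mutual exclusion stated
 in Lemma 2.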
 
 Construction of Key Graph

 We construct the key graph $\sigma$ of a disjunct $\phi$ in the context of
 a given partition of the array defined by a linear ordering L of boundaries.
 For each segment array (s,t), there are two vertices: minx (s,t)
 representing the minimum key, and maxx (s,t) representing the maximum key of
 
 Let $\omega \equiv 1 \le i - 1 \le n$ and array (1,i-1) $\le$ sorted (i,n)
 and $\Omega \equiv 1 \le j \le i \le n$ and $S_\sigma$ and $j \ge i$ where

 $S_\sigma$ = array (1,j-1) $\le$ array (j,j) and array (1,i) $\le$ sorted (i+1,n)

 (The verification condition V2 in Figure 2.3 is $\Omega \models \omega$.)

 The set of boundaries B = {$-\infty$, $+\infty$, 0, 1, i-1, i, n, n+1, j-1, j, j+1, i+1}

 Maximal chain C induced by the ptr graph $\pi = 1 \le j = i \le n$ of $\Omega$ is

     $-\infty$  0  1  i-1  i  n  n+1  $+\infty$

 There are 2 linear orderings of boundaries in B

 $\Omega \mathrel{|{=}{=}|} \Omega_1$ or $\Omega_2$, where

 $\Omega_1 = 1 \le j = i < n$ and $S_\sigma$

     $C_1$ : $-\infty$  0  1  i-1  i  i+1  n  n+1  $+\infty$

     partition : array ($-\infty$,0); array (1,i-1); array (i,i);
                 array (i+1,n); array (n+1,$+\infty$)

 $\Omega_2 = 1 \le j = i = n$ and $S_\sigma$

     $C_2$ : $-\infty$  0  1  i-1  i  i+1  $+\infty$

     partition : array ($-\infty$,0); array (1,i-1);
                 array (i,i); array (i+1,$+\infty$)

 Figure 3.5 An Example of Ordering Boundaries and Partitioning the Array
 
 the segment. The edges in the key graph are labeled using the subarray
 relationships induced by the array predicates $S_\sigma$ of the disjunct $\phi$. The
 following axioms are embedded in the construction of this key graph $\sigma$
 from the set $S_\sigma$ of unpartitioned array predicates.

 - If array (s,t) is nonempty, then minx (s,t) $\le$ maxx (s,t)

 - A subsegment of a sorted array segment is sorted

 - If array (s,t) and array (u,v) are subsegments of a sorted
 segment and if t < u then array (s,t) $\le$ array (u,v)

 - If s = t then array (s,t) $\le$ array (s,t)

 - If two array segments are related by R, then their respective
 subsegments are also R-related.

 An array segment array (s1,t1) is a subsegment of array (s,t) if
 $s \le s1 \le t1 \le t$. The sorted (s,t) predicate is represented as a pseudo-binary
 relation: array (s,t) sorted array (s,t). These axioms
 are used to put labeled edges between vertices of the key graph: for
 each array predicate (array (s,t) R array (u,v)), we put an edge from
 maxx (s,t) to minx (u,v) and label it with R (see Figure 3.6). We do
 not construct a key graph in the absence of a partition. Thus, when
 considering a key graph a "context" is always present.
 
 Remark 3 : $\sigma$ and L $\mathrel{|{=}{=}|}$ $S_\sigma$ and L, where L is the linear ordering of
 boundaries which defined the present partition.

 Remark 4 : The negation of the predicate sorted (s,t) is not representable.
 In Section 3.2.2, we prove (Theorem 4) that representing it
 will not be necessary.
 

 [Graph drawing not recoverable from the scan; its vertices include
 minx (1,j-1) and maxx (1,j-1).]

 Figure 3.6 Key graph for $S_\sigma$ in the context of the Partition $C_1$ of
 Figure 3.5
 
 3.1.3.5 Rule of Subsumption for Array Predicates

 The rule of subsumption replaces all edges from a vertex $v_1$ to
 $v_2$ by a single edge. The label on this single edge is a label on one of
 the previous edges.

 Let $\{k_1, k_2, \ldots, k_i\}$ be the set of labels (integer constants
 and sorted) on edges from $v_1$ to $v_2$. Then we delete all these
 i edges and replace them by a single edge with the label $k_{max}$ =
 $\max \{k_1, k_2, \ldots, k_i\}$. For the purpose of this rule we
 define the label sorted to be less than any other label.

 The rule is identical to the rule of subsumption for pointer predicates.
 Note that maxx (s,t) $\le$ minx (s,t) implies sorted (s,t) since all the
 elements of array (s,t) are then equal, while the converse is not true. For
 this reason, we defined the label sorted to be the smallest label (see
 Figure 3.7).

 Remark 5 : The rule of subsumption for array predicates is an enriching
 rule.
 
 [Graph drawings not recoverable from the scan.]

 (a) A Key Graph $\sigma$

 (b) Transitive Closure $\sigma^*$
 of Key Graph $\sigma$
 (self-loops not shown)

 Figure 3.7 A Key Graph and Its Transitive Closure
 
 3.1.3.6 Transitivity Rule for Array Segments 
 
 The transitive closure of key graph $\sigma$ is obtained in a way
 similar to that of ptr graph $\pi$.

     If $(v_1 + k_1 \le v_2)$ and $(v_2 + k_2 \le v_3)$
     then $(v_1 + k_1 + k_2 \le v_3)$

 Recall that the labels k can be either integer constants, or sorted.
 For the purpose of this rule, we define the label sorted to satisfy:

 1. sorted < any other label

 2. sorted + any label = sorted = any label + sorted
 
 Remark 6 : The transitivity rule for array segments is an enriching rule. 
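
 The two conventions for the label sorted amount to a small label
 algebra. The sketch below encodes them, assuming integer labels and a
 string constant SORTED standing for the label sorted; the function names
 are illustrative.

```python
SORTED = "sorted"


def label_max(a, b):
    """Subsumption: sorted is smaller than every other label."""
    if a == SORTED:
        return b
    if b == SORTED:
        return a
    return max(a, b)


def label_add(a, b):
    """Transitivity: sorted + any label = sorted = any label + sorted."""
    if a == SORTED or b == SORTED:
        return SORTED
    return a + b


print(label_max(SORTED, 0), label_add(SORTED, 3), label_add(1, 2))
```

 With these two operations in place of max and +, the closure procedure
 for ptr graphs can be reused on key graphs unchanged.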
 
 3.1.3.7 Rule of Local Implication 
 
 This rule lets us conclude a global property like $\phi^* \models p$, where
 p is a ptr or key predicate, and $\phi^*$ is in a certain form, from a local
 property that $P \models p$, the P being a particular predicate of $\phi^*$.

 Def 11 A disjunct $\phi^* = \pi^*$ and $\sigma^*$ is an enriched disjunct with respect
 to a set B of boundaries if

 1. The ptr graph $\pi^*$ of $\phi^*$ defines exactly one linear ordering
 L of the boundaries of B

 2. The array predicates of $\sigma^*$ have been expressed using the
 segments defined by the partition induced by the linear
 ordering L of the boundaries

 3. Both $\pi^*$ and $\sigma^*$ are transitively closed

 Let $\phi_1^* = \pi_1^*$ and $\sigma_1^*$ be an enriched disjunct. Let $\phi_2 = \pi_2$ and
 $\sigma_2$ be a disjunct such that the vertex-set of $\pi_2$ is the same as that of
 $\pi_1^*$ and the vertex-set of $\sigma_2$ is the same as that of $\sigma_1^*$.

 Def 12 For each predicate $(u \; R_2 \; v)$ of $\phi_2$, the corresponding predicate
 in $\phi_1^*$ is defined as follows:

 1. If there is an edge from u to v labeled with, say, $R_1$ in
 $\sigma_1^*$, the corresponding predicate is $(u \; R_1 \; v)$.
 
 $\Omega_1^* = \pi^*$ and $\sigma^*$, where $\sigma^*$ is the transitive closure of Figure 3.6

 [Graph drawings not recoverable from the scan; the vertices of $\sigma^*$
 include minx (i+1,n) and maxx (i+1,n).]

 The conclusion $\omega^{\#1}$ in the context of the partition defined by $\Omega_1$
 [formula not recoverable from the scan] immediately follows from $\Omega_1^*$
 by the rule of local implication.

 Figure 3.8 An Enriched Disjunct $\Omega_1^*$ of $\Omega$ of Figure 3.5
 
 2. If there is no edge from u to v in $\sigma_1^*$, then the corresponding
 predicate is (u null v), an empty predicate, which is
 defined to be true in all interpretations. (Intuitively,
 take null as "$-\infty \le$.")

 Consider a disjunct $\omega_1$ of the conclusion $\omega$ to be made from an enriched
 disjunct $\Omega^*$. Let $\omega_1^\#$ be the rewritten version of $\omega_1$ using the partitioned
 array segments. Since $\omega_1^\#$ is equivalent to $\omega_1$, in this context, it follows
 that

     $\Omega^* \models \omega_1$ iff $\Omega^* \models \omega_1^\#$
                  iff $\Omega^* \models p$ for every predicate p of $\omega_1^\#$.

 Because $\Omega^*$ is an enriched disjunct, we can make the following stronger
 statement

     $\Omega^* \models \omega_1^\#$ iff for any predicate p of $\omega_1^\#$, the corresponding
     predicate P of $\Omega^*$ is such that $P \models p$                    (3.2)

 We shall refer to (3.2) as the rule of local implication. A proof of
 the validity of this rule is given in Section 3.2.2.
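
 For ptr predicates, the local test $P \models p$ reduces to a comparison
 of labels: $(u + k_1 \le v)$ implies $(u + k_2 \le v)$ whenever $k_2 \le k_1$.
 The sketch below illustrates this for a closed ptr graph stored as a
 dictionary; the representation and the name locally_implied are
 assumptions of this example.

```python
def locally_implied(closed, u, k2, v):
    """Does a transitively closed ptr graph imply u + k2 <= v?
    closed: (u, v) -> k1, meaning u + k1 <= v. A missing edge plays the
    role of the null predicate and implies no such bound."""
    k1 = closed.get((u, v))
    return k1 is not None and k1 >= k2


closed = {("0", "j"): 1, ("j", "i"): 1, ("0", "i"): 2}
print(locally_implied(closed, "j", 0, "i"),
      locally_implied(closed, "j", 2, "i"))
```

 Only the single edge from u to v is inspected, which is what makes the
 rule local.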
 
 3.2 Basic Theorem Prover is a Decision Procedure 
 
 An informal, but complete, description of the basic theorem
 prover is contained in Algorithms 1, 2, 3 and 4 in the following pages.
 This section proves that the theorem prover is a decision procedure for
 $\Omega \models \omega$, where both $\Omega$ and $\omega$ are wffs. The theorem prover will be extended
 in Section 3.3 to prove $\Omega \models \omega$ where either or both of $\Omega$ and $\omega$ can be
 augmented wffs.
 
 That the basic theorem prover terminates follows immediately
 by considering the "length" of the conclusion $\omega$. In Algorithms 3 and 4,
 we delete either a whole disjunct, or a predicate of a disjunct from $\omega$
 in every iteration. In the present section, we shall be occupied with
 the proof that the basic theorem prover gives correct answers, that is,

     when the basic theorem prover terminates, the boolean
     variable istheorem is true iff it is indeed
     the case that the premise $\Omega$ logically implies the
     conclusion $\omega$.                                          (3.3)

 The core of the proof is that the rule of local implication is
 valid. The structure of the proof follows the structure of the algorithms
 closely. We show (1) that the satisfiability of a graph is decidable
 and that it is obtained as a by-product of transitive closure, and (2)
 that $\Omega^* \models p$ iff p follows from $\Omega^*$ by local implication.
 
 3.2.1 Model Construction 
 
 Given an enriched disjunct $\phi^* = \pi^*$ and $\sigma^*$, we want to construct
 a model for it by assigning values to ptrs and keys.

 Without loss of generality, we can assume that the vertex 0 is
 present in the ptr graph $\pi^*$, for, if it is not, introduce it by anding
 $\pi^*$ with $0 \le 0$. This vertex is then assigned the value zero. A model for
 $\pi^*$ is constructed first, and then a model for $\sigma^*$ is similarly constructed.
 
 procedure basic theorem prover ($\Omega$ : wff $\models$ $\omega$ : wff)
     istheorem ← true
     if $\omega$ is empty then $\omega$ ← false endif
     for each disjunct $\Omega_1$ of $\Omega$ and while istheorem do
         provetheorem ($\Omega_1 \models \omega$)
     end for-while
 endproc

 Algorithm 1
 
 procedure provetheorem ($\Omega_1$ : disjunct $\models$ $\omega$ : wff)
     construct the ptr graph $\pi$ of $\Omega_1$; apply subsumption.
     if $\pi$ is satisfiable
     then {see Algorithm 5 of Section 3.3.1}
         collect the boundaries referred to in $\Omega_1$ and $\omega$ into
             a set B.
         for each linear ordering L, according to $\pi$, of
                 boundaries of B
             and while istheorem do
             construct the key graph $\sigma$ of $\Omega_1$ for this partition
                 defined by L
             apply subsumption on $\sigma$
             istheorem ← provelemma ($\pi$ and L and $\sigma$ $\models$ $\omega$)
         end for-while
     endif
 endproc

 Algorithm 2
 
 function provelemma ($\Omega_{11}$ : disjunct $\models$ reference $\omega$ : wff)
         returns value of local islemma : boolean
     $\pi^*$ ← transitive closure of ptr graph of $\Omega_{11}$
     $\sigma^*$ ← transitive closure of key graph of $\Omega_{11}$
     islemma ← ($\pi^*$ and $\sigma^*$ is unsatisfiable)
     if not ($\omega$ is empty or islemma) then
         choose a disjunct $\omega_1$ of $\omega$; delete $\omega_1$ from $\omega$
         $\omega_1^\#$ ← equivalent of $\omega_1$ expressed using the partition
                 defined by $\Omega_{11}$
         islemma ← does ($\pi^*$ and $\sigma^*$) imply ($\omega_1^\#$ or $\omega$)?
     endif
 endfunction

 Algorithm 3
 
 function does ($\Omega_{11}^*$ : enriched disjunct) imply ($\omega_1^\#$ : partitioned
             disjunct or reference $\omega$ : wff)?
         returns value of local implies : boolean
     repeat
         choose a predicate p of $\omega_1^\#$; delete p from $\omega_1^\#$
     until $\omega_1^\#$ is empty or $\Omega_{11}^* \not\models p$
     if $\Omega_{11}^* \models p$ {local implication}
     then implies ← true
     else implies ← provelemma ($\Omega_{11}^*$ and p $\models$ $\omega_1^\#$ or $\omega$) and
                    provelemma ($\Omega_{11}^*$ and not p $\models$ $\omega$)
          {see Theorem 4 of Section 3.2.2}
     endif
 endfunction

 Algorithm 4
 
 Model for $\pi^*$

 The model for $\pi^*$ is constructed iteratively. Let

     assigned = subset of vertices of $\pi^*$ to which values are
         already assigned, such that for $u_1, u_2 \in$ assigned,
         $(u_1 + k_1 \le u_2)$ of $\pi^*$ is satisfied, i.e., valueof
         $(u_1) + k_1 \le$ valueof $(u_2)$.

 This model is then extended by choosing an arbitrary vertex v of $\pi^*$,
 which is not in assigned. If no such vertex exists, then the construction
 of a model for $\pi^*$ is finished, and $\pi^*$ is satisfiable. Consider the
 following set:

     S = predicates of $\pi^*$ to be satisfied by v
       = $\{(u_i + k_i \le v) \mid u_i \in$ assigned$\}$
         $\cup \; \{(v + k_j \le u_j) \mid u_j \in$ assigned$\}$

 The value to be assigned to v should be such that all the predicates in
 S are indeed satisfied, and hence

     assigned ← assigned ∪ {v},

 provided the label on the self-loop at v is zero.

 Let $V_{max}$ = max of valI and $V_{min}$ = min of valJ, where

     valI = $\{$valueof $(u_i) + k_i \mid (u_i + k_i \le v) \in S\} \cup \{-\infty\}$

     valJ = $\{$valueof $(u_j) - k_j \mid (v + k_j \le u_j) \in S\} \cup \{+\infty\}$

 If ptr v is assigned a value V such that

     $V_{max} \le V \le V_{min}$                                        (3.4)

 then all the predicates in S are satisfied. However, if the self-loop
 at v, $(v + k \le v)$, is such that the label k > 0, this predicate is not
 satisfiable in any interpretation, and any model construction cannot
 proceed further.

 Now, assuming that the self-loop at v is labeled with zero,
 we show that a value V satisfying the inequality (3.4) must exist. Let
 $V_{max}$ be valueof $(u_1) + k_1$ (without loss of generality on the subscripts
 i of $u_i$), and $V_{min}$ be valueof $(u_2) - k_2$. Thus

     $(u_1 + k_1 \le v) \in S$ and $(v + k_2 \le u_2) \in S$.

 Since $\pi^*$ is transitively closed, $(u_1 + k \le u_2) \in \pi^*$ for some k greater
 than or equal to $k_1 + k_2$, and since $u_1$ and $u_2$ are in assigned, this
 predicate is satisfied by our hypothesis on the set assigned:

     valueof $(u_1) + k \le$ valueof $(u_2)$

 and, therefore

     $V_{max} \le V_{min}$.

 The vertex v can, therefore, be assigned any finite value in the range
 $[V_{max}, V_{min}]$.
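
 The iterative construction can be sketched as follows. This is an
 illustration, not the thesis's code: it assumes the graph is given as a
 dictionary (u, v) -> k meaning u + k <= v, that it is already
 transitively closed with no positive self-loop, and it picks $V_{max}$
 whenever that bound is finite.

```python
import math


def build_model(vertices, closed, fixed=None):
    """Assign an integer to every vertex so that each predicate
    u + k <= v of the closed graph is satisfied."""
    value = dict(fixed or {})
    for v in vertices:
        if v in value:
            continue
        # Lower bounds from predicates u + k <= v with u already assigned.
        v_max = max([value[u] + k for (u, w), k in closed.items()
                     if w == v and u != v and u in value], default=-math.inf)
        # Upper bounds from predicates v + k <= w with w already assigned.
        v_min = min([value[w] - k for (u, w), k in closed.items()
                     if u == v and w != v and w in value], default=math.inf)
        assert v_max <= v_min, "graph is not closed or is unsatisfiable"
        if v_max > -math.inf:
            value[v] = v_max
        elif v_min < math.inf:
            value[v] = v_min
        else:
            value[v] = 0                 # unconstrained: any finite value
    return value


verts = ["0", "j", "i", "n"]
closed = {("0", "j"): 1, ("j", "i"): 1, ("i", "n"): 0,
          ("0", "i"): 2, ("j", "n"): 1, ("0", "n"): 2}
model = build_model(verts, closed, fixed={"0": 0})
print(model)
```

 Fixing the vertex 0 at zero follows the convention of Section 3.2.1; the
 remaining choices within $[V_{max}, V_{min}]$ are arbitrary.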
 
 Model for $\sigma^*$

 The boundaries of array segments are defined by ptrs and hence
 a model for $\sigma^*$ can be constructed only after a model for the ptrs is given.
 If array (s,t) is a segment used in $\sigma^*$, we have, in general,

     minx (s,t) $\le$ maxx (s,t)

 If s = t then this becomes an equality. We remark that the labels on
 self-loops of $\sigma^*$ cannot be negative. A positively labeled self-loop is
 clearly unsatisfiable. Thus, these labels can only be either sorted
 or zero. For each pair of vertices minx (s,t), maxx (s,t) of $\sigma^*$, we will
 assign a single value. This assignment clearly satisfies all self-loops,
 and sorted predicates. Once this decision is made, the model construction
 for $\sigma^*$ is identical to that for $\pi^*$.
 
3.2.2 Unsatisfiability and Local Implication Theorem
 
Theorem 1 (Unsatisfiability Theorem)

The ptr graph π is unsatisfiable iff the transitive closure
π* of π has a self-loop whose edge label is positive.

Proof We know from Remarks 1 and 2 that

    π |= =| π*

Thus, π is unsatisfiable iff π* is.

(⇐) If π* has a self-loop at v with a positive label k, clearly
(v + k ≤ v) is unsatisfiable in any interpretation.
 
 
(⇒) It is well known that the transitive closure G* of a graph G
will have a self-loop at a vertex v iff G has a directed cycle
(not necessarily of length 1) passing through v. The rule of
transitivity is such that the label on the self-loop at v in π*
is not less than the sum of labels of edges in any directed
cycle of π passing through v. Thus, if π has no directed
cycle with positive edge-label sum passing through v, then π*
does not have a self-loop at v with a positive label. For such
a π*, we can indeed construct a model (see Section 3.2.1), and
hence π is satisfiable. ∎
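
The unsatisfiability test of Theorem 1 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes predicates are given as triples (u, k, w) meaning u + k ≤ w, and it uses a Floyd-Warshall style pass that keeps the largest label per edge (a technique chosen for this sketch). The function names are hypothetical. When a positive cycle exists the true closure has unbounded labels, so the single pass is used only to detect a positive self-loop.

```python
import math

def transitive_closure(vertices, preds):
    """Largest known label k for every ordered pair (u, w) with u + k <= w."""
    label = {(u, w): -math.inf for u in vertices for w in vertices}
    for u, k, w in preds:
        label[u, w] = max(label[u, w], k)
    # Rule of transitivity: u + k1 <= m and m + k2 <= w give u + (k1 + k2) <= w.
    for m in vertices:
        for u in vertices:
            for w in vertices:
                via = label[u, m] + label[m, w]
                if via > label[u, w]:
                    label[u, w] = via
    return label

def is_unsatisfiable(vertices, preds):
    """True iff the closure has a self-loop with a positive label."""
    label = transitive_closure(vertices, preds)
    return any(label[v, v] > 0 for v in vertices)
```

For instance, the pair of predicates i + 1 ≤ j and j ≤ i forms a cycle whose labels sum to 1, so the sketch reports it as unsatisfiable.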
 
Corollary to Theorem 1 (Unsatisfiability of Key Graphs)

The key graph σ, in the context of an enriched ptr disjunct,
is unsatisfiable iff the transitive closure σ* of σ has a self-
loop whose edge label is positive.
 
Theorem 2 (Local Implication Theorem)

Let π* be a transitively closed ptr graph. Then π* |=
(u + k_2 ≤ v) iff the corresponding predicate (u + k_1 ≤ v)
in π* is such that k_1 ≥ k_2.

Proof

(⇐) Obvious.

(⇒) We prove that if k_1 < k_2 then π* does not imply
(u + k_2 ≤ v). Assign u an arbitrary value, and then assign v a
value equal to valueof (u) + k_1. Set assigned ← {u,v}. Now, we
can complete the construction of the model as in Section 3.2.1.
Clearly, (u + k_2 ≤ v) is false in this model. ∎
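
Local implication reduces to a single lookup once the graph is transitively closed. The sketch below assumes, purely for illustration, that the closed graph is a dictionary from vertex pairs (u, v) to the label k_1 of the predicate u + k_1 ≤ v; the name locally_implies is hypothetical.

```python
def locally_implies(closure, u, k2, v):
    """True iff the closed graph contains (u + k1 <= v) with k1 >= k2."""
    k1 = closure.get((u, v))
    return k1 is not None and k1 >= k2

# A closed graph holding the single predicate i + 2 <= j.
closure = {("i", "j"): 2}
print(locally_implies(closure, "i", 1, "j"))   # weaker conclusion i + 1 <= j
print(locally_implies(closure, "i", 3, "j"))   # stronger conclusion i + 3 <= j
```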
 
Corollary to Theorem 2

Theorem 2 holds for a transitively closed key graph σ*, and
array predicate (u + k_2 ≤ v).
 
Theorem 3

Let Ω* be an enriched disjunct, and ω# a disjunct partitioned
with respect to the linear ordering of boundaries defined by
Ω*. Then

    Ω* |= ω#

iff

    either, for every predicate p of ω#, Ω* locally implies p, or
    Ω* is unsatisfiable.

Proof by Theorem 1 and repeated application of Theorem 2. ∎
 
When an enriched disjunct π* and σ* does not imply a predicate
p of ω_1, we consider two cases (refer to Algorithm 4, if-statement):

    π* and σ* and not p |= ω_2                                 (3.5)
    π* and σ* and p |= (ω_1 or ω_2)                            (3.6)

If p is sorted (s,t), not p cannot be represented in our scheme (Section
3.1.3.4). Hence a proof/disproof of (3.5) cannot be obtained in this
deductive system. The following theorem avoids this problem by showing
that if p is a sorted-predicate, then the proof of (3.5) is equivalent
to the proof of (π* and s < t) and σ* and minx (s,t) < maxx (s,t) |= ω_2
when σ* does not imply the sorted-predicate p.
 
Theorem 4

Let φ* be an enriched disjunct and array (s,t) be a segment
in the partition defined by φ*. Further assume that φ* does not
imply sorted (s,t). Then

    φ* and not sorted (s,t) |= ψ#   iff   φ' |= ψ#

where ψ# is a partitioned wff in the context of φ*, and

    φ' = φ* and s < t and minx (s,t) < maxx (s,t)

Proof

(⇐) If φ' |= ψ# then φ* and not sorted (s,t) |= ψ# is obvious.

(⇒) Suppose φ* and not sorted (s,t) |= ψ#. If φ* is unsatisfiable,
or if array (s,t) is empty, the theorem is trivially true. So
let φ* and not sorted (s,t) be satisfiable. Consider any model
M for φ'. If sorted (s,t) is false in this model, then ψ# is true
in M. Given a model, any permutation of elements of array (s,t)
conserves the minx (s,t) and maxx (s,t). Thus, if we permute
the elements of array (s,t) in model M, all predicates of ψ#,
with the possible exception of (array (s,t) R array (s,t))-type
predicates, must still be true. Since ψ# is a wff (in our
system) there are only three possibilities for (array (s,t) R
array (s,t)):

    1. array (s,t) < array (s,t)
    2. array (s,t) ≤ array (s,t)
    3. array (s,t) sorted array (s,t)

The first one is unsatisfiable in every interpretation. The
second one will still be true after permutation. The third
one was false in M, and if the permutation is an appropriate one
it may be true in the resulting model M'. But, if ψ# had this
sorted (s,t) predicate, φ* and not sorted (s,t), being a satis-
fiable disjunct, cannot imply ψ#. Thus, if φ* and not sorted
(s,t) is true in M then ψ# will be true in a model M' also, where
M' is the result of permuting elements of array (s,t). That is,
ψ# will be true in any model for φ'. ∎
 
 3.2.3 Basic Theorem Prover is Correct 
 
The structure of the proof of Ω |= ω, as constructed by this
theorem prover, is shown in Figure 3.9. Theorem 3 yields the proof at
the lowest level in Figure 3.9; the remaining proofs are proven by
appropriate recursive/iterative calls (indicated by dotted lines; see
Algorithms 1, 2, 3, and 4).
 
 We omit further details of the correctness proof of the basic 
 theorem prover. 
 

 
Let Ω = Ω_1 or Ω_2, where Ω_1 is a disjunct, and Ω_2 is a wff, possibly false

    Ω |= ω   ←   Ω_1 |= ω  and  Ω_2 |= ω

    Ω_1 |= ω:    for each linear ordering of boundaries of Ω_1 and ω
                 prove Ω_1i |= ω

    Ω_1i |= ω:   let ω = ω_1 or ω_2 (ω_2 may be empty)
                 Ω_1i |= p of ω_1 and Ω_1i |= ω_1 with p deleted
                 if false:
                     Ω_1i and not p |= ω_2
                     and Ω_1i and p |= ω_1 or ω_2

Figure 3.9 Structure of the Proof of Ω |= ω
 
 
 3.3 Evaluation of Backward Functions 
 
Recall that the conclusion ω of the lemmas Ω |= ω to be proven
was an augmented wff, possibly involving the functions subst, exchb and
nsrtb because of backward substitution (see Chapter 2). Similarly,
Ω was an augmented wff possibly involving the functions subst and
unmodifiedpartsof. To be able to use the basic theorem prover presented
in Section 3.1, we transform these augmented wffs by evaluating the
functions to produce simple wffs not involving any of these functions.
 
 Strictly speaking, the evaluation of functions like exchb , 
 nsrtb , etc., cannot be considered part of theorem proving. However, we 
 include it here because it plays an important role in our theorem proving, 
 and because this evaluation is done in the midst of the theorem-proving 
 effort. 
 
Given the lemma Ω |= ω to be proven, the subst functions, if any,
of Ω and ω are evaluated first. Let us call the resulting augmented
wffs Ω and ω. The premise Ω can be considered to be Ω_1 or Ω_2 where Ω_1 is
a disjunct, and Ω_2 is an augmented wff, possibly the wff false. We then
prove that Ω_1 |= ω, and that Ω_2 |= ω. In Section 3.3.2, we describe the
proof of Ω_1 |= ω. (This procedure is used repeatedly for each disjunct
of Ω.) The boundaries of Ω_1 and ω are collected into a set B. Each
linear ordering of boundaries defines a partition of the array. The
boundaries collected are such that this partition has single element array
segments array (s,s) and array (t,t) for each exchange x_s with x_t
statement (similarly for insert statements). It is then a simple matter
to evaluate the exchb and nsrtb functions. The resulting ω is a simple wff.
 

 
We now describe these two passes of evaluation in greater
detail.
 
 
 3.3.1 First Pass of Evaluation 
 
The evaluation of the subst functions is the simplest, and
constitutes the first pass of our evaluation. Clearly, before

    subst t for u in ψ

can be evaluated, all subst functions of the augmented wff ψ must be
evaluated. Assuming that ψ is free of subst functions, the ptr expression
t is substituted for every occurrence of the ptr variable u in
the augmented wff ψ, which may have only exchb and nsrtb functions.

Remark 7: Let S be a u ← t statement. Then, the entry assertion φ_E =
subst t for u in ψ_S generated for an exit assertion ψ_S is such that

    if φ_E is true in I then ψ_S is true in I', and
    if φ_E is false in I then ψ_S is false in I'

where I' is the result of execution of S on an interpretation I.

The boundaries referred to in the conclusion ω and current
disjunct of the premise Ω_1 are collected by Algorithm 5. As can be easily
seen, the boundaries included in B are such that the partition produced
is guaranteed to contain appropriate segments needed in the evaluation
of exchb, nsrtb and unmodifiedpartsof in the second pass.
 
 
B ← {-∞, +∞}

for each array segment array (s,t) referred to either in
        Ω_1 or in ω do
    B ← B ∪ {s-1, s, t, t+1}
endfor
for each exchb x_s with x_t in ψ occurring in ω do
    B ← B ∪ {s-1, s, s+1, t-1, t, t+1}
endfor
for each nsrtb x_s below x_t in ψ occurring in ω do
    B ← B ∪ {s-1, s, s+1, t-2, t-1, t, t+1}
endfor
for each unmodifiedpartsof σ wrt (s,t) occurring in Ω_1 do
    B ← B ∪ {s-1, s, t, t+1}
endfor

Algorithm 5: Collecting Boundaries
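
Algorithm 5 translates almost line for line into Python. The sketch below makes some assumptions for illustration only: a ptr expression is modelled as a pair (name, offset), so that ("i", -1) stands for i - 1, and the caller passes the (s, t) pairs already extracted from the premise and the conclusion. The names shift and collect_boundaries are hypothetical.

```python
NEG_INF = ("-inf", 0)
POS_INF = ("+inf", 0)

def shift(expr, d):
    """Return the ptr expression expr + d."""
    name, offset = expr
    return (name, offset + d)

def collect_boundaries(segments=(), exchanges=(), inserts=(), unmodified=()):
    """Collect the boundary set B; every argument is an iterable of (s, t) pairs."""
    B = {NEG_INF, POS_INF}
    for s, t in segments:       # array (s,t) in the premise or the conclusion
        B |= {shift(s, -1), s, t, shift(t, 1)}
    for s, t in exchanges:      # exchb x_s with x_t
        B |= {shift(s, d) for d in (-1, 0, 1)}
        B |= {shift(t, d) for d in (-1, 0, 1)}
    for s, t in inserts:        # nsrtb x_s below x_t
        B |= {shift(s, d) for d in (-1, 0, 1)}
        B |= {shift(t, d) for d in (-2, -1, 0, 1)}
    for s, t in unmodified:     # unmodifiedpartsof ... wrt (s,t)
        B |= {shift(s, -1), s, t, shift(t, 1)}
    return B
```

Because B is a set, boundaries contributed by several constructs are stored once, which keeps the number of linear orderings to consider as small as the collected expressions allow.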
 
 3.3.2 Second Pass of the Evaluation 
 
 The second pass is made for each linear ordering L of the 
 boundaries collected as above. None of the functions exchb , nsrtb and 
 unmodifiedpartsof changes the ptr expressions. While the first pass 
 has an effect only on ptr expressions, the second pass has its effect 
 only on the array segments, which depend on the context L. Again, the 
 evaluation of exchb and nsrtb is from inside out. 
 

 
 exchb 
 
Assuming that ψ is a wff, that is, that ψ is free of exchb and nsrtb
functions,

    exchb x_s with x_t in ψ

is evaluated in the context of the partition defined by the current
linear ordering of boundaries. The wff ψ is expressed as ψ# using the
partitioned array segments. Note that the partition produced will have
single element array segments A = array (s,s), and B = array (t,t) (see
the second for-loop of Algorithm 5). The exchb is evaluated by substituting
B for A, and vice versa, in every array predicate of ψ#.
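
The substitution that evaluates exchb can be illustrated with a small Python sketch. The representation is an assumption made for this example: a segment is a pair (u, v) of concrete 1-based positions, an array predicate is a triple (A, relation, B), and only the relation "<=" (every element of A is at most every element of B) is modelled. The helper names are hypothetical.

```python
def exchb(predicates, s, t):
    """Swap the single-element segments (s,s) and (t,t) in every predicate."""
    swap = {(s, s): (t, t), (t, t): (s, s)}
    return [(swap.get(a, a), rel, swap.get(b, b)) for a, rel, b in predicates]

def holds(pred, x):
    """Evaluate a predicate on a 1-based array x (x[0] is unused)."""
    (u1, v1), rel, (u2, v2) = pred
    if rel != "<=":
        raise ValueError("only '<=' is modelled in this sketch")
    return all(p <= q for p in x[u1:v1 + 1] for q in x[u2:v2 + 1])

def exchange(x, s, t):
    """Forward execution of: exchange x_s with x_t."""
    y = list(x)
    y[s], y[t] = y[t], y[s]
    return y
```

With these helpers, a predicate holds after the exchange exactly when its exchb image holds before it, which is the property Remark 8 relies on.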
 
 nsrtb 
 
Again assuming that ψ is free of exchb and nsrtb functions,

    nsrtb x_s below x_t in ψ

is evaluated in the context of the present partition. The wff ψ is
expressed as ψ# using the partitioned array segments. Note that since
s-1, s, s+1, t-2, t-1, t and t+1 are included in the set of boundaries,
the partition produced will have single-element segments array (s,s),
array (t-1,t-1), and array (t,t). The boundaries are already ordered,
and we consider the two cases s < t, or s > t.

Suppose s < t. Then the following transformations are made
on the array segments A = array (u,v) of the predicates of ψ#:
 
 
1. If A is a subsegment of array (s,t-2) then A is redefined
as array (u+1,v+1).

2. If A is the same segment as array (t-1,t-1) then A has
the new definition: array (s,s).

3. The definition of A is unchanged otherwise.

Now suppose s > t. Then the following transformations are
made on the array segment A = array (u,v):

1. If A is a subsegment of array (t+1,s) then A is redefined
as array (u-1,v-1).

2. If A is the same segment as array (t,t) then A is rede-
fined as array (s,s).

3. Otherwise, the definition of A is unchanged.
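
The two case analyses above can be checked against a forward execution of the insert statement. The sketch below is an illustration under an assumption about the statement's semantics that is inferred from the transformation rules, namely that insert x_s below x_t moves the element at position s to the slot immediately below x_t and shifts the elements in between by one place. The function names are hypothetical, and positions are plain list indices.

```python
def insert_below(x, s, t):
    """Assumed forward semantics: move x[s] to the slot just below x[t]."""
    if s < t:
        return x[:s] + x[s + 1:t] + [x[s]] + x[t:]
    if s > t:
        return x[:t] + [x[s]] + x[t:s] + x[s + 1:]
    return list(x)

def nsrtb_segment(u, v, s, t):
    """Map a segment (u,v) of the exit assertion back to the entry state."""
    if s < t:
        if s <= u and v <= t - 2:
            return (u + 1, v + 1)
        if (u, v) == (t - 1, t - 1):
            return (s, s)
    elif s > t:
        if t + 1 <= u and v <= s:
            return (u - 1, v - 1)
        if (u, v) == (t, t):
            return (s, s)
    return (u, v)
```

For any segment of the partition, the elements found in the segment after the insert are those found in the mapped segment before it.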
 
Remark 8: Let S be either an exchange x_s with x_t or an insert x_s below
x_t statement, and let φ_E be the corresponding exchb x_s with x_t in ψ_S or
nsrtb x_s below x_t in ψ_S function, where ψ_S is the exit assertion of S.
Then ψ_S is true in M', which is the result of the execution of S on M,
iff φ_E is true in M, where M is a model for the context. More formally,
if L is the present linear ordering of boundaries and L is true in M, then

    if φ_E is true in M then ψ_S is true in M', and
    if φ_E is false in M then ψ_S is false in M'
 
Lemma 3 Let S be a straightline program segment, that is, S is a
sequence of ptr assignment, exchange or insert statements, with ψ_S as its
exit assertion, and let φ_E be the entry assertion of S obtained as above
in the context of L. Then

    if φ_E is true in M then ψ_S is true in M', and
    if φ_E is false in M then ψ_S is false in M'

where M is a model for the linear ordering L of boundaries and M' is
the result of execution of S on M.

Proof by repeated applications of Remarks 7 and 8. ∎
 
 
 
 unmodifiedpartsof 
 
The description of the evaluation of

    unmodifiedpartsof σ wrt (s,t)

is somewhat complicated because of the details needed. Intuitively,
since the procedure called can permute the elements of array (s,t), all
predicates of σ which depend on the strict subsegments of array (s,t)
are deleted from σ#. In addition, the predicate sorted (s,t), if present
in σ#, is deleted from σ#. The complication arises from the possibility
that the current linear ordering of boundaries partitions the
segment array (s,t) into smaller segments. In such a case, it will be
necessary to temporarily "join together" contiguous segments to see if
the entire segment array (s,t) is related to other segments of
array (-∞,s-1) or array (t+1,+∞).

Let array (s,t) = S_1; S_2; ...; S_p: that is, the S_i's constitute
the p subsegments of array (s,t) from the boundary s to t. Then
the evaluation is done as described in Algorithm 6.
 
φ ← the wff false
for each disjunct σ_1 of σ do
    π_1 ← ptr graph of σ_1
    for each linear ordering L induced by π_1 do
        φ_1 ← L
        let σ_1# be the disjunct σ_1 expressed using partitioned
            segments
        for each array predicate (A R B) of σ_1# do
            cases
                neither A nor B is an S_i:
                    φ_1 ← φ_1 and (A R B) provided
                    (A R B) is not sorted A
                A is an S_i, B is not:
                    φ_1 ← φ_1 and (array (s,t) R' B) provided
                    σ_1# has predicates S_1 R_1 B, S_2 R_2 B, ...,
                    S_p R_p B such that for 1 ≤ j ≤ p, S_j R_j B |=
                    S_j R' B, and if S_j R_j B |= S_j R'' B then
                    S_j R' B |= S_j R'' B
                B is an S_i, A is not:
                    as above with A, B exchanged
                A is an S_i, B is an S_i:
                    φ_1 is unchanged
            endcases
        endfor
        φ ← φ_1 or φ
    endfor
endfor

Algorithm 6: Obtaining φ = unmodifiedpartsof σ wrt (s,t)
 

 
 This completes the description of the evaluation of functions. 
 In the next section, we present the theorem prover with the extensions 
 required by the evaluation, and give arguments to establish the fact 
 that the extended theorem prover is a decision procedure. 
 
 3.4 Extended Theorem Prover is a Decision Procedure 
 
Let us briefly review the backward substitution method (Section
2.2.2) of generating the verification conditions. To prove {Φ|P|Ψ}, the
program P is decomposed into straightline program segments S and we
then prove {φ|S|ψ}, where φ and ψ are generated from Φ, Ψ and the loop
invariants given. Each {φ|S|ψ} is proven by proving the generated lemma
φ |= φ_B, where φ_B is the entry assertion for S generated from ψ by
backward substitution. We recall that the backward substitution of
[King 1969] is such that

    if φ_B is true in I then ψ is true in I', and
    if φ_B is false in I then ψ is false in I'                  (3.7)

where I is any interpretation, and I' is the result of executing the
program segment S on I. Note that {φ_B|S|ψ} is a milder statement than
(3.7). It follows immediately that, for any entry assertion φ,

    {φ|S|ψ}  iff  φ |= φ_B                                     (3.8)

In general, φ_B has many disjuncts which are unsatisfiable in every model
of φ, making it unnecessary to consider these. The diligent reader may
have noticed that the contextual backward function evaluation of the
previous section may generate entry assertions φ_E which do not satisfy
the property (3.7). The wff φ_E is essentially φ_B from which some
disjuncts are deleted.
 
To illustrate this dramatically, consider the one-line program
segment S = exchange x_i with x_j, the exit assertion ψ_S being sorted (1,n).
The φ_B generated by true backward substitution is the equivalent of that
shown in Figure 3.10a. This φ_B does satisfy the property (3.7). (For
readability we have not written φ_B in disjunctive normal form.) However,
the backward function evaluation in the context of 1 ≤ i < j ≤ n yields
φ_E shown in Figure 3.10b. As can be seen, φ_E is much simpler than φ_B,
whose generation does not depend on the context of the given entry
assertion.
 
Theorem 5 (Validity of contextual backward function evaluation)
For a straightline program segment S,

    {φ|S|ψ}  iff  φ* |= φ_E

Proof Without loss of generality, we assume that the given entry
assertion is φ*, the enriched version of φ. Thus, we wish to prove

    {φ*|S|ψ}  iff  φ* |= φ_E.                                  (3.9)

We actually prove a slightly stronger version than (3.9), namely,

    {φ*|S|ψ}  iff  for every disjunct φ_i* of φ*, φ_i* |= φ_Ei
 
S   = exchange x_i with x_j

ψ   = sorted (1,n)

φ_B = φ_E with context 'true'

    =  1 ≤ i < j ≤ n and sorted (1,i-1) ≤ x_j ≤ sorted (i+1,j-1) ≤ x_i ≤ sorted (j+1,n)
    or 1 ≤ j < i ≤ n and sorted (1,j-1) ≤ x_i ≤ sorted (j+1,i-1) ≤ x_j ≤ sorted (i+1,n)
    or 1 ≤ i ≤ n < j and sorted (1,i-1) ≤ x_j ≤ sorted (i+1,n)
    or j < 1 ≤ i ≤ n and sorted (1,i-1) ≤ x_j ≤ sorted (i+1,n)
    or 1 ≤ j ≤ n < i and sorted (1,j-1) ≤ x_i ≤ sorted (j+1,n)
    or i < 1 ≤ j ≤ n and sorted (1,j-1) ≤ x_i ≤ sorted (j+1,n)
    or sorted (1,n) and (i < 1 or i > n) and (j < 1 or j > n)

(a) φ_B: Context-Free Backward Substitution (Simplified)

φ_E in the context of 1 ≤ i < j ≤ n

    = sorted (1,i-1) ≤ x_j ≤ sorted (i+1,j-1) ≤ x_i ≤ sorted (j+1,n)

(b) φ_E: Contextual Backward Substitution (Simplified)

Figure 3.10 Contextual and Context-Free Backward Substitutions
 
 
where φ_Ei is the entry assertion of S obtained by evaluating the
backward functions in the context of L_i, the linear ordering of boundaries
defined by the (enriched) disjunct φ_i*. That φ_i* |= φ_Ei, say for i = 1,
is shown by proving

    φ_B and L_1 |= =| φ_E1 and L_1

Thus, φ_E is φ_E1 or φ_E2 or ... .

φ_B and L_1 |= φ_E1 and L_1:

Suppose φ_B and L_1 does not imply φ_E1. Let M be a model for φ_B and
L_1 such that φ_E1 is false in M. Then φ_E1 and L_1 is not true in M. By
Lemma 3 of Section 3.3.2, if φ_E1 and L_1 is false in M then ψ is false
in M', where M' is the result of the execution of S on M. Since M is a
model for φ_B, this contradicts property (3.7).

φ_E1 and L_1 |= φ_B and L_1:

Suppose φ_E1 and L_1 is true in M, and φ_B is false in M. By property
(3.7), ψ is false in M'. But by Lemma 3, if φ_E1 and L_1 is true in M
then ψ is true in M', a contradiction. ∎
 M M ' M 
 
The advantage in generating φ_E's rather than φ_B should be
obvious. The assertion φ_B will have as many disjuncts as there are linear
orderings of the boundaries collected from S and ψ. Several of these
linear orderings are of no concern to us, since we only need to prove
that execution of S on an input satisfying φ results in ψ. We do not
care what S does on a linear ordering of boundaries contradicting the
partial order specified by φ.
 
 
Since contextual backward function evaluation is valid, the
extended theorem prover is a decision procedure for Ω |= ω, where ω and
Ω are augmented wffs.
 
 3.5 Counterexample Generation 
 
Whenever the theorem prover determines that Ω does not imply ω it is
possible, in this system, to construct a model M for Ω such that ω is false
in M. However, it should be realized that M may not be a "counterexample
to the program." This is because even though {Φ|P|Ψ}, the loop invariants
given may not be strong enough to prove all the lemmas generated.
Counterexamples will, hopefully, provide clues for strengthening the loop
invariants.

Suppose Ω does not imply ω. Then there must exist (see Algorithm 2:
provetheorem) a linear ordering L, ptr graph π* and key graph σ* of a
disjunct Ω_1 of Ω such that

    π* and L and σ* does not imply ω.

Let ω# be the partitioned version of ω in the context of L,

    ω and L |= =| ω# and L

    ω# = ω_1# or ω_2# or ... or ω_c#

The last call (from either Algorithm 2 or 4) of Algorithm 3: provelemma
gives a satisfiable disjunct Ω_11*, and ω_c# such that

    Ω_11* does not imply ω_c#

and for 1 ≤ i < c,

    Ω_11* and ω_i# is unsatisfiable.

Thus, a model M for Ω_11* such that ω_c# is false in M is a counterexample
to (π* and L and σ* |= ω) and hence to Ω |= ω. Since Ω_11* does not imply
ω_c#, there must be a predicate p in ω_c# such that Ω_11* does not imply p
(see Algorithm 4). The disjunct Ω_11* and not p is satisfiable, and a model
can be constructed as in Section 3.2.1 for the transitive closures of
Ω_11* and not p.
 
 
 4. GENERALITY 
 
 
In the last two chapters, we have seen the successful application
of inference rules about partitioning, closure and local implication
in the verification of programs written and asserted in our
 languages. Though these vital inference rules are developed here as the 
 result of severe constraints imposed primarily by the assertion lan- 
 guage, they do apply to a wider class of programs manipulating data 
 structures. We now give several examples to support this contention. 
 
 4.1 Constraints of the Present Verification System 
 
 
 The verification system was designed with the specific goal 
 of being usable in SORTLAB to verify the correctness of student pro- 
 grams for sorting an array. Severe constraints were imposed on the 
 programming and assertion languages both to limit the class of programs 
 to sorting-type problems and to obtain a system that is usable in a 
 practical situation. Not all these constraints are technically necessary 
for making the theorem prover a decision procedure, though they have
value pedagogically.
 
For example, the verifier can be enhanced quite easily to permit
many arrays, temporary variables, ptr expressions like j + 8, and
predicates like array (s,t) - 3 ≤ array (u,v), which means

    array (s,t) - 3 ≤ array (u,v) ≡ ∀i ∀j (s ≤ i ≤ t and u ≤ j ≤ v → x_i - 3 ≤ x_j)

However, if arbitrary assignments to array elements are allowed, it is
not clear how the verifier can be extended to prove the key-preserving
property of sorting algorithms.
 
 It is not possible to characterize the class of programs 
 provable in this system except as those programs that can be written in 
 our programming language and for which sufficiently strong assertions 
 can be made in our assertion language. Theoretically speaking, all 
 computable functions are programmable in the programming language. How- 
 ever, for most computable functions strong enough assertions do not 
 exist in our assertion language that permit a proof that the correspond- 
 ing program computes the function. Thus, e.g., heap sort and several 
 merging programs can be written in the programming language, but strong 
 enough assertions to prove that these programs also sort do not exist 
 in our assertion language. 
 
 4.2 Partitioning 
 
 Several properties on a data structure can be expressed as 
 properties on its substructures, and by interrelationships among these 
 components. For example, 
 
    sorted (s,t) iff s ≥ t or (for all u, s ≤ u < t,
                     sorted (s,u) ≤ sorted (u+1,t))
    avl-tree (r) iff empty-tree (r) or
                     avl-tree (left (r)) and avl-tree (right (r))
                     and -1 ≤ height (left (r)) - height (right (r)) ≤ 1

A typical verification condition Ω |= ω of a program aiming to produce such
a property on a data structure is of the following kind: the conclusion
ω refers to larger parts of a data object having the property, while the
premise Ω refers to smaller parts of the data object which have the same
(or similar) property and contains certain interrelationships between
these parts. Proving Ω |= ω becomes much simpler in such cases if both
Ω and ω are expressed in terms of a set of common parts of the data object.
Partitioning is a technique which decomposes the data object into small
enough components so that every segment of data structure referred to in
Ω or ω is a union of some of these components.
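
For array segments, the decomposition into common parts can be sketched in a few lines of Python. The sketch assumes concrete integer boundaries and non-empty segments (s ≤ t), whereas the verifier itself works with symbolic boundaries and considers each linear ordering of them in turn. The function names are hypothetical.

```python
def partition(segments):
    """Split the index range into pieces so that each segment is a union of pieces."""
    cuts = sorted({s for s, _ in segments} | {t + 1 for _, t in segments})
    return [(a, b - 1) for a, b in zip(cuts, cuts[1:])]

def cover(segment, pieces):
    """Return the pieces whose union is exactly the given segment."""
    s, t = segment
    return [p for p in pieces if s <= p[0] and p[1] <= t]

segments = [(1, 10), (4, 6), (6, 10)]
pieces = partition(segments)
for seg in segments:
    print(seg, "=", cover(seg, pieces))
```

A piece may also fall in a gap that no segment refers to; such pieces are harmless because no predicate mentions them.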
 
 4.3 Closure and Local Implication 
 
 Much of the inefficiency in general theorem provers can be 
 traced to their inability to choose appropriately those predicates 
 of the premise which would imply a certain conclusion. The rule of 
 local implication completely avoids this problem by specifying the pre- 
 dicate of the premise that determines if a given predicate of the con- 
 clusion follows from the premise. It should be noted that the rule 
 of local implication is valid only when the ptr and key graphs are 
 transitively closed. 
 
 A rule of local implication can trivially be formulated in any 
 deductive system if all possible inferences from the given premises 
 are collected as the closure of the premise. However, this may not be 
 practical either because it takes a long time or because the closure 
 is not finite. We therefore seek inference rules yielding only finitely 
 many inferences from given premises and obtain the closure of such 
rules. In the context of proving lemmas about partitionable properties
 

 
 on data structures, it is generally possible to obtain this closure 
 rapidly, and to invent appropriate rules of local implication. 
 
 4.4 Examples 
 
 Several examples from the literature are used in this section 
 to support our contention that the techniques developed for SORTLAB are 
 in fact applicable to a wider class of programs. The treatment of these 
 examples is necessarily brief; we only indicate how a relevant partition 
 may be constructed. We also assume, without further ado, that ap- 
 propriate extensions are made to the programming and assertion languages 
 where necessary. 
 
 4.4.1 A Geometric Example 
 
Consider finite plane maps which can be described using rectangles
with one side parallel to the x-axis, and the operations union (+),
intersection (·) and negation (¬). Thus A + B is the map covered by
the rectangle A or B, A·B represents the map common to both A and B, and
¬A represents the map not covered by A. The shaded map shown below can
be described by several expressions.
 
[Figure: shaded plane map whose corner points are numbered 1 to 16]

 
For example,

    (1-2-3-4) · ¬(5-6-7-8) · ¬(9-10-11-12)

    (1-14-15-4) · ¬(5-6-7-8) + (13-2-3-16) · ¬(9-10-11-12)

    (1-2-3-4) · ¬(5-10-11-8) + (6-9-12-7)
 
The problem we wish to consider is: given two expressions E_1
and E_2, decide if E_1 and E_2 describe the same map. If the coordinates
of all points referred to in E_1 and E_2 are constants, the problem
is trivial. But, if the points are arithmetic expressions (with plus,
minus only) of free variables and constants, the problem can be answered
by decomposing the maps described by E_1 and E_2 as follows.

Let the rectangle A contain a corner p of another rectangle B.
Then, p splits A into four smaller rectangles A_1, A_2, A_3 and A_4 as shown
below. Repeat this process until none of the partitioned rectangles
contain corners of other rectangles.

[Figure: the corner p splits rectangle A into A_1, A_2, A_3 and A_4]

Clearly, each original rectangle is a union of some of these partitioned
rectangles. If we now impose a linear ordering on these partitioned
rectangles (e.g., A precedes B if the coordinates of the left-top corner
of A are (x_1,y_1) and that of B are (x_2,y_2) such that either x_1 < x_2
or x_1 = x_2 and y_1 < y_2), the original expressions E_1 and E_2 can now
be rewritten in a canonical form, and E_1 will be equivalent to E_2 if
their partitioned expressions are identical.
 

 
 4.4.2 Simple Array Examples 
 
All the verification conditions of the two examples given in
this section can be proven by partitioning the array as described in
Section 3.1.3.4.
 
 4.4.2.1 Binary Search 
 
The example given in Algorithm 7 is a classical binary
search algorithm. The proof that the algorithm correctly searches a
sorted array x(m...n) for an element z does not depend on the index k
being equal to (i+j) div 2; this particular choice of k only makes
the algorithm more efficient (O(log_2 (n-m))). For the algorithm to
search properly it is sufficient that the function f be such that
whenever i < j, i ≤ f(i,j) < j. The verification condition for the loop is

    sorted (m,n) and i ≤ k < j and
    (z isin-array (i,j) or z notin-array (m,n))
  |=
    sorted (m,n) and i ≤ k and x_k ≥ z and
    (z isin-array (i,k) or z notin-array (m,n))
  or
    sorted (m,n) and k + 1 ≤ j and x_k < z and
    (z isin-array (k+1,j) or z notin-array (m,n))

The predicate notin-array is the negation of isin-array, where

    z isin-array (s,t) =
        s = t and x_s = z or
        (for some u such that s ≤ u < t,
         z isin-array (s,u) or z isin-array (u+1,t))
 
 
* sorted (m,n)

i ← m; j ← n
while i < j do
    k ← f(i,j)
    if x_k < z
        then i ← k + 1
        else j ← k
    endif
    * i ≤ j and (z isin-array (i,j) or z notin-array (m,n))
      and sorted (m,n)
endwhile
found ← (x_i = z)
* (found ↔ z isin-array (m,n))

Algorithm 7. Classical Binary Search
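
A runnable Python rendering of Algorithm 7 makes the role of f concrete. This is a sketch, not part of the original system: it uses 0-based list indices, adds a guard for the empty list, and checks the requirement i ≤ f(i,j) < j with an assertion. The name binary_search and the default midpoint choice are assumptions of the example.

```python
def binary_search(x, z, f=lambda i, j: (i + j) // 2):
    """Return True iff z occurs in the sorted list x.

    f may be any function with i <= f(i, j) < j whenever i < j.
    """
    if not x:
        return False
    i, j = 0, len(x) - 1
    while i < j:
        k = f(i, j)
        assert i <= k < j, "f must return an index in [i, j)"
        if x[k] < z:
            i = k + 1
        else:
            j = k
    return x[i] == z
```

Passing f = lambda i, j: i turns the same loop into a linear scan from the low end, which is still correct and only slower; this mirrors the observation that correctness does not depend on the midpoint choice.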
 
 
 4.4.2.2 Dutch National Flag Problem 
 
The problem is to rearrange the elements of an array x which
are three-valued, viz., either red, white or blue, into contiguous red-,
white- and blue-colored segments from the low end to high end respectively
[Dijkstra 1976]. A solution to the problem is given here as Algorithm
8. The predicates red, white, blue on array segments are defined as
follows:

    c(s,t) = (s < t and for all u such that s ≤ u < t,
                  c(s,u) and c(u+1,t)
              or s = t and color (s) = c
              or s > t)
 
 where c is to be substituted by red , white , or blue . The backward 
 function evaluation, and partitioning technique of Chapter 3 are adequate 
 to prove the partial correctness of this algorithm. 
 
 4.4.3 Heap Sort 
 
Algorithm 9 [Floyd 1964] imposes the structure of a binary
tree on the array to sort its elements. We formulate the siftup
algorithm recursively; an iterative version of this algorithm is not
provable using our partitioning technique (see Section 4.5). The predicates
ordt, x ≥ tree (·,·) are defined below:
 
 
 r ← 1; w ← 1; b ← n
 while w ≤ b do
     cases color (w) of
         white: w ← w + 1
         red:   (exchange x_r with x_w;
                 r ← r + 1; w ← w + 1)
         blue:  (exchange x_w with x_b;
                 b ← b - 1)
     endcases
     * red (1,r-1) and white (r,w-1) and blue (b+1,n) and
       1 ≤ r ≤ w ≤ b+1 ≤ n+1
 endwhile
 * red (1,r-1) and white (r,w-1) and blue (w,n)
   and 1 ≤ r ≤ w = b+1 ≤ n+1

 Algorithm 8. Dutch National Flag
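
 As an illustration, Algorithm 8 can be sketched in Python with the loop invariant asserted after every iteration. This is a sketch under assumptions: the exchange operands (x_r with x_w, and x_w with x_b) follow the usual formulation of the problem, the helper name segment_is is hypothetical, and a dummy element at index 0 is used only to keep the 1-based indexing of the text.

```python
def segment_is(x, c, s, t):
    """c(s,t): every element of x[s..t] has color c; vacuously true when s > t."""
    return all(x[u] == c for u in range(s, t + 1))


def dutch_flag(items):
    """Rearrange a list of 'red'/'white'/'blue' in place; return the final (r, w)."""
    n = len(items)
    x = [None] + items  # shift to 1-based indexing as in the text
    r, w, b = 1, 1, n
    while w <= b:
        if x[w] == 'white':
            w += 1
        elif x[w] == 'red':
            x[r], x[w] = x[w], x[r]
            r += 1
            w += 1
        else:  # blue
            x[w], x[b] = x[b], x[w]
            b -= 1
        # Loop invariant of Algorithm 8.
        assert segment_is(x, 'red', 1, r - 1)
        assert segment_is(x, 'white', r, w - 1)
        assert segment_is(x, 'blue', b + 1, n)
        assert 1 <= r <= w <= b + 1 <= n + 1
    items[:] = x[1:]
    return r, w
```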
 
 
 x_u ≥ tree (s,t)  =  (s ≤ t and x_u ≥ x_s and x_u ≥ tree (2s,t)
                           and x_u ≥ tree (2s+1,t)
                       or s > t)

 ordt (s,t)        =  (s ≤ t and x_s ≥ ordt (2s,t)
                           and x_s ≥ ordt (2s+1,t)
                       or s > t)

 x_u ≥ ordt (s,t)  =  (x_u ≥ tree (s,t) and ordt (s,t))

 heap (s,t)        =  (s ≤ t and heap (s+1,t) and ordt (s,t)
                       or s > t)
 
 Since our interest here is to demonstrate the applicability of
 the principle of partitioning, we shall take the liberty of simplifying
 the verification conditions. A crucial verification condition of the siftup
 procedure is:

     {j = 2i < n and x_j < x_i and x_{j+1} ≤ x_i and
      ordt (2j,n) and ordt (2j+1,n) and
      x_i ≥ tree (j,n) and x_i ≥ ordt (j+1,n)
      | call siftup (j,n) |
      ordt (i,n)},                                          (4.1)
 
 
 procedure siftup (i,n)
     * ordt (2i,n) and ordt (2i+1,n)
     j ← 2 * i
     if j ≤ n then
         if j < n then
             if x_j < x_{j+1} then j ← j + 1 endif
         endif
         if x_i < x_j then
             exchange x_i with x_j;
             * ordt (2j,n) and ordt (2j+1,n) and x_i ≥ tree (j,n)
               and x_i ≥ ordt (j+1,n)
             call siftup (j,n)
         endif
     endif
     * ordt (i,n)
 endproc

 Algorithm 9(a). Recursive Siftup Algorithm
 

 
 procedure heapsort (n)
     for i ← n div 2 downto 2 do
         call siftup (i,n)
         * heap (i,n) and 2 ≤ i ≤ n div 2
     endfor
     for i ← n downto 2 do
         call siftup (1,i);
         exchange x_1 with x_i
         * heap (2,i-1) and array (1,i-1) ≤ sorted (i,n) and 1 < i ≤ n
     endfor
     * sorted (1,n)
 endproc

 Algorithm 9(b). Heap Sort
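
 A minimal Python sketch of Algorithms 9(a) and 9(b) follows, with the entry and exit assertions of siftup evaluated at run time. It assumes a max-heap reading of the predicates (root at least as large as its children), and the executable form of ordt is this sketch's own rendering of the definition; a dummy element at index 0 keeps the 1-based indexing.

```python
def ordt(x, s, t):
    """ordt(s,t): the tree rooted at s, restricted to indices <= t, is ordered."""
    if s > t:
        return True
    return all(c > t or (x[s] >= x[c] and ordt(x, c, t))
               for c in (2 * s, 2 * s + 1))


def siftup(x, i, n):
    """Recursive sift-up: restore ordt(i,n) given ordered subtrees at 2i and 2i+1."""
    assert ordt(x, 2 * i, n) and ordt(x, 2 * i + 1, n)  # entry assertion
    j = 2 * i
    if j <= n:
        if j < n and x[j] < x[j + 1]:
            j += 1  # pick the larger child
        if x[i] < x[j]:
            x[i], x[j] = x[j], x[i]
            siftup(x, j, n)
    assert ordt(x, i, n)  # exit assertion


def heapsort(items):
    """Return a sorted copy of items, following the two loops of Algorithm 9(b)."""
    n = len(items)
    x = [None] + list(items)  # 1-based indexing
    for i in range(n // 2, 1, -1):  # n div 2 downto 2
        siftup(x, i, n)
    for i in range(n, 1, -1):  # n downto 2
        siftup(x, 1, i)
        x[1], x[i] = x[i], x[1]
    return x[1:]
```

 Note that the first loop stops at 2, so the root is first sifted by the call siftup (1,n) at the start of the second loop.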
 
 
 assuming that siftup does not change the order of elements in any tree
 (s,n) unless the tree is a subtree of tree (i,n). The Lemma (4.1),
 therefore, reduces to:

     j = 2i < n and
     ordt (2j,n) and ordt (2j+1,n) and
     x_i ≥ tree (j,n) and x_i ≥ ordt (j+1,n) and
     ordt (j,n)
     |=
     ordt (i,n)                                              (4.2)

 where call siftup (j,n) has added ordt (j,n). The relevant partition
 of the "array" is not a decomposition into contiguous array segments but a
 decomposition of the tree (i,n) into its two subtrees tree (2i,n) and tree
 (2i+1,n) and the root x_i. The proof of (4.2) requires consideration of
 three cases: 2j > n, 2j = n, and 2j < n. To demonstrate the use of
 a partition of the above type, consider the most interesting case, 2j < n.
 We can rewrite (4.2) as:

     j = 2i < n and 2j < n and
     ordt (2j,n) and ordt (2j+1,n) and
     x_i ≥ tree (2j,n) and x_i ≥ tree (2j+1,n) and x_i ≥ x_j and
     x_i ≥ ordt (j+1,n) and
     x_j ≥ ordt (2j,n) and x_j ≥ ordt (2j+1,n)
     |=
     x_i ≥ x_{2i} and x_i ≥ ordt (4i,n) and x_i ≥ ordt (4i+1,n) and
     x_{2i} ≥ ordt (4i,n) and x_{2i} ≥ ordt (4i+1,n) and
     x_i ≥ ordt (2i+1,n)                                     (4.3)
 
 
 As can be seen, the conclusion follows from the premise if 2i is 
 substituted for j. 
 
 The verification conditions for the two for-loops of heapsort
 (Algorithm 9(b)) require even more complex partitioning: a decomposition
 into subtrees as well as into array segments of one element.
 However, an iterative version of the siftup algorithm does not yield to
 such a decomposition of the heap, and hence is not provable by our
 techniques.
 
 4.4.4 A List Moving Algorithm 
 
 Algorithm 10 [Reingold 1973, Wagner 1974] moves all nodes
 of a list structure accessible from a root to a new contiguous set
 of nodes. We outline a proof of the fact that what is copied by the
 algorithm is isomorphic to the original list structure composed of all,
 and only, those nodes accessible from the root. For convenience in this
 proof, we have introduced the tables copyof [·] and origof [·], and
 boolean flags copied [·]. The original node, origof [q], of the newly
 copied node q is not required by the algorithm itself; the tables
 copied [·] and copyof [·] may be overlapped with the left [·] fields of
 the original nodes (see [Wagner 1974]). The predicates in the loop
 invariant are defined below:

     isocopy (q) = (q = 0 or
                    isocopy (q-1) and data [q] = data [q_0] and
                    right [q] = copyof [right [q_0]] and
                    left [q] = copyof [left [q_0]])
 
 
 procedure movelist (root)
     p ← 0; q ← 0; ℓ ← root
     call copy (ℓ)
     while q ≠ p do
         q ← q + 1
         call copy (left [q])
         call copy (right [q])
         * isocopy (q) and q to p and p from q and dupe (q,p)
     endwhile
     * isocopy (p) and p to p and p from p
 endproc

 procedure copy (var x)
     if x ≠ nil then
         if not copied [x]
             then p ← p + 1;
                  node [p] ← node [x];
                  copied [x] ← true;
                  copyof [x] ← p;
                  origof [p] ← x
         endif
         x ← copyof [x]
     endif
 endproc

 Algorithm 10. A List Moving Algorithm
 
 
 where q_0 = origof [q]. (The nodes 1 through q constitute an isomorphic
 copy of a substructure of the original list.)

     q to p = (q = 0 or
               q-1 to p and left [q] ≤ p and right [q] ≤ p or
               q-1 to p-1 and (right [q] ≤ p-1 and left [q] = p or
                               right [q] = p and left [q] ≤ p-1) or
               q-1 to p-2 and left [q] = p-1 and right [q] = p)

     p from q = (q = 0 or p from q-1 or
                 p-1 from q-1 and (p = left [q] or p = right [q]) or
                 p-2 from q-1 and p-1 = left [q] and p = right [q])

 (q to p means that all nodes reachable from q using right-left links
 are included in 1 ... p. Similarly, p from q denotes the converse,
 i.e., all nodes included in 1 ... p are reachable from nodes in
 1 ... q via the right-left links.)

     dupe (s,t) = (s > t or s = t and node [s] = node [origof [s]] or
                   for some u, s ≤ u < t and dupe (s,u) and
                   dupe (u+1,t))
 
 (Nodes from s to t are exact copies of their original nodes.) 
 
 The partition of the copied list structure as indicated by 
 the above definitions of the predicates readily gives a proof of various 
 verification conditions of the list moving algorithm. 
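
 The list moving algorithm can be sketched in Python as follows. This is a hedged illustration, not the thesis's code: the original nodes are kept in a dictionary and the copies in a separate 1-based list (an assumption about storage), the var parameter of copy is modeled by returning the updated link, membership in copyof plays the role of the copied [·] flags, and only the non-recursive conjuncts of isocopy (q) are asserted.

```python
NIL = 0


def movelist(old, root):
    """Copy all nodes reachable from root into new[1..p].

    old maps a node id to [data, left, right]; links use NIL = 0.
    Returns (new, new_root); new[0] is an unused placeholder.
    """
    new = [None]
    copyof, origof = {}, {}

    def copy(x):
        # Returns the relocated link (the thesis updates x in place).
        if x == NIL:
            return NIL
        if x not in copyof:  # "not copied [x]"
            new.append(list(old[x]))
            p = len(new) - 1
            copyof[x] = p
            origof[p] = x
        return copyof[x]

    new_root = copy(root)
    q = 0
    while q != len(new) - 1:  # p is len(new) - 1
        q += 1
        new[q][1] = copy(new[q][1])  # left
        new[q][2] = copy(new[q][2])  # right
        # Conjuncts of isocopy(q) that concern node q itself.
        q0 = origof[q]
        assert new[q][0] == old[q0][0]
        assert new[q][1] == copyof.get(old[q0][1], NIL)
        assert new[q][2] == copyof.get(old[q0][2], NIL)
    return new, new_root
```

 Shared and cyclic links are handled because a node is appended only the first time it is reached; unreachable nodes are never copied.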
 
 
 4.5 On the Applicability of Partitioning 
 
 As we have seen in the examples of the preceding section, a
 class of programs that typically have loops (recursive calls) operate
 on their data objects, building up the desired property iteratively
 (recursively). Two general approaches are discernible in the iterative
 build-up of properties:

 A1. The data structure having a desired property P is
     gradually built up. If D is a segment of the data object
     having property P, we find δD, an incremental part from
     the remaining part of the data object. The composite
     segment D + δD is manipulated so that D + δD has the
     property P. Repeat the process until all of the data
     object has the property P [Misra 1976].
 A2. The desired property P on a data object is gradually built
     up. If D has a property Q, we manipulate D so that it
     now has property Q' which is "closer" to P than Q was.

 The examples of Sections 4.4.2 and 4.4.4 belong to class A1.
 Partitioning seems applicable to all such programs. It is, of course,
 possible to describe an algorithm belonging to class A1 in terms of A2.
 A bubble sorting algorithm can be thought of as converting an array that
 is less-sorted to an increasingly-sorted array; however, the algorithm
 is best put in class A1. On the other hand, there are algorithms belonging
 to class A2 which would be very difficult to describe in terms
 of A1. A nonrecursive sift-up algorithm of heap sort (see [Floyd 1964]
 and Section 4.3) descends the tree, confining the undesirable property
 that some tree is not ordered (ordt) to smaller and smaller trees.
 This algorithm clearly belongs to class A2.
 
 Thus, for partitioning to be applicable, it seems necessary
 that the following requirements be satisfied:

 R1. The data structures used must have disjoint components.
     (Thus circular lists and "trees" with shared structures do
     not satisfy this requirement, while stacks, queues,
     linear lists, trees, and tables do.)
 R2. It should be possible to describe the property P on data
     object D equivalently in terms of the same property P on
     components of D obtained by a finite decomposition, and
     possibly some interrelationships among the components.
     (Properties like "A is a permutation of A_0" are not thus
     partitionable, while those like "T is an AVL-tree," "array
     A is sorted," or "array A is a heap" are.)
 R3. The property P being sought should be built up by the
     algorithm using the approach A1.

 When the desired property P and data object D satisfy requirements
 R1 and R2, it is generally possible to write programs that satisfy R3.
 Thus, the applicability of partitioning depends not only on the intrinsic
 properties of the data structure and the property P, but also on how
 P is built up.
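
 Requirement R2 can be made concrete with a small Python sketch for the property "array A is sorted." The particular decomposition used here (two adjacent segments plus one comparison across the cut) is an assumption chosen for illustration, and the function names are hypothetical.

```python
def is_sorted(x, s, t):
    """sorted(s,t): x[s..t] (inclusive) is nondecreasing; true when s >= t."""
    return all(x[u] <= x[u + 1] for u in range(s, t))


def is_sorted_by_partition(x, s, t, u):
    """The same property stated on the components x[s..u] and x[u+1..t]
    (s <= u < t), plus one interrelationship between the components."""
    return is_sorted(x, s, u) and is_sorted(x, u + 1, t) and x[u] <= x[u + 1]


def partition_agrees(x):
    """Check that both formulations agree for every segment and cut point."""
    n = len(x)
    return all(
        is_sorted(x, s, t) == is_sorted_by_partition(x, s, t, u)
        for s in range(n)
        for t in range(s + 1, n)
        for u in range(s, t)
    )
```

 The equivalence holds for any cut point u, which is what allows a verifier to choose the partition suggested by the program's index variables.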
 
 
 5. SORTLAB 
 
 The verification system described in Chapters 2 and 3 is at
 the heart of a programming laboratory, called SORTLAB, which assists
 the student-programmer in producing correct sorting algorithms from
 basic ideas of these algorithms. SORTLAB consists of a program editor,
 an interpreter, the program verifier described earlier and a
 counter-example generator. These are implemented on the PLATO interactive system
 as a "lesson." This lesson is a part of the Automated Computer Science
 Education System (ACSES) developed by the Department of Computer Science
 of the University of Illinois.

 This chapter describes SORTLAB, its use and its implementation.
 Sections 5.1 and 5.2 provide a context in which the performance of SORTLAB
 should be evaluated.
 
 5.1 PLATO 
 
 The PLATO IV interactive system [Alpert and Bitzer 1970] is 
 designed to support more than 500 users logged-in on the plasma-panel 
 graphic terminals. The users can be divided into "authors" who write 
 teaching-programs ("lessons"), and "students" who execute these lessons 
 at their own pace. It is expected that a user limit CPU usage to 2 
 milliseconds/clock-second; any attempted over-use will be reduced to 
 this level by offering fewer time-slices. 
 
 Each student-user has a data segment of 1,650 60-bit words. A
 lesson is assigned a data space of 1,500 words in the central memory,
 and it can access these 1,500 words and the first 150 words of the student
 data segment. The 1,500-word space must be loaded (and unloaded) with
 the contents of the remaining 1,500 words of the student data segment or of
 a segment of extended core storage containing information that is common
 to all users executing the lesson. Thus any lesson using more than 150
 words of data must explicitly control this "paging."
 
 The single most annoying factor in the use of the PLATO system
 for program development is TUTOR, the only programming language available
 to authors, in which the lessons are to be written. (For a short
 introduction, see [Popular Computing 1975]; a detailed, if slightly outdated,
 description may be found in [Sherwood 1975].) TUTOR is a high-level
 language with an assembly-language-like format. It contains several
 machine-dependent data manipulation statements with such niceties as
 nested assignment statements and generalized versions of the computed-goto
 and do-loop statements of FORTRAN. Procedure blocks may be
 defined, but there are no local variables. Each variable name must be
 assigned an address by the programmer. Several variables with small
 values may be assigned to different segments of the same 60-bit word. In
 addition to these features, there are several statements that are useful
 in judging the student's response. The run-time system of TUTOR permits
 nested procedure calls (recursive or not) at most 10 levels deep. Most
 lessons written for PLATO have a simple structure; for these programs,
 the lack of control structures, local variables, etc. is not a serious
 impediment. Typically, such lessons also use little CPU-time. Most
 students find it pleasant to "read" such lessons because of the
 near-instantaneous response and excellent graphics. Any unpleasantness is
 usually attributable to the author's style of writing his lesson.
 
 
 5.2 ACSES 
 
 
 The Department of Computer Science of the University of Illinois 
 has developed on PLATO an Automated Computer Science Education System 
 [Nievergelt 1975] for beginning students in computer science. It consists
 of a large body of lessons, a GUIDE information retrieval and
 management system [Eland 1975] and an interactive programming system 
 [Wilcox 1973]. The GUIDE may be used by a student to find out about 
 his records or to choose a lesson of interest. The programming system 
 supports several languages with excellent error diagnostics. The body 
 of lessons largely consists of conventional Computer Assisted Instruction 
 lessons about various aspects of computer science. Among this collection 
 are two lessons which incorporate novel concepts of artificial intelligence 
 and program proving adapted to run on limited computer resources: 
 PATTIE [Danielson 1975], to tutor students in top-down program design; 
 and SORTLAB, to be presented in the next section. 
 
 5.3 SORTLAB--A Programming Laboratory
 
 SORTLAB concerns itself with the implementation of certain
 sorting algorithms. It provides a "laboratory" wherein a student can
 perform programming "experiments" using the various equipment provided.
 It does not actively suggest how an algorithm should be implemented,
 but focuses the student's attention on the correctness of
 his program by providing such tools as a specially designed, easy-to-learn
 mini programming language, an excellent program editor, a program
 verifier, a counter-example generator, and an interpreter for his programs.
 
 
 [Figure 5.1: block diagram with boxes labeled SORTLAB, Program Editor,
 Programming Language Recognizer, Assertion Language Recognizer, Interpreter,
 Sorting Program Verifier, Verification Condition Generator, Theorem Prover,
 and Counter Example Generator]

 Figure 5.1 Components of SORTLAB

 5.3.1 Programming and Assertion Languages
 
 The languages are so chosen that while it is convenient and
 natural to express several sorting algorithms, writing other programs
 is not easy. The particular choice of basic operations in the programming
 language, and predicates in the assertion language, is strongly
 influenced by decidability considerations (see Section 2.2).
 
 A program example is given in Figure 5.2. The syntax of the 
 languages is specified in Figures 5.3 and 5.4. The assertion language 
 semantics is specified in Section 3.1.1. The ptr assignment, while, 
 if and call statements have the conventional meaning. The semantics 
 of other statements of the programming language is explained in the 
 examples below. 
 
 [Figure 5.2: a program example, shown as a display page during execution]
 <procedure>            ::= procedure <identifier><input par>
                            <optional out par exp><stmt list> endproc
 <stmt list>            ::= {<stmt>}*
 <stmt>                 ::= <ptr-assign>|<exchange>|<insert>|<call>|
                            <while>|<scan>|<if>
 <while>                ::= while <bool exp> do <stmt list> endwhile
 <scan>                 ::= scan <updn with><ptr var> from <ptr exp>
                            to <ptr exp><stmt list> endscan
 <if>                   ::= if <bool exp> then <stmt list> else
                            <stmt list> endif
 <ptr-assign>           ::= <ptr var> ← <ptr exp>
 <exchange>             ::= exchange x <ptr exp> with x <ptr exp>
 <insert>               ::= insert x <ptr exp> below x <ptr exp>
 <call>                 ::= call <proc identifier> (<ptr exp>, <ptr exp>)
                            <optional out par var>
 <optional out par var> ::= <empty>|*(<ptr var>{,<ptr var>}*)
 <input par>            ::= (<ptr var>, <ptr var>)
 <optional out par exp> ::= <empty>|*(<ptr exp>{,<ptr exp>}*)
 <bool exp>             ::= <pl-disjunct> {or <pl-disjunct>}*
 <pl-disjunct>          ::= <pl-predicate> {and <pl-predicate>}*
 <pl-predicate>         ::= <ptr pred>|<key pred>
 <ptr pred>             ::= <ptr exp><nerel><ptr exp>
 <key pred>             ::= x <ptr exp><nerel>x <ptr exp>
 <nerel>                ::= <rel> | ≠
 <rel>                  ::= < | ≤ | = | ≥ | >
 <updn with>            ::= up with | down with
 <ptr exp>              ::= 0|1|2|<ptr var>|<ptr var> ± 1
 <ptr var>              ::= i | j | k | ℓ | m | n

 Figure 5.3. Syntax of the Programming Language
 
 <assertion>       ::= <disjunct> {or <disjunct>}*
 <disjunct>        ::= <predicate> {and <predicate>}*
 <predicate>       ::= <ptr predicate>|<array predicate>
 <ptr predicate>   ::= <ptr exp>{<rel><ptr exp>}
 <array predicate> ::= sorted <segment def>|<segment>
                       {<rel><segment>}
 <segment>         ::= array <segment def>| sorted
                       <segment def>| x <ptr exp>
 <segment def>     ::= (<lower boundary>, <upper boundary>)
 <lower boundary>  ::= <boundary>
 <upper boundary>  ::= <boundary>
 <boundary>        ::= <ptr exp>

 Figure 5.4. Syntax of the Assertion Language
 
 
 The statement

     scan up with i from j + 1 to k - 1
         <body>
     endscan

 is equivalent to

     i ← j + 1
     while i ≤ k - 1 do
         <body>
         i ← i + 1
     endwhile

 The loop variable i of the scan statement is not considered unmodifiable
 by the body; that is, the body may assign to it.
 
 The statement "insert x_i below x_j" is equivalent to the following
 abstract program:

     t ← x_i; p ← i
     if i < j then
         while p ≤ j - 2 do x_p ← x_{p+1}; p ← p + 1 endwhile
         {circular up shift}
     else while p ≥ j + 1 do x_p ← x_{p-1}; p ← p - 1 endwhile
         {circular down shift}
     endif
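
 The circular shift performed by the insert statement can be sketched in Python as below. This is one possible reading, not a definitive rendering: the loop bounds and the direction of the comparison i < j are reconstructed, and the final store of the saved element t into x_p is an assumption that does not appear in the abstract program as printed here.

```python
def insert_below(x, i, j):
    """Move x[i] so that it sits immediately before the element at position j.

    Elements between the two positions are shifted by one place; the list is
    modified in place and remains a permutation of its original contents.
    """
    t, p = x[i], i
    if i < j:
        while p <= j - 2:  # circular up shift
            x[p] = x[p + 1]
            p += 1
    else:
        while p >= j + 1:  # circular down shift
            x[p] = x[p - 1]
            p -= 1
    x[p] = t  # assumed final store of the saved element
```

 With x = [0, 10, 20, 30, 40, 50], the call insert_below(x, 1, 4) moves 10 to just before 40, and insert_below(x, 4, 2) on the original list moves 40 to just before 20.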
 

 
 A program in SORTLAB is a collection of procedures, and it
 always includes the main procedure "sort." All procedures are external
 and may be recursive. The array x is global to all procedures; indices
 are always local. Thus, the only way a procedure may receive an index
 value is by receiving it as a (value) parameter.

 Notice that apart from the array to be sorted, x, and ptr variables,
 no temporary variables are provided. Two padding elements, x_0 and
 x_{n+1}, are predefined to be -∞ and +∞ respectively; these may be used
 as sentinels. Thus, the entry and exit assertions of the main procedure
 sort (n) are:

     n ≥ 1 and x_0 < array (1,n) < x_{n+1}

     sorted (1,n)
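
 To illustrate how the padding elements serve as sentinels, here is a minimal Python sketch of a straight insertion sort whose inner loop needs no bounds test. The choice of insertion sort and the function name are assumptions made for illustration; the entry and exit assertions mirror the ones given above.

```python
import math


def sort_with_sentinels(keys):
    """Sort a nonempty list of finite numbers using x[0] = -inf as a sentinel."""
    n = len(keys)
    x = [-math.inf] + list(keys) + [math.inf]  # padding elements x_0 and x_{n+1}
    # Entry assertion: n >= 1 and x_0 < array(1,n) < x_{n+1}.
    assert n >= 1 and all(x[0] < v < x[n + 1] for v in x[1:n + 1])
    for i in range(2, n + 1):
        key, j = x[i], i - 1
        while x[j] > key:  # stops at x[0] = -inf at the latest
            x[j + 1] = x[j]
            j -= 1
        x[j + 1] = key
    # Exit assertion: sorted(1,n).
    assert all(x[u] <= x[u + 1] for u in range(1, n))
    return x[1:n + 1]
```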
 
 5.3.2 Language Recognizers 
 
 The tokens of the programming and assertion languages are so
 chosen that (except for if, insert, and i ← ...) they can be recognized
 by their first character. As soon as the first character of the token
 is typed, the statement is completed as far as possible and is displayed.
 An illegal key-press is flashed and ignored. Thus, in
 writing the following statements only the underlined keys need be pressed:

     scan down with i from n to 2
     endscan

     exchange xi+1 with xj+1
 
 5.3.3 Program Editor 
 
 Each procedure constitutes a "display page," and these may be
 selected by typing in the name of the procedure. A statement is inserted
 by first giving a line number to it and then writing the statement. An
 assertion is given as the exit assertion of a statement; the assertion is
 displayed at the end of the statement. Thus, the line labeled 16* in
 Figure 5.2 is the exit assertion of the if-statement at line 11. It is
 also the loop invariant of the while-loop at line 4. Any sequence of
 statements can be deleted and, if so desired, saved. A segment from among
 several such saved program segments may later be inserted into a
 procedure.
 
 Compound statements like the while-statement are written in two
 steps: first, the while-envelope with its corresponding endwhile and
 without a body is written. At a later time, the body is formed either
 as a sequence of new statements, or by inserting a saved program segment.
 Thus, a number of simple, but common, errors, like unmatched end-brackets
 of statements or unintentional nesting of bodies because of a missing
 begin, end, or semicolon, do not arise. Further, structural changes of
 a procedure do not require reparsing. Every structural change results
 in a new page displaying the updated version, with automatic indentation,
 of the procedure.

 A number of ideas incorporated into this editor are originally
 due to [Hansen 1971].
 
 5.3.4 Interpreter 
 
 The interpreter can execute any program written in the programming
 language. The assertions are also executed, and their truth
 value at run-time is indicated. It is possible to execute the program
 in various modes, including a step-by-step mode. During execution, the
 contents of the array being sorted are dynamically displayed along with
 the location of various indices (Figure 5.2). Only the currently active
 procedure is displayed; as each new procedure is entered, that procedure
 is displayed. An invocation trace is also displayed.
 
 The interpreter carefully checks for all possible violations
 of the assumptions made by the verification system. Each procedure is
 assumed to permute only the elements of the array segment between the two
 input parameters of the procedure. (1 is an "implicit" input parameter
 of procedure sort; this prevents it from becoming a recursive procedure,
 since each call statement must have two input (actual) parameters.) The
 values of all index variables should be between 0 and n + 1, where n is
 the size of the array; once an index variable has a value outside this
 range, it is not possible for that variable to have a legal value.
 
 5.3.5 Sorting Program Verifier 
 
 The student requests that his program be verified when he 
 has completed writing it. The verifier then proceeds to verify his 
 program provided all the required assertions (an invariant for each 
 loop; an entry, and an exit assertion for each procedure; an entry 
 assertion for each call statement) are given. The process of verification 
 is not interactive. The student is informed only of the outcome of the 
 verification. If his program is not proven correct, the lemmas which 
 were false are indicated. He may then request a counterexample, or 
 proceed directly to edit his program. 
 
 
 We emphasize that when a program is not proven correct, it 
 may be because strong enough assertions were not given. 
 
 5.3.6 Possible Extensions of SORTLAB 
 
 It seems possible to construct a "sorting expert" consisting
 of such components as a loop invariant generator, termination prover,
 efficiency analyzer, elegance judger, and algorithms expert. Systems
 similar in intent to these subcomponents have been designed in other
 contexts. Elspas [1973] describes how the efficiency of a program can be
 analyzed automatically, a by-product being termination. Considerable literature
 (see, e.g., [Wegbreit 1974]) has appeared on the automatic generation
 of loop invariants. Ruth [1974] discusses a system which attempts to
 give quality feedback to the student using built-in knowledge about
 specific sorting algorithms like the bubble sort algorithm. An elegance
 judger may be readily constructed if that elusive characteristic,
 "elegance," of a program is quantified in terms of measurable quantities
 like the length of the proofs of correctness, number of statements,
 variables, etc.

 The tutoring system SORTLAB would certainly be more attractive
 with such a sorting expert. The construction of this component seems
 doable, but is another project of the same magnitude as the verifier.
 
 
 6. DISCUSSION 
 
 Many verifiers have been constructed. Yet, none of them can
 be considered a tool usable by ordinary programmers. The number and
 variety of programs proven is small. Data structures more complex than
 linear arrays or lists are handled unnaturally. More significant is
 the poor performance of these verifiers in terms of the memory space
 and computation time needed.

 This failure in making significant advances toward constructing
 verifiers that are mechanical aids to program writing can be largely
 attributed to the very attitude taken in building several of the present
 day verifiers. They all seem to start with the presumption: Given an
 arbitrary program with assertions, prove it. Evidence is building up
 that practically usable verifiers cannot be constructed unless the problem
 domain is limited, programs are well-composed, abstract data structures
 and operations are used, and properties of programs and data
 structures are studied from a semantic viewpoint. Thus, we foresee not
 one ultimate program verifier but a class of limited domain program
 verifiers, each capable of proving/disproving a certain class of programs.
 
 Section 6.1 elaborates these points. Section 6.2 describes a 
 few of the significant verifiers and theorem provers built so far. 
 
 6.1 A Critique of Program Verifiers 
 
 McCarthy [1963] was one of the earliest to recognize the need
 to replace debugging of systems (computer programs, engineering systems,
 etc.) by proofs that systems meet their specifications. Considering
 programs as mathematical objects, he goes on to show how statements
 about programs may be proven. The theory developed by Floyd [1967] for
 iterative programs is comprehensive and equates the correctness of the
 program to the truth of a certain set of lemmas generated from it.

 King [1969] constructed a verifier which mechanized both lemma
 generation and proof. This clearly demonstrated the feasibility of an
 automatic program verifier and became the pilot system for a dozen or
 so systems to follow (see [London 1972]). Many of these verifiers are
 the result of unfortunate marriages between a lemma generator and a
 classic automatic theorem prover, and none can be considered to be
 significantly superior to King's verifier.
 
 6.1.1 Theorem Provers for Program Verifiers 
 
 Work on classic theorem proving always concerned itself with
 the general problem of syntactically deducing that a given statement of
 first-order logic follows from a set of axioms (see, e.g., [Chang and
 Lee 1974], and [Bledsoe 1975]). Pointing out some of the theoretical
 impediments to automatic theorem proving, Rabin [1974] comments that
 this work had such high hopes and aims as:

     . . . to develop a theorem prover which will enable
     them to solve mathematical problems, and hopefully
     even difficult mathematical problems, by the computer.
     If one wants to slide into the realm of
     science fiction then one may talk about proving or
     disproving Fermat's conjecture by an automated
     theorem proving program. . . .

 Since first-order logic is undecidable, one is looking only for efficient
 semi-decision procedures which will produce proofs of statements which
 are theorems and halt, and which may not halt on nontheorems. But, as
 Rabin makes plain, even in such theoretically decidable domains as
 Presburger Arithmetic (first-order sentences involving natural numbers
 and the operation of addition only), to computationally determine if a
 given sentence is true or false may be practically undecidable.
 
 If verification is ever to replace debugging, verifiers should 
 be able to handle incorrect programs. That is, we need theorem provers 
 which are decision procedures for the lemmas generated. Thus, the pro- 
 grams that a verifier attempts to prove or disprove should be so limited 
 that the lemmas generated belong to a decidable domain. This can be 
 done only by carefully designing a language for assertions expressive 
 enough to allow all "legitimate" assertions one might want to make in 
 proving properties of programs from an interesting class of programs. 
 The theorem prover should then be a decision procedure for all sentences 
 in the assertion language. 
 
 Since even decision procedures may take impractically long to
 decide if a sentence is true or false, they should be so engineered that
 for a large subset of the lemmas that can be considered to be "naturally
 occurring" in well-designed programs such decisions are made rapidly.
 Thus, we may not mind if it takes super-exponential time to decide if a
 verification condition of the following kind

     {π | i ← i | ω}

 is correct (because the programmer has the bad manners of misusing the
 verifier to prove an irrelevant mathematical theorem that π implies ω)
 so long as the verifier gives correctness proofs of legitimate programs
 quickly.
 
 Furthermore, the lemmas generated in proving well-designed,
 legitimate programs are not typical of manual mathematics. These lemmas 
 are shallow and follow fairly directly from (properly chosen) axioms and 
 inference rules. Clearly, it is impractical to include all lemmas to 
 be proven as the set of inference rules; a small number of inference 
 rules should be carefully tailored so that short proofs of naturally 
 occurring lemmas can be given rapidly. Two examples of theorem provers 
 so designed are [King and Floyd 1972] and the theorem prover described 
 in Chapter 3 of this thesis. 
 
 6.1.2 Effect of Program Composition 
 
 The structure and statements of a program clearly will have an 
 effect on its verification. Writing abstract programs using abstract 
 data structures has been advocated by such authors as Dijkstra and Hoare. 
 The solution to a programming problem is constructed using operations on 
 data structures that are natural to the problem. These operations and 
 data structures will then be written at a lower level of abstraction, and 
 so on, until all operations and abstract data structures are implemented 
 in the host programming language. The advantages of such an approach 
 lie in the factorization of detail at any given level of abstraction. 
 
 
 Such abstraction is helpful not only to the human designer of 
 the program, but also to the program verifier. When data structures are 
 manipulated solely through designated procedures, properties related to 
 data integrity can be proven by considering these procedures independent- 
 ly of their invocations using generator induction [Hoare 1972]. Thus, 
 for example, that a sorting algorithm has only permuted the given ordered 
 set of elements can be shown by proving that the primitive operations 
 exchange and insert were element-conserving. 
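
 The argument can be illustrated with a minimal Python sketch. It is not 
 the thesis's own code: the 0-based indexing, the helper names, and the 
 semantics given to insert (move one element to another position) are 
 assumptions made for the example. Each primitive is checked once to 
 conserve the multiset of elements; any program that touches the array 
 only through them then inherits the property. 

```python
from collections import Counter

def exchange(a, i, j):
    """Swap a[i] and a[j] in place (0-based indices)."""
    a[i], a[j] = a[j], a[i]

def insert(a, i, j):
    """Move a[i] to position j, shifting the elements in between by one."""
    a.insert(j, a.pop(i))

def conserves_elements(op, a, i, j):
    """Check that op leaves the multiset of elements of a unchanged."""
    before = Counter(a)
    op(a, i, j)
    return Counter(a) == before

def insertion_sort(a):
    """Sort using only insert, so element conservation is inherited."""
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[i]:
            j -= 1
        insert(a, i, j)
    return a
```

 The check on each primitive plays the role of the induction step; the 
 sorting routine itself never needs to be inspected for conservation. 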
 
 Another important advantage to be gained is that undecidable 
 domains of lemmas may be isolated in a program. Arithmetic operations 
 such as multiplication, division and addition which result in theore- 
 tically or practically undecidable domains can be grouped together and 
 their input/output relationships explicitly given. These relationships 
 may then be proven separately by ad hoc techniques. Often, such arith- 
 metic is not essential to the property of the program being proven. For 
 example, the division by 2 in binary search, and multiplication by 2 in 
 siftup of heap sort are not essential to the correctness proofs. The 
 only thing that matters for the correctness of the search is that the 
 interval of uncertainty be partitioned into two smaller subintervals. 
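
 A short Python sketch makes the point concrete. The function name 
 search and the three split functions are hypothetical, introduced only 
 for illustration. The search is parameterized by the way the interval 
 is split, and the only fact its correctness relies on is the asserted 
 one, that the split point lies inside the interval. 

```python
def search(a, key, split):
    """Return an index of key in the sorted list a, or -1 if absent.

    split(lo, hi) may return any m with lo <= m < hi.
    """
    lo, hi = 0, len(a)            # invariant: key can only occur in a[lo:hi]
    while lo < hi:
        m = split(lo, hi)
        assert lo <= m < hi       # the only property the argument needs
        if a[m] < key:
            lo = m + 1            # continue in a[m+1:hi], strictly smaller
        elif a[m] > key:
            hi = m                # continue in a[lo:m], strictly smaller
        else:
            return m
    return -1

def halve(lo, hi):
    return (lo + hi) // 2         # the usual division by 2

def third(lo, hi):
    return lo + (hi - lo) // 3    # an uneven split works equally well

def first(lo, hi):
    return lo                     # degenerates into a linear scan
```

 Only the running time depends on the choice of split; the result does 
 not. 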
 
 These operations on data structures are generally implemented 
 as procedures. Only selected components of a data structure are modi- 
 fied by the procedures, keeping the remaining environment of the procedure 
 intact. However, the rules of inference about procedure calls such as 
 those given in [Hoare 1973] or in [Elspas et al. 1973] deal only with 
 "entire variables" (a whole array, a whole stack, etc.) and are weaker 
 than they should be. That is, correct programs exist which cannot be 
 proven using such inference rules. A "predicate transformer" (a la 
 [Dijkstra 1976]) offers a solution to this problem. 
 
 The rule of procedure invocation of [Hoare 1973] can be roughly 
 described as follows: 

 Let Q be a procedure whose correctness with respect to 
 φ and ψ has been established independently, i.e., 

 {φ | Q | ψ} 

 Then to prove {α | call Q | β} verify the following: 

 α |= φ'   and   ψ' |= β 

 where φ' and ψ' are obtained from φ and ψ with appropriate 
 substitutions made for the formal parameters of Q. 

 Clearly, this rule is sufficient to prove {α | call Q | β}. But the exit 
 assertion ψ of Q cannot, in general, contain enough information to imply 
 β when Q is called under different input environments, all of them 
 satisfying φ. A number of properties guaranteed by α may be unchanged 
 by Q, and hence true upon exiting Q. What is needed is a meta-operator 
 which produces a β as the transformations made by Q on α when α implies 
 φ. Such an operator in the context of backward substitution is a 
 "predicate transformer," transforming the given exit assertion β of the 
 call Q into α', which is the weakest entry condition to call Q such that 
 β is true if and when call Q returns. 
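
 The idea can be sketched in Python under simplifying assumptions that 
 are not taken from the thesis: the procedure is deterministic and always 
 terminates, states are dictionaries, and predicates are Boolean functions 
 on states. The names wp_call and exchange_first_two are hypothetical. 
 For such a procedure the weakest entry condition is simply the exit 
 assertion evaluated on the state the procedure would produce. 

```python
def wp_call(proc, post):
    """Weakest precondition of `call proc` with respect to post.

    Valid for a deterministic, terminating proc modeled as a
    function from states to states.
    """
    return lambda state: post(proc(state))

def exchange_first_two(state):
    """Hypothetical procedure Q: swaps a[0] and a[1], touches nothing else."""
    a = list(state["a"])
    a[0], a[1] = a[1], a[0]
    return {**state, "a": a}

def beta(state):
    """Exit assertion of the call: an ordered pair, plus a fact about k,
    a variable that Q never modifies."""
    return state["a"][0] <= state["a"][1] and state["k"] == 7

# The entry condition obtained by transforming beta backward through Q.
alpha_prime = wp_call(exchange_first_two, beta)
```

 The conjunct about k survives the call without being mentioned in any 
 exit assertion of Q, which is exactly the information a fixed ψ cannot 
 supply for every calling environment. 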
 
 
 The verifier should be given a predicate transformer for each 
 procedure Q which may be invoked under varying circumstances. However, 
 if the procedure Q is not well-written (e.g., global variables were used 
 where local variables should have been used), the predicate transformer 
 will be an overspecification of Q. It should also be realized that 
 some procedures are called only in certain contexts. In such cases, 
 Hoare's rule is simpler to use. 
 
 6.1.3 Proving Certain Properties of Programs 
 
 It is not difficult to invent innocent-looking programs whose 
 correctness is very difficult to establish. Pure and deep mathematical 
 results may be used in the program and hence there may not be a "directly 
 perceivable" relation between what is being computed and the stated in- 
 tentions of the program. 
 
 For example, a depth-first search algorithm [Tarjan 1972] computes 
 certain simple functions NUMBER(·) and LOWPT(·) on vertices, and 
 deletes all edges from a stack until a certain condition on NUMBER(·) is 
 satisfied. This property is quite easy to prove. That this set of 
 edges constitutes a biconnected component of the graph, however, is a 
 difficult theorem. It is interesting to note that this and several 
 other graph algorithms use very simple arithmetic (successor function +1, 
 and < relation). Habermann [1975] gives another example of an 
 algorithm (a quadratic-hash algorithm) whose correctness proof does not 
 readily follow from the program structure itself. 
 
 "Existential" properties are also quite difficult to prove 
 using the inductive assertion approach. Consider, for example, an al- 
 gorithm enumerating all circuits of a graph. Its exit assertion is: 
 
 Every subgraph g (of the given graph G) that is output 
 is a circuit of G, and conversely, every circuit of 
 G is output. 
 
 As another example, consider a shortest path algorithm. The 
 exit assertion is: 
 
 The graph G has no path shorter than the one found 
 by the algorithm. 
 
 The path p found by the algorithm often appears explicitly in the 
 algorithm, while the set of all paths of G that p is being compared to 
 does not. 
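
 To see why, it helps to write the exit assertion out as an executable 
 check. The Python sketch below is an illustration only; the graph 
 representation (a dictionary of weighted adjacency dictionaries with 
 non-negative weights) and all function names are assumptions. The 
 check has to enumerate every simple path, an object that no efficient 
 shortest-path program ever constructs. 

```python
def simple_paths(graph, src, dst, seen=()):
    """Yield every simple path from src to dst as a list of vertices."""
    if src == dst:
        yield [src]
        return
    for nxt in graph.get(src, {}):
        if nxt not in seen and nxt != src:
            for rest in simple_paths(graph, nxt, dst, seen + (src,)):
                yield [src] + rest

def length(graph, path):
    """Sum of the edge weights along path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

def exit_assertion_holds(graph, src, dst, found):
    """No path of the graph from src to dst is shorter than `found`."""
    return all(length(graph, found) <= length(graph, p)
               for p in simple_paths(graph, src, dst))
```

 The quantification over all paths lives entirely in the assertion, so 
 intermediate assertions must relate the program's variables to a set 
 that the program itself never represents. 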
 
 6.2 Previous Work Related to This Thesis 
 
 In a survey, London [1972] reports that there are more than a 
 dozen verifiers constructed so far, most of these using the inductive 
 assertion method. None of these verifiers can, in general, handle incor- 
 rect programs. Only algorithms that were known to be correct a priori 
 have been mechanically verified with varying degrees of human interven- 
 tion in their proofs. 
 
 We briefly describe two of these verifiers, King's and SRI's, 
 which have influenced the verifier presented in this thesis. Other 
 significant verifiers include [Luckham et al. 1973], [Deutsch 1973], 
 [Boyer and Moore 1975], [Good et al. 1975] and [Marmier 1975]. Cooper 
 [1975] discusses independently some ideas similar to those expressed in 
 Chapter 3. 
 
 6.2.1 King's Verifier 
 
 King [1969] constructed a verifier which mechanized both the 
 generation of lemmas and their proof. A commendable engineering approach 
 was taken in tailoring the theorem prover. The programs, and hence the 
 lemmas, were limited to integer-valued variables, including linear 
 arrays. Several ad hoc techniques which depend on detailed knowledge 
 of integer expressions are used in proving a large class of lemmas 
 about integers. The premise and the negation of the conclusion of the 
 lemma to be proven are represented in a "normal" form, and the resulting 
 set of linear inequalities and nonlinear equations is algebraically 
 solved [King and Floyd 1972]. 
 
 Among the programs that King's verifier has proven, without 
 any human intervention, are: simple insertion sort, bubble sort, and 
 computing x^y using the binary representation of y. 
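
 For readers unfamiliar with the last of these programs, the following 
 Python sketch shows the kind of program and inductive assertion 
 involved. It is not King's program or his assertion language; the 
 variable names are hypothetical, and the assertion is merely checked at 
 run time here, whereas a verifier proves it for all inputs. 

```python
def power(x, y):
    """Compute x**y for an integer y >= 0 by scanning the binary digits of y."""
    result, base, rest = 1, x, y
    while rest > 0:
        # Inductive assertion attached to the loop:
        assert result * base ** rest == x ** y
        if rest % 2 == 1:         # current low-order binary digit of y is 1
            result *= base
        base *= base
        rest //= 2
    # Exit assertion.
    assert rest == 0 and result == x ** y
    return result
```

 The lemmas arising from such a program involve only integer arithmetic, 
 which is the domain King's theorem prover was tailored to. 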
 
 Subsequent verifiers ([Elspas et al. 1973], [Luckham et al. 
 1973], [Good et al. 1975], [Deutsch 1973]) have provided for interaction 
 with the user in an attempt to prove a much larger class of programs, 
 resulting in the proofs of such programs as Hoare's FIND. 
 
 6.2.2 SRI Verifier 
 
 The theorem prover [Elspas et al. 1973] is a collection of 
 inference rules together with a set of strategies. Given the premise 
 of a verification condition to be proven, determining whether it implies 
 the conclusion proceeds in a goal-driven manner. The theorem prover has 
 several high-level inference rules about arrays. Unfortunately, the 
 theorem prover is embedded in a disastrously general QA4 system [Rulifson 
 et al. 1972], and lacks a sense of direction. At any given point, several 
 inference rules are applicable, and the system applies each one in turn 
 until it succeeds in proving the goal or exhausts all inference rules 
 (when, of course, the lemma is false). However, it should be noted that 
 the application of an inference rule may generate further instances of 
 application for another rule, and vice versa, resulting in thrashing. 
 The user may be called upon to provide advice on such and other occasions, 
 which can then alter the course of deduction. 
 
 Both King's verifier and the SRI verifier handle arrays 
 unsatisfactorily, using the equivalent of the access and change functions 
 of McCarthy [1967], because array elements are considered to be of the 
 same type as their indices, and interassignments between them are allowed. 
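
 For reference, the access and change functions can be sketched in 
 Python as follows. The dictionary representation of an array and the 
 helper names exchange_term and access_after_change are assumptions made 
 for this illustration. Every read of a modified array forces a case 
 analysis on whether two indices are equal, and an exchange already 
 nests two change terms. 

```python
def access(a, i):
    """Value stored at index i of array a."""
    return a[i]

def change(a, i, v):
    """A new array equal to a except that index i now holds v."""
    b = dict(a)
    b[i] = v
    return b

def exchange_term(a, i, j):
    """`exchange a[i] with a[j]` written with access and change only."""
    return change(change(a, i, access(a, j)), j, access(a, i))

def access_after_change(a, i, v, k):
    """Right-hand side of the axiom for access(change(a, i, v), k)."""
    return v if k == i else access(a, k)
```

 A lemma about a program with several assignments to array elements 
 therefore expands into many index-equality cases when it is expressed 
 in these terms. 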
 
 Our own inference rules about arrays (see Chapter 3) may be 
 considered as refinements of the rules in the SRI verifier. 
 
 6.3 Salient Features of the Sorting Program Verifier 
 
 The verifier presented in this thesis has been designed to 
 meet specific performance requirements. It was to be usable in an 
 interactive computing system which imposed severe constraints on both 
 the amount of memory and computation time that can be used (see Section 
 5.1). This section briefly analyzes the factors that contributed to 
 the fast decision procedure, and notes some of its shortcomings. 
 
 6.3.1 Decidable 
 
 
 The verifier presented here is unique in that it is the only 
 verifier with a decision procedure for the verification conditions of 
 the programs it accepts. It makes no pretense of being general. 
 The syntax of the input programs has been carefully designed to reject 
 all programs that the verifier cannot prove or disprove. It provides 
 two basic operations, exchange and insert, to permute the elements of the 
 array, thereby guaranteeing that the elements of the array are conserved. 
 The assertion language is just powerful enough to express all the asser- 
 tions that may be made about sorting-type algorithms. The basic predi- 
 cates provided capture the notion of sequential access in sorting algorithms. 
 
 The decidability is due to such restriction of the lemmas 
 generated, and the partitionability of the sequentially accessed array 
 structure. This results in a canonical representation for each lemma to 
 be proven. The rule of local implication lets us decide if a given pre- 
 dicate is implied by the hypothesis without any search. At no time does 
 our theorem prover need to backtrack or consider various inference rules 
 for their applicability. 
 
 6.3.2 Fast 
 
 The theorem prover is not only a decision procedure, but gives 
 these decisions rapidly for most theorems encountered in proving sorting 
 algorithms. It should be noted that loop invariants of most algorithms 
 (not necessarily sorting) are conjunctions of predicates. This theorem 
 prover is specially suited to prove such theorems by natural deduction. 
 It might appear that a large number of linear orderings of boundaries 
 will be considered in the proof of a lemma; however, if the algorithm 
 is well-written this is generally not the case. Such lack of information 
 about how the boundaries are ordered is not typical of sorting algorithms. 
 
 Two factors contributing to the speed of the theorem prover 
 are the large inferences made about array segments, without considering 
 their individual elements, and the rule of local implication. 
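
 The flavor of an inference about whole segments can be conveyed by a 
 small Python sketch. It is an illustration under stated assumptions and 
 not the rule of local implication of Chapter 3: hypotheses are given as 
 inclusive index ranges known to be sorted, boundaries are concrete 
 integers rather than symbolic expressions, and the function names are 
 hypothetical. Whether sorted(lo, hi) follows is decided from the 
 boundaries alone, without looking at any array element. 

```python
def merge(segments):
    """Combine sorted segments that share at least one index.

    sorted(1,5) and sorted(5,9) together give sorted(1,9);
    sorted(1,5) and sorted(6,9) do not.
    """
    merged = []
    for l, h in sorted(s for s in segments if s[1] > s[0]):
        if merged and l <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], h)
        else:
            merged.append([l, h])
    return merged

def sorted_implied(hypotheses, lo, hi):
    """Does sorted(lo, hi) follow from the known sorted segments?"""
    if hi - lo < 1:               # fewer than two elements: trivially sorted
        return True
    return any(l <= lo and hi <= h for l, h in merge(hypotheses))
```

 Each decision costs a handful of comparisons between boundaries, 
 whatever the length of the segments involved. 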
 
 6.3.3 Backward Function Evaluation 
 
 The backward function evaluation, in the context provided by 
 the ptr expressions which constrain the boundaries of array segments, 
 considerably simplifies a given lemma. This completely eliminates the 
 need for such pseudo-functions as access and change of McCarthy, used in 
 nearly all other verifiers. It is important to realize that such con- 
 textual evaluation is valid only if assignments among array indices and 
 elements are not permitted. 
 
 6.3.4 Counterexample Generation 
 
 We consider the generation of counterexamples one of the most 
 important duties of a program verifier. If debugging is ever to be 
 replaced by verification, incorrect programs must be handled by verifiers 
 by either suggesting corrective actions, indicating the unproven 
 verification condition, or actually generating a counterexample for the 
 skeptic. 
 
 As shown in Chapter 3, a modified shortest-path algorithm is 
 the counterexample generator used by this verifier. 
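
 The generator itself is described in Chapter 3 and is not reproduced 
 here. As a loosely related sketch only, the Python fragment below shows 
 one standard way in which shortest-path relaxation yields concrete 
 integer values: a set of constraints of the form x - y <= c is solved 
 by Bellman-Ford relaxation, or found inconsistent. The constraint 
 format, the function name, and the suggestion that boundary variables 
 could be instantiated this way are assumptions, not the thesis's 
 algorithm. 

```python
def solve_difference_constraints(variables, constraints):
    """Solve constraints (x, y, c), each meaning x - y <= c.

    Returns a dict of integer values satisfying every constraint,
    or None if the constraints are inconsistent.
    """
    # A virtual source at distance 0 from every variable.
    dist = {v: 0 for v in variables}
    for _ in range(len(variables) + 1):
        changed = False
        for x, y, c in constraints:       # edge y -> x with weight c
            if dist[y] + c < dist[x]:
                dist[x] = dist[y] + c
                changed = True
        if not changed:
            return dist
    return None                           # still relaxing: negative cycle
```

 For instance, the ordering 1 <= i < j <= n can be encoded with an 
 auxiliary variable zero as (zero, i, -1), (i, j, -1) and (j, n, 0); 
 shifting the solution so that zero maps to 0 gives concrete boundaries. 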
 
 6.3.5 Some Shortcomings 
 
 It is interesting to note that the theorem prover is not 
 goal-oriented. Thus, in proving even a trivial theorem such as 

 sorted(1,n) |= sorted(1,n) 

 it considers two partitions (one for each of the cases n ≤ 0 and n > 0) 
 of the array. This is typical of decision procedures in that they may 
 ignore shortcuts. However, the strength of our decision procedure is in 
 its orientation toward naturally occurring theorems. 
 
 More seriously, it is hard to generalize the theorem prover. 
 For example, if we permit the predicate that all keys of an array seg- 
 ment are distinct, the theorem prover cannot be extended in a straight- 
 forward manner. 
 
 6.4 Conclusion 
 
 SORTLAB shows that verifiers for programs from a limited do- 
 main of application, which incorporate some of the semantics of the 
 domain, are practical. It would be interesting to see an approach similar 
 to that described in this thesis tried for another domain that is well- 
 understood and easily formalized mathematically. 
 
 We believe that such limited program verifiers will be the 
 trend of the future, in the wake of recent results in practical unde- 
 cidability and the lack of progress in mechanical program verification 
 in general. 
 

 REFERENCES 
 
 
 [Bledsoe 1975] 
 
 W. W. Bledsoe, "Non Resolution Theorem Proving," ATP-29, 
 Departments of Mathematics and Computer Sciences, Univer- 
 sity of Texas, Austin, Texas 78712, September 1975. 
 
 [Boyer and Moore 1975] 
 
 R. S. Boyer and J. S. Moore, "Proving Theorems about LISP 
 functions," Journal of ACM 22 (1975), 129-144. 
 
 [Chang and Lee 1973] 
 
 Chin-Liang Chang and Richard Char-Tung Lee, "Symbolic Logic 
 and Mechanical Theorem Proving," Academic Press, New York, 
 1973. 
 
 [Cooper 1975] 
 
 D. C. Cooper, "Proofs about Programs with One-Dimensional 
 Arrays," Unpublished manuscript, March 1975. 
 
 [Dahl et al. 1972] 
 
 O.-J. Dahl, E. W. Dijkstra, and C. A. R. Hoare, "Structured 
 Programming," Academic Press, New York, 1972. 
 
 [Danielson 1975] 

 Ronald L. Danielson, "PATTIE: An Automated Tutor for Top- 
 Down Programming," Ph.D. Thesis, University of Illinois, 
 Urbana, Illinois 61801, October 1975. 
 
 [Deutsch 1973] 
 
 L. Peter Deutsch, "An Interactive Program Verifier," Ph.D. 
 Thesis, University of California, Berkeley, California, 
 May 1973. 
 
 [Dijkstra 1976] 
 
 Edsger W. Dijkstra, "A Discipline of Programming," Prentice- 
 Hall, Englewood Cliffs, New Jersey, 1976. 
 
 [Eland 1975] 
 
 Dave R. Eland, "An Information and Advising System for an 
 Automated Introductory Computer Science Course," Ph.D. Thesis, 
 University of Illinois, Urbana, Illinois 61801, June 1975. 
 
 [Floyd 1964] 
 
 Robert W. Floyd, "Algorithm 245: Treesort 3," Communications 
 of ACM 7 (1964), 701-701. 
 
 [Floyd 1967] 
 
 Robert W. Floyd, "Assigning Meanings to Programs," Proceedings 
 of a Symposium on Applied Mathematics, American Mathematical 
 Society 19 (1967), 19-32. 
 
 [Elspas et al. 1973] 

 Bernard Elspas, Karl N. Levitt and Richard J. Waldinger, "An 
 Interactive System for the Verification of Computer Programs," 
 Stanford Research Institute, SRI Project 1891, Menlo Park, 
 CA 94025, September 1973. 
 
 [Good et al. 1975] 
 
 Donald I. Good, Ralph L. London and W. W. Bledsoe, "An Inter- 
 active Program Verification System," IEEE Transactions on 
 Software Engineering 1 (1975), 59-67. 
 
 [Habermann 1975] 
 
 A. N. Habermann, "The Correctness Proof of a Quadratic-Hash 
 Algorithm," Department of Computer Science, Carnegie-Mellon 
 University, Pittsburgh, PA 15213, March 1975. 
 
 [Hansen 1971] 
 
 Wilfred J. Hansen, "Creation of Hierarchic Text with A Computer 
 Display," ANL-7818, Argonne National Laboratory, June 1971. 
 
 [Hoare 1971a] 
 
 C. A. R. Hoare, "Proof of a Program: FIND," Communications of 
 ACM 14 (1971), 39-45. 
 
 [Hoare 1971b] 
 
 C. A. R. Hoare, "Procedures and Parameters: An Axiomatic 
 Approach," Proceedings of Symposium on Semantics of 
 Algorithmic Languages, Lecture Notes in Mathematics 188, 
 Springer Verlag, 1971. 
 
 [Hoare 1972] 
 
 C. A. R. Hoare, "Proof of Correctness of Data Representations," 
 Acta Informatica 1 (1972), 271-281. 
 
 [Luckham et al. 1973] 

 David C. Luckham, Friedrich W. von Henke, Shigeru Igarashi, 
 Ralph L. London and Norihisa Suzuki, "Automatic Program Verifica- 
 tion," STAN-CS-(73-365, 74-473, 74-475, 75-522), Stanford 
 University, Stanford, California, 1973. 
 
 [King 1969] 
 
 James C. King, "A Program Verifier," Ph.D. Thesis, Carnegie- 
 Mellon University, National Technical Information Service, 
 Springfield, Virginia 22151, #AD 699248, September 1969. 
 
 [King and Floyd 1972] 
 
 James C. King and Robert W. Floyd, "An Interpretation-Oriented 
 Theorem Prover over Integers," Journal of Computer and System 
 Sciences 6 (1972), 305-323. 
 
 [London 1970] 
 
 Ralph L. London, "Certification of Algorithm 245: Treesort 3," 
 Communications of ACM 13 (1970), 371-373. 
 
 [London 1972] 
 
 Ralph L. London, "The Current State of Proving Programs Correct," 
 Proceedings of Twenty-fifth Annual ACM Conference, 1972, 39-43. 
 
 [Manna and Pnueli 1974] 

 Zohar Manna and Amir Pnueli, "Axiomatic Approach to Total 
 Correctness of Programs," Acta Informatica 3 (1974), 243-263. 
 
 [Marmier 1975] 
 
 Edouard Marmier, "Automatic Verification of PASCAL Programs," 
 Ph.D. Thesis, Swiss Federal Institute of Technology, Zurich, 
 1975. 
 
 [McCarthy 1960] 
 
 John McCarthy, "Recursive Functions of Symbolic Expressions 
 and their Computation by Machine," Communications of ACM 3 
 (1960), 184-195. 
 
 [McCarthy 1963] 
 
 John McCarthy, "A Basis for a Mathematical Theory of Computa- 
 tion," in Computer Programming and Formal Systems, P. Braffort 
 and D. Hirschberg (editors), North-Holland 1963, 33-70. 
 
 [McCarthy and Painter 1967] 

 John McCarthy and J. A. Painter, "Correctness of a Compiler 
 for Arithmetic Expressions," Proceedings of a Symposium on 
 Applied Mathematics, American Mathematical Society 19 (1967), 
 33-41. 
 
 [Misra 1976] 
 
 Jayadev Misra, Private Communication, 1976. 
 
 [Naur 1966] 
 
 Peter Naur, "Proof of Algorithms by General Snapshots," BIT 
 6 (1966), 310-316. 
 
 [Nievergelt 1975] 
 
 Jurg Nievergelt, "Interactive Systems for Education - the new 
 look of CAI," Proceedings of IFIP World Conference on Computers 
 in Education, North Holland 1975, 465-471. 
 
 [Rabin 1974] 
 
 Michael O. Rabin, "Theoretical Impediments to Artificial 
 Intelligence," Information Processing, 1974, 615-619. 
 
 [Reingold 1973] 
 
 Edward M. Reingold, "A Nonrecursive List Moving Algorithm," 
 Communications of ACM 16 (1973), 305-307. 
 
 [Rulifson et al. 1972] 

 J. F. Rulifson, J. A. Derksen and R. J. Waldinger, "QA4: A 
 Procedural Calculus for Intuitive Reasoning," Final Report, 
 SRI Project 8721, Stanford Research Institute, Menlo Park, 
 California, 1972. 
 
 [Ruth 1974] 
 
 Gregory R. Ruth, "Intelligent Program Analysis," Massachusetts 
 Institute of Technology, Cambridge, Massachusetts, Manuscript, 
 February 1974. 
 
 [Sherwood 1975] 
 
 Bruce A. Sherwood, "The TUTOR language," Computer-Based Educa- 
 tion Research Laboratory and Department of Physics, University 
 of Illinois, 1975. 
 
 [Tarjan 1972] 
 
 Robert E. Tarjan, "Depth First Search and Linear Graph 
 Algorithms," SIAM Journal on Computing 1 (1972), 146-160. 
 
 [Wagner 1974] 
 
 Robert A. Wagner, "A Simple List Moving Algorithm," Vanderbilt 
 University, Nashville, Tennessee, Manuscript, 1974. 
 
 [Wegbreit 1974] 
 
 Ben Wegbreit, "The Synthesis of Loop Predicates," Communica- 
 tions of ACM 17 (1974), 102-112. 
 
 [Wilcox et al. 1976] 
 
 Thomas R. Wilcox, Alaine Davis and Michael Tindall, "The Design 
 and Implementation of a Table Driven, Interactive Diagnostic 
 Programming System," Communications of ACM 19 (1976), to appear. 
 
 APPENDIX 
 Performance of the Verifier - An Example 

 The following selection sort program has a weak assertion at 7*. 
 
  1   procedure sort(n) 
  2*    TRUE 
  3     scan up with i from 1 to n-1 
  4       scan up with j from i+1 to n 
  5         if xi > xj then 
  6           exchange xi with xj 
            else 
            endif 
  7*        S(1,I) ≤ XI ≤ A(I+1,J) & 1 ≤ I < J ≤ N 
          endscan 
  8*      S(1,I) ≤ A(I+1,N) & 1 ≤ I < N 
        endscan 
  9*    S(1,N) 
 10   endproc 
 
 The theorem prover disproves the corresponding verification condition, 

 (subst j-1 for j in 7*) and j ≤ n 
 stmts [4...7] |= (7*) 

 in 1114 CPU-milliseconds. When the assertion at 7* is given as: 

 S(1,I-1) ≤ A(I,N) & XI ≤ A(I+1,J) & 1 ≤ I < J ≤ N 

 the program is proven correct in 9346 CPU-milliseconds. 
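
 For comparison, the same program can be written in Python with the 
 strengthened assertion checked at run time. This is a sketch under 
 stated assumptions: S(l,h) is read as "the segment from l to h is 
 sorted", a comparison between segments as "every element on the left is 
 at most every element on the right", and the helper names are 
 hypothetical. A run-time check exercises the assertions on particular 
 inputs only, whereas the verifier establishes them for all inputs. 

```python
def is_sorted(a, lo, hi):
    """a[lo..hi] (inclusive) is in nondecreasing order."""
    return all(a[k] <= a[k + 1] for k in range(lo, hi))

def seg_le(a, l1, h1, l2, h2):
    """Every element of a[l1..h1] is <= every element of a[l2..h2]."""
    return all(a[p] <= a[q]
               for p in range(l1, h1 + 1)
               for q in range(l2, h2 + 1))

def selection_sort(x):
    a = [None] + list(x)              # 1-based indexing, as in the listing
    n = len(x)
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]
            # Strengthened assertion at the end of the inner loop body.
            assert is_sorted(a, 1, i - 1) and seg_le(a, 1, i - 1, i, n)
            assert seg_le(a, i, i, i + 1, j) and 1 <= i < j <= n
        # Assertion at the end of the outer loop body.
        assert is_sorted(a, 1, i) and seg_le(a, 1, i, i + 1, n)
    assert is_sorted(a, 1, n)         # exit assertion
    return a[1:]
```

 The first conjunct of the strengthened assertion records that the 
 sorted prefix is bounded by the whole unsorted remainder, which is the 
 fact needed to re-establish the outer assertion when the inner loop 
 ends. 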
 
 VITA 
 
 Prabhaker Mateti was born in Mahbubabad, Andhra Pradesh, 
 India, on June 18, 1948. He graduated from Osmania University with a 
 B. E. in Electrical Engineering in December 1969. He received his M. Tech. 
 in Electrical Engineering from the Indian Institute of Technology Kanpur 
 in May 1973. He has been a research assistant in the Department of 
 Computer Science from September 1972 to August 1975. He was an Instructor 
 at the University of Texas at Austin from September 1975 to August 1976. 
 
BIBLIOGRAPHIC DATA SHEET 

 1. Report No.: UIUCDCS-R-76-832 
 3. Recipient's Accession No.: 
 4. Title and Subtitle: An Automatic Verifier for a Class of Sorting Programs 
 5. Report Date: October 1976 
 6. 
 7. Author(s): Prabhaker Mateti 
 8. Performing Organization Rept. No.: UIUCDCS-R-76-832 
 9. Performing Organization Name and Address: 
    Department of Computer Science 
    University of Illinois at Urbana-Champaign 
    Urbana, Illinois 61801 
 10. Project/Task/Work Unit No.: 
 11. Contract/Grant No.: NSF EC 41511 
 12. Sponsoring Organization Name and Address: 
    National Science Foundation 
    Washington, D.C. 20550 
 13. Type of Report & Period Covered: Ph.D. Dissertation 
 14. 
 15. Supplementary Notes: 
 16. Abstracts: 

 A decision procedure for the verification conditions generated from a class of 
 in-place sorting algorithms is presented. Counter-examples to false verification 
 conditions can be generated. The special techniques developed seem applicable to 
 a wider class of programs that manipulate data structures. 

 A programming laboratory, called SORTLAB, for beginning students has been 
 implemented. SORTLAB consists of a program editor, recognizers for the 
 programming and assertion languages tailored for in-place sorting, the program 
 verifier and an interpreter. 

 17. Key Words and Document Analysis. 17a. Descriptors: 
    SORTLAB 
    Theorem Prover 
    Program Verifiers 
 17b. Identifiers/Open-Ended Terms: 
 17c. COSATI Field/Group: 
 18. Availability Statement: Release Unlimited 
 19. Security Class (This Report): UNCLASSIFIED 
 20. Security Class (This Page): UNCLASSIFIED 
 21. No. of Pages: 122 
 22. Price: 

 FORM NTIS-35 (10-70)                                USCOMM-DC 40329-P71 
 
