 
Report No. UIUCDCS-R-72-516

TRANSFORMATIONS ON LOOP-FREE PROGRAM SCHEMATA

by
Nurit Bracha

June 1972
 
 
 
Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was submitted in partial fulfillment of the requirements for the
degree of Doctor of Philosophy in Computer Science in the Graduate College
of the University of Illinois at Urbana-Champaign, June 1972.
 
TRANSFORMATIONS ON LOOP-FREE PROGRAM SCHEMATA 
 
 Nurit Bracha, Ph.D. 
 Department of Computer Science 
University of Illinois at Urbana-Champaign, 1972
 
 A formal theoretical approach toward code optimization is considered. 
 A program schema that models loop-free programs is presented and a complete 
 set of equivalence preserving transformations on loop-free programs is 
 found. A scheme for optimization is provided in which a sequence of these 
 transformations is applied to get an optimal code. These results are 
 extended to the model of loop-free programs which assumes that certain 
 types of algebraic laws hold among the operators, and also to the case in 
 which the tests are Boolean functions of elementary tests. 
 
 
 ACKNOWLEDGEMENT 
 
 The author wishes to express her appreciation and gratitude 
 to Professor David E. Muller for his guidance and supervision of this 
 thesis. The author would also like to express thanks to the Department 
 of Computer Science and the Computing Services Office, University of 
 Illinois, for their support of her graduate studies. 
 
 Finally the author wishes to thank her family for their 
 constant encouragement and moral support. 
 
 
TABLE OF CONTENTS

1. INTRODUCTION
2. THE PROGRAM SCHEMA
3. EQUIVALENCE OF PROGRAM SCHEMATA
4. TRANSFORMATIONS ON PROGRAMS
5. OPTIMIZATION
6. ALGEBRAIC TRANSFORMATIONS
7. LOGICAL TRANSFORMATIONS
8. OPTIMIZATION UNDER LOGICAL TRANSFORMATIONS
9. PROGRAM SCHEMATA WHICH ALWAYS HALT
REFERENCES
VITA
 
1. INTRODUCTION 
 
In many cases it is desirable to have a compiler of a programming
language produce an output code which is economical with respect to
some cost criterion such as program size or program speed. Optimizing
a given program is usually done by an application of transformations
to some intermediate language representation of the program. There are
many possible transformations that can be applied and compilers which
use transformations have been built (7, 2). Ideally, a theory of code
simplification should provide a machine independent mechanism for
reducing a given program to an equivalent program which is in a simplest
(in some sense) possible form.
 
The existence of algorithms for simplification is closely connected
with the question of the solvability of the equivalence of programs.
If the equivalence problem were solvable, then an algorithm for reducing
a program to a simplest form would exist in principle. It can be shown
that for almost any reasonable notion of equivalence between computer
programs, the question of equivalence of pairs of programs is not
partially decidable (8). There is no effective procedure for determining
whether or not two programs are equivalent. Therefore in general, we
cannot find a finite collection of equivalence preserving transformations
so that any pair of equivalent programs can be transformed one into the
other by applying a finite sequence of these transformations.
 
Despite the undecidability of the theory in general, positive
theoretical results can be obtained in many cases and efforts have been
made to isolate subclasses of programs for which the equivalence problem
is decidable (8, 9), and to find complete sets of equivalence preserving
transformations which can be applied to decidable subclasses of programs
(1).
 
Aho and Ullman (1) considered a type of program schema that models
straight line code. For this case they found a complete set of
equivalence preserving transformations and showed that these transformations
can be applied to get an optimal code. They extended their results to
cases in which certain types of algebraic laws are assumed.

The purpose of this thesis is to consider a program schema that models
loop-free programs and to extend Aho and Ullman's results to this case.
For this subclass of programs the equivalence problem is decidable (9).
 
In Chapter 2 the model of loop-free program schema is presented.
A program schema will represent a family of computer programs in the
sense that if the operator and test names are given a particular
interpretation the schema becomes a program that can be executed by an
idealized computer.
 
In Chapter 3 the notion of equivalence of loop-free programs is
defined and in Chapter 4 a set of equivalence preserving transformations
is presented. This set extends Aho and Ullman's set of transformations.
Their representation of programs by directed acyclic graphs is also used.
The set of transformations presented is shown to be complete, i.e. two
programs are equivalent if and only if they can be transformed one into
the other by a sequence of these transformations.
 
In Chapter 5 a scheme for optimization is provided in which a sequence
of the transformations is applied to get an optimal code.
 
Chapter 6 extends the results of the previous chapters to the model 
 of program schemata that assumes that a set of algebraic laws holds among 
 the operators. It is shown that in this case two programs are equivalent 
 if and only if they can be transformed one into the other by a set of 
 topological and algebraic transformations. It is also shown that certain 
 types of local optimization techniques can be considered as algebraic 
 identities that hold among operators, operands and constants. 
 
 In Chapters 7 and 8 the results are extended to the model of program 
 schemata in which the tests are Boolean functions of elementary tests. 
 It is proved that in this case two programs are equivalent if and only if 
 they can be transformed one into the other by a set of topological and 
 logical transformations. Chapter 8 provides a scheme for optimization 
 that uses topological and logical transformations. 
 
In Chapter 9 the subclass of programs which always halt is considered.
For this subclass the equivalence problem is decidable (8, 9) although
membership in this class is not (8). It is shown that the procedure for
optimizing loop-free programs might be used to optimize programs which
always halt. Certain transformations that are known to improve the code
of programs with loops are shown to be equivalent to sequences of
transformations on loop-free programs.
 
2. THE PROGRAM SCHEMA 
 
Let Σ be a countable alphabet of variable names,
    Θ a countable set of operator names,
    T a countable set of test names.

Statements are of three types:

(i) assignment statements

        k  A ← θ B_1 ... B_r

    A, B_1, ..., B_r ∈ Σ
    k is a numeral which is optional, and is the address of the statement.
    θ is an r-ary operator name, θ ∈ Θ.
    The variable A is assigned a new value, which depends on the current
    values of B_1, ..., B_r and on the unspecified operator θ.
    We say that A is defined by this statement and B_1, ..., B_r are
    referenced by this statement.

(ii) test statements

        k  t(C_1, ..., C_r)  k_1, k_2

    t ∈ T, C_1, ..., C_r ∈ Σ
    k, k_1, k_2 are numerals, k is optional, k_1, k_2 may be equal.
    Control goes to the statement with the prefix k_1 if the predicate
    t(C_1, ..., C_r) is true, otherwise to the statement with the
    prefix k_2.
    k_1 and k_2 are called transfer addresses. We say that C_1, ..., C_r
    are referenced by this statement.

(iii) the statement STOP.

A Program Schema Π is a triple (P, I, U) where P is a finite sequence of
statements and I, U are finite sets of variables, input and output
respectively.
 
A loop-free program schema is a program schema that does not have loops,
i.e. a transfer address of a test statement never references a statement
which precedes the test statement in some possible sequence of statements
of Π.

We will deal with loop-free program schemata, thus throughout
this thesis a program schema should mean a loop-free program schema.

No transfer address references a numeral which is not a prefix
of some statement, and no statement has the same prefix as another
statement in the program schema.
EXAMPLE 1

Π = (P, {X, Y}, {Z})

P:      L_1 ← φXY
        t(L_1) 3,5
    3   Z ← ψL_1
        STOP
    5   L_2 ← θL_1
        Z ← ψL_2
        STOP

φ, ψ, θ ∈ Θ;   X, Y, Z, L_1, L_2 ∈ Σ;   t ∈ T
 
 A program schema represents a family of computer programs. 
 To provide an interpretation for a program schema we choose some finite 
 or infinite set of values (domain), then make an assignment of values 
 from the set to each input variable and assignments of appropriate 
 functions and predicates on the set to the operator and test names of 
 the program schema. Given such an interpretation the schema becomes a 
 program which can be executed by a computer. 
 
An interpretation for the program schema Π of Example 1 could
be as follows:

The domain set is the set of real numbers.

The test name t(x) is interpreted as the predicate In(t)(x)
which is true if x > 0 and is false otherwise. The interpretation of
the operator names φ, ψ, θ is as follows:

    In(φXY) = +XY
    In(ψX) = SQRT X
    In(θX) = -X

The interpreted program is

In(P):  L_1 ← +XY
        >0(L_1) 3,5
    3   Z ← SQRT L_1
        STOP
    5   L_2 ← -L_1
        Z ← SQRT L_2
        STOP

Under this interpretation the program schema computes the square
root of the absolute value of X+Y.
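The step from schema to interpreted program can be illustrated with a small sketch. The tuple encoding of statements and the names `P`, `run`, `ops` and `tests` are assumptions made for this illustration only and are not notation from the thesis; the interpretation is the one given above (φ as addition, ψ as square root, θ as negation, t as "greater than zero").

```python
import math

# Example 1 as a list of (address, statement) pairs; address None means
# "no prefix". The encoding below is a hypothetical convention:
#   ("assign", target, operator, args), ("test", name, args, k1, k2), ("stop",)
P = [
    (None, ("assign", "L1", "phi", ["X", "Y"])),
    (None, ("test", "t", ["L1"], 3, 5)),
    (3, ("assign", "Z", "psi", ["L1"])),
    (None, ("stop",)),
    (5, ("assign", "L2", "theta", ["L1"])),
    (None, ("assign", "Z", "psi", ["L2"])),
    (None, ("stop",)),
]


def run(program, inputs, ops, tests, outputs):
    """Execute a loop-free schema under one interpretation."""
    # Map each address prefix to the position of its statement.
    addr = {a: i for i, (a, _) in enumerate(program) if a is not None}
    env = dict(inputs)
    pc = 0
    while True:
        stmt = program[pc][1]
        if stmt[0] == "stop":
            return {v: env[v] for v in outputs}
        if stmt[0] == "assign":
            _, target, op, args = stmt
            env[target] = ops[op](*(env[a] for a in args))
            pc += 1
        else:
            # Test statement: go to k1 if the predicate holds, else to k2.
            _, name, args, k1, k2 = stmt
            holds = tests[name](*(env[a] for a in args))
            pc = addr[k1] if holds else addr[k2]


# The interpretation described in the text.
ops = {"phi": lambda x, y: x + y, "psi": math.sqrt, "theta": lambda x: -x}
tests = {"t": lambda x: x > 0}

print(run(P, {"X": 3.0, "Y": 1.0}, ops, tests, ["Z"]))
print(run(P, {"X": -5.0, "Y": 1.0}, ops, tests, ["Z"]))
```

Supplying different dictionaries for `ops` and `tests` gives a different actual program for the same schema, which is the sense in which a schema represents a family of programs.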
 
Formally, an interpretation In of a program Π is a mapping
from the set of input variables, operator names and test names of Π
into a set D and the set of functions and predicates on D such that
(i) each variable A ∈ I is assigned an element In(A) ∈ D.
(ii) each r-ary operator name θ is assigned an r-ary function In(θ): D^r → D.
(iii) each test name t is assigned a predicate on D, In(t): D^r → {T, F},
where r is the number of arguments of t.
We shall call program schemata abstract programs or programs.

An interpretation is called an actual program or a computer program.

A block is a sequence of assignment statements S_1, ..., S_{n-1}
with a STOP or a test statement S_n, and either
(i) S_1 follows a test statement
or
(ii) S_1 follows a STOP statement
or
(iii) S_1 is the first statement in P.
 
P, the sequence of statements of the program schema, can be
represented by a directed graph. The graph representing P will be called
the graph corresponding to the program schema or simply, the graph of the
program schema.

The graph corresponding to the program Π of Example 1 is

[Figure: graph of Example 1. The root block L_1 ← φXY, t(L_1) branches
to the block Z ← ψL_1, STOP and to the block L_2 ← θL_1, Z ← ψL_2, STOP.]
 
A directed graph corresponding to a program schema has the
following properties:

(i) it is acyclic (i.e. it has no loops).

(ii) it has only one root.

(iii) each node can have one or two descendants, or it is a leaf.

(iv) each node can have several ancestors.

We will often represent a program schema Π = (P, I, U) by
(G, I, U) where G is the directed graph representing P.

Each path of the graph of Π corresponds to a possible sequence
of statements of the program Π.

A path will be called executable if the corresponding sequence
of statements can be executed under some interpretation of the program
schema.

A path will be called nonexecutable if under no interpretation
(i.e. assignments of input values, functions and predicates), the
corresponding sequence of statements can be executed.
EXAMPLE 2

Π = (G, {L_1, L_3}, {L_2})

[Figure: graph G of Example 2.]

The path corresponding to the sequence of statements

    L_2 ← FL_3
    t(L_2)
    L_3 ← FL_3
    t(L_3)
    L_2 ← BL_1
    STOP

in which t(L_2) is false and t(L_3) is true, is nonexecutable. Under no
assignment of values to the input variables L_1 and L_3 and functions to
F, B and a predicate to t can this sequence of statements be executed.
 
 In the following chapter we shall present a theorem which 
 characterizes executable and nonexecutable paths in terms of the test 
 statements appearing in the paths. 
 
3. EQUIVALENCE OF PROGRAM SCHEMATA

Two program schemata will be called equivalent iff for every
interpretation the output values of the two actual programs are equal.
Formally the notion of program equivalence will be defined in the
following way:

Let L be the finite set of all executable paths of the program
Π. For each path ℓ ∈ L there is a corresponding sequence of statements
S_1^ℓ, ..., S_m^ℓ.

If path ℓ is executed under some interpretation In, val_In(Π)
will be defined to be the vector of values of the output variables after
executing S_1^ℓ, ..., S_m^ℓ, and will be called the value of the program
schema under interpretation In.
 
DEFINITION

Two program schemata Π and Π' will be called equivalent
(Π ≡ Π') iff for all interpretations In, val_In(Π) = val_In(Π').
 
 Equivalence of program schemata will be characterized in terms 
 of the sets of terminal expressions computed along executable paths. 
 
We define the expression v_k^ℓ(A), the value of the variable A
after executing S_k^ℓ, as follows:

1) v_0^ℓ(A) = A for all A ∈ I.

2) If S_k^ℓ is a test or a STOP statement, v_k^ℓ(A) = v_{k-1}^ℓ(A).

3) If S_k^ℓ is A ← φB_1...B_r then v_k^ℓ(A) is the expression
   φ v_{k-1}^ℓ(B_1) ... v_{k-1}^ℓ(B_r),
   and for all C, C ≠ A, v_k^ℓ(C) = v_{k-1}^ℓ(C).

4) v_k^ℓ(A) is undefined otherwise.

The value of the executable path ℓ, denoted by v^ℓ(Π), is

    {v_m^ℓ(A), A ∈ U}
 
EXAMPLE 3

If Π is of Example 1, Π = (G, {X, Y}, {Z}),

[Figure: graph G of Example 1, rooted at L_1 ← φXY, with a STOP leaf on
each branch.]

and ℓ_1 and ℓ_2 are the left and right paths respectively, then

    v^{ℓ_1}(Π) = v_4^{ℓ_1}(Z) = ψφXY
    v^{ℓ_2}(Π) = v_5^{ℓ_2}(Z) = ψθφXY
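The rules defining the value of a variable along a path can be traced mechanically. The following is a minimal sketch, assuming a hypothetical tuple encoding of statements and single-character operator names written directly as Greek letters; `path_value` is an illustrative helper and not a procedure given in the thesis. Expressions are built as prefix (Polish) strings, as in the text.

```python
def path_value(path, inputs, outputs):
    """Return {A: terminal expression of A} for each output variable A."""
    # Rule 1: before any statement, each input variable has itself as value.
    v = {a: a for a in inputs}
    for stmt in path:
        if stmt[0] == "assign":
            # Rule 3: the new value is the operator followed by the current
            # values of the referenced variables; other variables are unchanged.
            _, target, op, args = stmt
            v[target] = op + "".join(v[b] for b in args)
        # Rule 2: test and STOP statements leave every value unchanged.
    # Rule 4: a variable never defined has no entry, so looking it up fails.
    return {a: v[a] for a in outputs}


# The two paths of Example 1 (branch outcomes are not needed here).
left = [
    ("assign", "L1", "φ", ["X", "Y"]),
    ("test", "t", ["L1"]),
    ("assign", "Z", "ψ", ["L1"]),
    ("stop",),
]
right = [
    ("assign", "L1", "φ", ["X", "Y"]),
    ("test", "t", ["L1"]),
    ("assign", "L2", "θ", ["L1"]),
    ("assign", "Z", "ψ", ["L2"]),
    ("stop",),
]

print(path_value(left, ["X", "Y"], ["Z"]))
print(path_value(right, ["X", "Y"], ["Z"]))
```

String concatenation is unambiguous here only because every input name is a single character; a more careful version would use nested tuples for expressions.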
 
A program Π is said to be proper if for every executable path ℓ,
whenever a variable B is referenced by the k-th statement of ℓ (S_k^ℓ),
then v_{k-1}^ℓ(B) is defined.

We will deal with proper programs only, thus throughout this
thesis, a program schema or a program should mean a proper loop-free
program schema.
DEFINITION

Let Π and Π' be two programs. Let ℓ and k be two executable
paths of Π and Π' respectively. ℓ and k are said to be consistent iff
for all t such that

(i) t is in the sequences of statements that correspond to both ℓ and k,
so that in path ℓ there is a statement S_i^ℓ of the form
t(C_1, ..., C_r) n_1, n_2 and in path k there is a statement S_j^k of
the form t(D_1, ..., D_r) m_1, m_2

and

(ii) v_i^ℓ(C_q) = v_j^k(D_q) for all q, 1 ≤ q ≤ r, that is the values of
the variables referenced by the statements are identical,

the statement prefixed by n_1 is included in path ℓ iff the statement
prefixed by m_1 is included in path k, and the statement prefixed by n_2
is included in path ℓ iff the statement prefixed by m_2 is included in
path k.
EXAMPLE 5

Π = (G_P, {A, B, C, D}, {C})        Π' = (G_P', {A, B, C, D}, {C})

[Figure: graphs G_P and G_P'. Both are built from the tests t_1(A) and
t_2(B) and the assignments R ← φ_1CD, R ← φ_2CD, C ← ψ_1RB, C ← ψ_2RB,
C ← θRB, C ← τRB, each branch ending in STOP; G_P' has the test on B at
its root. The paths of Π and Π' are marked.]

The marked paths are consistent.
 
The following theorem gives a simple characterization of executable
and nonexecutable paths in terms of the test statements occurring in
the paths.

THEOREM 1

A path ℓ of a program Π is executable iff there are no two
test statements in ℓ that reference variables that have the same values
and give different truth values.

PROOF:

1) If there are two test statements that reference variables
that have the same values and give different truth values, this path
cannot be executed under any interpretation, and therefore is nonexecutable.
Thus if the path is executable there are no tests with the above property.

2) Assume there are no two tests with the above property.
Then path ℓ will be executed under the following interpretation In:

The domain set of In is the set of strings of variables and
operator names.

If A is a variable, In(A) = A.

If φ is a function name and α_1, ..., α_n are strings, then
In(φ)(α_1, ..., α_n) is φα_1...α_n, the concatenation of φ with
α_1, ..., α_n.

If t is a test name in path ℓ,

   1) In(t)(α_1, ..., α_n) = T for all (α_1, ..., α_n) such that
      α_i = v_j^ℓ(X_i), 1 ≤ i ≤ n, and t(X_1, ..., X_n) = T for some
      appearance S_j^ℓ of t in path ℓ.

   2) In(t)(α_1, ..., α_n) = F for all (α_1, ..., α_n) such that
      α_i = v_j^ℓ(X_i), 1 ≤ i ≤ n, and t(X_1, ..., X_n) = F for some
      appearance S_j^ℓ of t in path ℓ.

   3) In(t)(α_1, ..., α_n) is arbitrary for all other n-tuples
      (α_1, ..., α_n).

Because no two tests in ℓ reference variables that have the same values
and give different truth values, the definition of the tests is unique,
and path ℓ is executed under the interpretation In. Therefore ℓ is
executable.
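The criterion of Theorem 1 is easy to check along a single path. The sketch below is an illustration under assumptions: statements use a hypothetical tuple encoding in which each test records the branch taken, and `is_executable` is a name introduced here. The path tested is the one of Example 2, with the statements read from the program listed in Example 10.

```python
def is_executable(path, inputs):
    """Apply the criterion of Theorem 1 to one path.

    Statements are ("assign", target, op, args), ("test", name, args, outcome)
    or ("stop",), where outcome is the truth value the path assumes.
    """
    v = {a: a for a in inputs}   # symbolic value of each variable
    seen = {}                    # (test name, argument values) -> outcome
    for stmt in path:
        if stmt[0] == "assign":
            _, target, op, args = stmt
            # Nested tuples keep the symbolic expressions unambiguous.
            v[target] = (op,) + tuple(v[b] for b in args)
        elif stmt[0] == "test":
            _, name, args, outcome = stmt
            key = (name, tuple(v[c] for c in args))
            # The same test on the same values must give the same outcome.
            if seen.setdefault(key, outcome) != outcome:
                return False
    return True


# The nonexecutable path of Example 2: t is false on FL3, then true on FL3.
bad = [
    ("assign", "L2", "F", ["L3"]),
    ("test", "t", ["L2"], False),
    ("assign", "L3", "F", ["L3"]),
    ("test", "t", ["L3"], True),
    ("assign", "L2", "B", ["L1"]),
    ("stop",),
]
# The same path with the second test also false is executable.
good = bad[:3] + [("test", "t", ["L3"], False),
                  ("assign", "L2", "C", ["L1"]), ("stop",)]

print(is_executable(bad, ["L1", "L3"]))
print(is_executable(good, ["L1", "L3"]))
```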
THEOREM 2

Π ≡ Π' iff for all consistent pairs of paths ℓ, k,
v^ℓ(Π) = v^k(Π').

Theorem 2 states that two programs are equivalent iff the sets
of expressions computed along consistent pairs of paths are identical.
Thus deciding if two programs are equivalent can be done by checking
that terminal expressions are identical for consistent paths.

Two programs Π and Π' always have consistent pairs of paths
ℓ, k. For each path in one program there is always a consistent path
in the other.

The proof is based on Luckham, Park and Paterson (8).
PROOF:

1) Let Π ≡ Π', and let ℓ and k be consistent. Π ≡ Π', therefore by
definition for all interpretations In, val_In(Π) = val_In(Π'). Take In
as the following interpretation:

The domain set of In and the interpretation of inputs and functions
are the same as in Theorem 1. The test names are interpreted in the
following way:

If t is a test name in path j = ℓ, k,

   1) In(t)(α_1, ..., α_n) = T for all (α_1, ..., α_n) such that
      α_i = v_m^j(X_i), 1 ≤ i ≤ n, and t(X_1, ..., X_n) = T for some
      appearance S_m^j of t in paths j = ℓ, k.

   2) In(t)(α_1, ..., α_n) = F for all (α_1, ..., α_n) such that
      α_i = v_m^j(X_i), 1 ≤ i ≤ n, and t(X_1, ..., X_n) = F for some
      appearance S_m^j of t in paths j = ℓ, k.

   3) In(t)(α_1, ..., α_n) is arbitrary for all other n-tuples
      (α_1, ..., α_n).

ℓ, k are consistent, therefore the definition of the tests is unique.
val_In(Π) was defined to be the vector of values of the output variables
after executing the program under interpretation In. Under the
interpretation In above, path ℓ in Π is executed and val_In(Π) = v^ℓ(Π).
Also under this interpretation path k in Π' is executed and
val_In(Π') = v^k(Π'). Therefore v^ℓ(Π) = v^k(Π').

2) Let v^ℓ(Π) = v^k(Π') for all consistent pairs of paths ℓ, k.
Let In be any interpretation. We have to show that val_In(Π) = val_In(Π').
Let ℓ, k be the paths executed under the interpretation In in Π and Π'
respectively. Then ℓ, k are consistent. Then v^ℓ(Π) = v^k(Π'). But
if v^ℓ(Π) = v^k(Π') then val_In(Π) = val_In(Π') because if the expressions
for the output variables are equal, then if we substitute the functions
and inputs we will get equal values. (We used here the fact that the
Polish notation representation of expressions is nonambiguous.)
 
EXAMPLE 6

a) Π and Π' of Example 5 are equivalent. The consistent pairs
of paths are ℓ_1 and k_1, ℓ_2 and k_2, ℓ_3 and k_3, ℓ_4 and k_4.

    v^{ℓ_1}(Π) = ψ_1φ_1CDB        v^{k_1}(Π') = ψ_1φ_1CDB
    v^{ℓ_2}(Π) = ψ_2φ_1CDB        v^{k_2}(Π') = ψ_2φ_1CDB
    v^{ℓ_3}(Π) = θφ_2CDB          v^{k_3}(Π') = θφ_2CDB
    v^{ℓ_4}(Π) = τφ_2CDB          v^{k_4}(Π') = τφ_2CDB

Therefore for all consistent pairs of paths the expressions
computed are identical. Thus Π ≡ Π'.
 
b) Π_1 = (G_P, {A, B}, {C})        Π_1' = (G_P', {A, B}, {C})

[Figure: graphs G_P and G_P', built from tests on A and the assignments
C ← ψAB and C ← φAB, each branch ending in STOP.]
 
 In this example the consistent pairs of paths are (< ,k ), 
 
 fig, k^), (l y k 2 ), (t k , 1^). 
 
 V n i- 
 
4. TRANSFORMATIONS ON PROGRAMS

Let Π be a program, and let ℓ be a path of the directed graph
with the corresponding sequence of statements S_1^ℓ, ..., S_n^ℓ.

Suppose S_i^ℓ defines the variable A.

1) If A ∉ U, and S_j^ℓ is the last statement to reference this instance
of A (i.e. for no k > j does S_k^ℓ reference A, unless A is defined by
some S_t^ℓ, j < t < k) then the scope of S_i^ℓ in path ℓ is the sequence
of statements S_{i+1}^ℓ, ..., S_j^ℓ.

2) If A ∈ U, and for no j > i does S_j^ℓ define A, then the scope of
S_i^ℓ in path ℓ is S_{i+1}^ℓ, ..., S_n^ℓ and the set U.

3) If no S_j^ℓ, j > i, references this instance of A, and A is not an
output variable, then the scope of S_i^ℓ is null, and S_i^ℓ is said to be
useless in path ℓ.
 
Any program can be represented by a set of labeled directed
acyclic graphs (dags). For each executable path ℓ of the program we
construct a dag D_ℓ(Π) as follows:

i) For all A, A ∈ I, we create a leaf labeled by A.

ii) If S_1^ℓ, ..., S_n^ℓ is the corresponding sequence of statements, for
each j = 1, ..., n we check if S_j^ℓ is an assignment statement. If yes,
we create a node associated with it. If S_j^ℓ is of the form

        A ← θB_1...B_r

and n_1, ..., n_r are the nodes associated with the most recent definitions
of B_1, ..., B_r respectively (or if some B_i ∈ I, n_i is the leaf with that
label), then the node associated with S_j^ℓ has a label θ and direct
descendants n_1, ..., n_r.

iii) We distinguish by circling those nodes associated with the last
definitions of the output variables or the nodes associated with
definitions of variables referenced by test statements.
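Steps i) to iii) can be followed literally in code. This is a sketch under assumptions: nodes are plain dictionaries, statements use a hypothetical tuple encoding, and `build_dag` is an illustrative name. One detail the text does not settle is whether an input leaf referenced directly by a test is circled; the sketch circles it.

```python
def build_dag(path, inputs, outputs):
    """Build the dag of one path as a list of nodes.

    Each node is {"label": ..., "children": [indices], "circled": bool}.
    """
    nodes = []
    latest = {}  # variable -> index of the node holding its current value
    # Step i: one leaf per input variable.
    for a in inputs:
        nodes.append({"label": a, "children": [], "circled": False})
        latest[a] = len(nodes) - 1
    for stmt in path:
        if stmt[0] == "assign":
            # Step ii: a node labeled by the operator, whose direct descendants
            # are the most recent definitions of the referenced variables.
            _, target, op, args = stmt
            children = [latest[b] for b in args]
            nodes.append({"label": op, "children": children, "circled": False})
            latest[target] = len(nodes) - 1
        elif stmt[0] == "test":
            # Step iii (first half): circle what the test references.
            for c in stmt[2]:
                nodes[latest[c]]["circled"] = True
    # Step iii (second half): circle the last definitions of the outputs.
    for a in outputs:
        nodes[latest[a]]["circled"] = True
    return nodes


# Left path of Example 1: L1 <- phi X Y; t(L1); Z <- psi L1; STOP.
left = [
    ("assign", "L1", "φ", ["X", "Y"]),
    ("test", "t", ["L1"]),
    ("assign", "Z", "ψ", ["L1"]),
    ("stop",),
]
for index, node in enumerate(build_dag(left, ["X", "Y"], ["Z"])):
    print(index, node)
```

Variable names are absent from the resulting nodes, which matches the later remark that renaming does not affect the dags.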
EXAMPLE 7

Π = (G, {B_1, B_2, D}, {M})

[Figure: graph G, whose root block begins with A ← φB_1B_2.]

D(Π), the set of dags of Π, is

[Figure: the dags D_ℓ(Π) and D_k(Π).]

ℓ and k are the left and right paths, respectively.
 
A transformation on a program is a mapping to the set of programs
which preserves program equivalence. We will define a set of
transformations that operate on loop-free programs.

T1 Removal of Useless Assignment Statements

If for all paths ℓ such that S_i is in the corresponding
sequence of statements of ℓ, S_i is useless in ℓ, then S_i can be removed
from the program. Also, if A ∈ I, A ∉ U, and A is not referenced in Π,
A can be removed from I.

If S_i is removed, all references to S_i (addresses of test
statements) are changed to reference S_{i+1}.

T1 operates on all the dags D_ℓ corresponding to paths ℓ
which include the useless statement S_i. T1 deletes a node which is not
distinguished and has no ancestors.
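Whether an assignment is useless in a given path follows directly from the definition of scope. The sketch below is an illustration with a hypothetical tuple encoding and an illustrative function name, `is_useless`; it follows the definition literally, so an assignment to an output variable is never reported as useless. T1 itself would additionally require the check to succeed on every path containing the statement.

```python
def is_useless(path, i, outputs):
    """Check whether the assignment path[i] is useless in this path.

    Statements are ("assign", target, op, args), ("test", name, args)
    or ("stop",).
    """
    _, a, _, _ = path[i]
    if a in outputs:
        # Clause 3 of the scope definition only covers non-output variables.
        return False
    for stmt in path[i + 1:]:
        if stmt[0] == "stop":
            continue
        # Variables referenced by an assignment or by a test.
        refs = stmt[3] if stmt[0] == "assign" else stmt[2]
        if a in refs:
            return False   # this instance of A is referenced
        if stmt[0] == "assign" and stmt[1] == a:
            return True    # A is redefined before any reference
    return True            # never referenced


# A <- phi X; B <- psi X; t(B); C <- theta B; STOP, with output C.
path = [
    ("assign", "A", "φ", ["X"]),
    ("assign", "B", "ψ", ["X"]),
    ("test", "t", ["B"]),
    ("assign", "C", "θ", ["B"]),
    ("stop",),
]
print(is_useless(path, 0, ["C"]))
print(is_useless(path, 1, ["C"]))
```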
 
 
T2 Removal of Redundant Assignment Statements

    S_i   A ← φD_1...D_r                S_i   C ← φD_1...D_r
    S_j   B ← φD_1...D_r       ⇒       S_j   deleted

If S_i^ℓ is A ← φD_1...D_r,
   S_j^ℓ is B ← φD_1...D_r,
S_i^ℓ and S_j^ℓ are in the sequence of statements that correspond to
path ℓ, and there is no path that includes S_j^ℓ and not S_i^ℓ, and also
for each path that includes both S_i^ℓ and S_j^ℓ, D_1, ..., D_r are not
defined by any S_k^ℓ, i < k < j, then S_i^ℓ is replaced by

        C ← φD_1...D_r,

S_j^ℓ is deleted, and all references to A and B in the scopes of S_i^ℓ
and S_j^ℓ in all the paths that include S_i^ℓ are changed to references
to C. All references to S_j^ℓ are replaced by references to S_{j+1}^ℓ.

T2 operates on all the dags D_ℓ(Π) corresponding to paths ℓ
which include both S_i^ℓ and S_j^ℓ. All the other dags remain unchanged.
T2 corresponds to the merging of two nodes with identical direct
descendants.
 
EXAMPLE 8

Π = (P, {K, X, Y}, {B, D})        Π' = (P', {K, X, Y}, {C, D})

[Figure: graphs of Π and Π'. The last assignment shown is D ← τDB in Π
and D ← τDC in Π'.]

(ℓ_1, k_1), (ℓ_2, k_2), (ℓ_3, k_3) are the consistent pairs of paths.
 
 l \ k l 
 
 v (n) = {exY, ^KcpxY] = v (it) 
 
 £ 2 k 2 
 
 v (II ) = {cpXY, "KpXY^KCpXY} = v (n ' ) 
 
 s s 
 
 v ^(11) = {CDXY, T^KCpXYCpXY] - v ^(lV) 
 
T2 transforms Π to Π'.

The sets of dags for Π and Π' are shown in the following figure.

[Figure: the dags D_{ℓ_1}(Π), D_{ℓ_2}(Π), D_{ℓ_3}(Π) and D_{k_1}(Π'),
D_{k_2}(Π'), D_{k_3}(Π').]
 
T3 Renaming

    S_i   A ← φB_1...B_r

We replace S_i by S_i' = C ← φB_1...B_r, and all references to A in the
scopes of S_i in all the paths that include S_i are replaced by
references to C.

Variable names do not appear in a dag, therefore the dags are not affected.
 
T4 Flipping of Assignment Statements

If
    S_i      (k)      A ← φB_1...B_r
    S_{i+1}  (k+1)    C ← ψD_1...D_q

    A ∉ {C, D_1, ..., D_q}
    C ∉ {B_1, ..., B_r}

and there is no path that includes S_{i+1} and not S_i, then S_i and
S_{i+1} may be interchanged. If numerals precede the statements, they are
interchanged accordingly. T4 does not affect the dags.
T5 Merging of Identical Assignment Statements

(a)

[Figure: a test t(C_1, ..., C_q) whose two branches both begin with
D ← φB_1...B_r is replaced by D ← φB_1...B_r followed by the test.]

If the statement t(C_1, ..., C_q) k_1, k_2 is in paths ℓ and ℓ', i.e.

    S_i^ℓ = S_i^{ℓ'} = t(C_1, ..., C_q) k_1, k_2

and

    S_{i+1}^ℓ = S_{i+1}^{ℓ'} = D ← φB_1...B_r

and there is no path that includes S_{i+1}^ℓ (S_{i+1}^{ℓ'}) and does not
include t(C_1, ..., C_q), then the statement D ← φB_1...B_r is moved
before t(C_1, ..., C_q) and S_{i+1}^ℓ, S_{i+1}^{ℓ'} are deleted. The
prefix of the statement D ← φB_1...B_r is that of t(C_1, ..., C_q). The
prefix of t(C_1, ..., C_q) is deleted, and k_1, k_2 in the test statement
are replaced by references to S_{i+2}^ℓ, S_{i+2}^{ℓ'}.

(b)

[Figure: two branches that each end with D ← φB_1...B_r before joining
are replaced by branches that join before a single D ← φB_1...B_r.]

If S_i^ℓ = S_i^{ℓ'} = D ← φB_1...B_r
and S_i^ℓ, S_i^{ℓ'} are the last assignment statements before ℓ and ℓ'
are merged, then S_i^ℓ and S_i^{ℓ'} can be merged. All references to
S_i^ℓ, S_i^{ℓ'} are replaced by references to the merged statement.
T5 does not affect the dags.
 
EXAMPLE 9

Π = (P, {A, B}, {L})              Π' = (P', {A, B}, {L})

P =     t_1(A) 2,5                P' =    t_1(A) 2,5
    2   N ← φAB                       2   N ← φAB
        L ← ψNN                           t(B) 7,7
        t(B) 7,7                      5   N ← θAB
    5   N ← θAB                       7   L ← ψNN
        L ← ψNN                           L ← τLL
    7   L ← τLL                           STOP
        STOP

T5 transforms Π to Π'.

[Figure: graphs of Π and Π'. In Π each branch of t_1(A) contains its own
copy of L ← ψNN before the branches join at L ← τLL; in Π' the branches
join before a single L ← ψNN.]
 
T6 Removal of Nonexecutable Paths

(a) If S_i^ℓ = t(A_1, ..., A_r) k_1, k_2
       S_j^ℓ = t(C_1, ..., C_r) k_3, k_4
       j > i
(*)  and there is no path ℓ that includes S_j and not S_i
(**) and for all k, 1 ≤ k ≤ r,

        v_i^ℓ(A_k) = v_j^ℓ(C_k)

then

(i) if S_j is on the right branch of S_i, S_j is changed to
    t(C_1, ..., C_r) k_4, k_4,
(ii) if S_j is on the left branch of S_i, S_j is changed to
    t(C_1, ..., C_r) k_3, k_3.

[Figure: the two cases of T6(a), showing t(A_1, ..., A_r) above
t(C_1, ..., C_r) before and after the change.]

(b) If S_i^ℓ = t(A_1, ..., A_r) k_1, k_2 and

(i) S_{k_1}^ℓ = t(C_1, ..., C_r) k_3, k_4
and (*) and (**) hold for S_i^ℓ and S_{k_1}^ℓ, then we change S_i^ℓ to
t(A_1, ..., A_r) k_3, k_2 and S_{k_1}^ℓ is deleted.

(ii) S_{k_2}^ℓ = t(C_1, ..., C_r) k_3, k_4
and (*) and (**) hold for S_i^ℓ and S_{k_2}^ℓ, we change S_i^ℓ to
t(A_1, ..., A_r) k_1, k_4 and S_{k_2}^ℓ is deleted.

[Figure: the two cases of T6(b), showing t(A_1, ..., A_r) directly
followed by t(C_1, ..., C_r) before the change and t(A_1, ..., A_r) alone
after it.]

Since T6 gets rid of tests, it uncircles nodes in the dags
that are associated only with variables referenced by the tests removed.
 
T7 Removal of Unconditional Useless Test Statements

If S_i is of the form t(C_1, ..., C_n) k_1, k_1 and k_1 = i+1, then S_i
is removed. All references to S_i are changed to references to S_{i+1}.

Since T7 removes tests, it uncircles nodes in the dags that
are associated only with variables referenced by the tests removed.
 
T8 Flipping of Tests

If S_i^ℓ = t_1(A_1, ..., A_r) k_1, k_2 and S_i is on both paths ℓ and
ℓ', S_i^ℓ = S_i^{ℓ'},
and S_{k_1} = t_2(B_1, ..., B_q) k_3, k_4
and S_{k_2} = t_2(B_1, ..., B_q) k_5, k_6
then t_1 and t_2 can be flipped so that
S_i^ℓ = S_i^{ℓ'} = t_2(B_1, ..., B_q) k_1, k_2.

[Figure: before the flip, t_1(A_1, ..., A_r) k_1, k_2 leads to
t_2(B_1, ..., B_q) k_3, k_4 at k_1 and to t_2(B_1, ..., B_q) k_5, k_6 at
k_2; after the flip, t_2(B_1, ..., B_q) k_1, k_2 is at the root.]

The sequences of statements for the same combinations of tests
are not changed, therefore the transformation does not affect the set
of dags.
 T9 Removal of Unreferenced Blocks 
 
 If the first statement of a block is not referenced (i.e. either 
 the numeral is not a transfer address of any of the statements of P, 
 or the first statement of the block is not prefixed by a numeral and the 
 block is not the first in the program) then the whole block is removed. 
 
 If the first statement of a block is not referenced, there is 
 no executable sequence of statements that contains the block, thus T9 
 does not affect the dags. 
 
An example of the use of T9 is removing a useless STOP statement.

If S_i is STOP
and
S_{i+1} is STOP
and there is no path that includes S_{i+1} and does not include S_i, then
T9 can be used to remove S_{i+1}.
 
EXAMPLE 10

Π is of Example 2. We would like to remove the nonexecutable path.

[Figure: graphs of Π and of the resulting program Π'.]

Π = (P, {L_1, L_3}, {L_2})

    P                   after T6            after T9            after T7 (Π')

      L_2 ← FL_3          L_2 ← FL_3          L_2 ← FL_3          L_2 ← FL_3
      t(L_2) 3,5          t(L_2) 3,5          t(L_2) 3,5          t(L_2) 3,5
    3 L_2 ← DL_1        3 L_2 ← DL_1        3 L_2 ← DL_1        3 L_2 ← DL_1
      STOP                STOP                STOP                STOP
    5 L_3 ← FL_3        5 L_3 ← FL_3        5 L_3 ← FL_3        5 L_3 ← FL_3
      t(L_3) 7,9          t(L_3) 9,9          t(L_3) 9,9
    7 L_2 ← BL_1        7 L_2 ← BL_1
      STOP                STOP
    9 L_2 ← CL_1        9 L_2 ← CL_1        9 L_2 ← CL_1        9 L_2 ← CL_1
      STOP                STOP                STOP                STOP
 
 T10 Flipping of Blocks

 Any two blocks (except the first) can be flipped. The graph of the program is not changed, and the dags are not affected.

 T11 Merging of Identical Subgraphs

 a) If $S_i = t(A_1,\ldots,A_r)\;k_1,k_2$ and $S_j = t(C_1,\ldots,C_r)$, and $\forall k$, $1 \le k \le r$, $v_i(A_k) = v_j(C_k)$, and $S_i$, $S_j$ are the roots of identical subgraphs $D_i$, $D_j$ respectively, then $D_i$ and $D_j$ are merged, $S_j$ is deleted, and all references to statements in $D_j$ are changed to references to statements in $D_i$.

 [Figure: T11(a) applied to the subgraphs rooted at the tests $t(C_1,\ldots,C_r)$ and $t(A_1,\ldots,A_r)$.]

 The executable paths are not affected by this transformation; therefore the dags are not affected.
 
 b) Also, if the subgraphs $D_i$ and $D_j$ are identical, and the sequences of assignment statements $S_1,\ldots,S_n$ and $S'_1,\ldots,S'_n$ preceding $D_i$ and $D_j$ respectively are identical, then a test statement is added to the program and the sequences of assignment statements together with the subgraphs are merged.
 
 
 A trivial case of T11(b) which will be used later is

 [Figure: the trivial case of T11(b).]

 Here $D_i$ and $D_j$ are empty, and the identical sequences of statements are merged.
 
 EXAMPLE 11

 Let $\Pi$ and $\Pi'$ be the programs of Example 6(b). We would like to show that $\Pi$ can be transformed to $\Pi'$ by applying T5, T7, T8, T11.

 [Figure: the intermediate programs obtained from $\Pi$ by successive applications of these transformations, ending in $\Pi'$.]
 
 All the transformations presented above are equivalence preserving. This is clear from the definitions.

 We denote $\Pi \Rightarrow_{i} \Pi'$ if transformation T$i$ transforms $\Pi$ to $\Pi'$, $i = 1,2,3,\ldots,11$.

 We define $\Rightarrow^{R}_{i}$ to be the reverse of $\Rightarrow_{i}$.

 We say $\Pi \Rightarrow^{*}_{S} \Pi'$, where $S \subseteq \{1,2,\ldots,11\}$, if there is a sequence of programs $\Pi_1,\ldots,\Pi_m$ with $\Pi_1 = \Pi$, $\Pi_m = \Pi'$, and for all $i$, $\Pi_i \Rightarrow_{j} \Pi_{i+1}$ or $\Pi_i \Rightarrow^{R}_{j} \Pi_{i+1}$, $j \in S$.
 
 i+l 
 
 DEFINITION

 A set of transformations $J$ is defined to be complete iff

 $$\Pi \equiv \Pi' \implies \Pi \Rightarrow^{*}_{J} \Pi'.$$

 THEOREM 3

 If the graph corresponding to a program $\Pi$ has nonexecutable paths, then $\Pi$ can be transformed to an equivalent program $\Pi'$ in which each path is executable, by applying a sequence of transformations from the set $S = \{\mathrm{T6}, \mathrm{T11}\}$.
 PROOF:

 If path $\ell$ is nonexecutable, by Theorem 1 there must be at least two tests in path $\ell$ that check the same values and give different truth values.

 Let $S_i^{\ell}$ be $t(A_1,\ldots,A_r)$ and $S_j^{\ell}$ be $t(C_1,\ldots,C_r)$, $j > i$, with $v_i^{\ell}(A_k) = v_j^{\ell}(C_k)$, $1 \le k \le r$, and $t(A_1,\ldots,A_r) = \text{false}$, $t(C_1,\ldots,C_r) = \text{true}$.
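
The criterion used in this proof (two tests on one path that check the same values and give different truth values) can be sketched as a small check. The representation is an assumption made for illustration: a path is a list of triples `(test_name, checked_values, outcome)`, where `checked_values` stands for the values of the test's arguments at that point of the path.

```python
def is_executable(path):
    """Return False if two tests on the path check the same values
    but are required to give different truth values."""
    seen = {}
    for test_name, checked_values, outcome in path:
        key = (test_name, tuple(checked_values))
        if key in seen and seen[key] != outcome:
            return False   # same test, same values, contradictory outcomes
        seen[key] = outcome
    return True
```

For example, a path that takes the false branch of `t` on the input value `L2` and later the true branch of `t` on the same value is reported as nonexecutable.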
 
 The following cases are possible:

 (1) $\ell$ is the only path that includes $S_i^{\ell}$ and $S_j^{\ell}$. In this case conditions (*) and (**) of T6 hold, and T6 can be applied to get rid of the nonexecutable path.

 [Figure: the path $\ell$ through the tests $t(A_1,\ldots,A_r)$ and $t(C_1,\ldots,C_r)$.]

 (2) $S_j$ is on both branches of $S_i$, and there is no path that includes $S_j$ and not $S_i$. In this case we operate T11 in reverse and then T6 on both branches.

 [Figure: T11 applied in reverse, followed by T6.]

 (3) There is at least one path that includes $S_j$ and not $S_i$. In this case we again use T11 in reverse, and then apply T6 on the left subtree.

 [Figure: T11 applied in reverse.]
 
 We may conclude from Theorem 3 that any program that has no loops and contains nonexecutable paths can be transformed to an equivalent program such that each path of its graph is executable. The transformations T6 and T11 are applied as many times as necessary.
 
 Lemma 1

 If $\Pi \Rightarrow_{3} \Pi'$ then $\Pi \Rightarrow^{*}_{\{1,2\}} \Pi'$.

 PROOF:

 Assume $S_i = A \leftarrow \varphi B_1 \ldots B_r$. Then $\Pi \Rightarrow_{3} \Pi'$ means that in $\Pi'$, $S_i$ is replaced by $S_i' = C \leftarrow \varphi B_1 \ldots B_r$ and all references to $A$ in the scopes of $S_i$ in all the paths that include $S_i$ are replaced by references to $C$.

 Therefore $\Pi \Rightarrow^{R}_{2} \Pi_1 \Rightarrow_{1} \Pi'$, where in $\Pi_1$, $S_i$ is replaced by $S_i, S_i'$ with

 $$S_i = A \leftarrow \varphi B_1 \ldots B_r, \qquad S_i' = C \leftarrow \varphi B_1 \ldots B_r,$$

 and all references to $A$ in the scopes of $S_i$ in all the paths that include $S_i$ are replaced by references to $C$. $\Pi_1 \Rightarrow_{2} \Pi$, therefore $\Pi \Rightarrow^{R}_{2} \Pi_1$. $S_i$ is useless in all the paths that include $S_i$, therefore $\Pi_1 \Rightarrow_{1} \Pi'$.
 
 Lemma 2

 If $\Pi \Rightarrow_{4} \Pi'$ then $\Pi \Rightarrow^{*}_{\{1,2\}} \Pi'$.

 PROOF:

 $\Pi \Rightarrow \Pi_1 \Rightarrow \Pi'$, where in $\Pi_1$ we insert $S_{i+1}$ between $S_{i-1}$ and $S_i$.

 The two statements $S_{i+1}$ are redundant, therefore we can apply T2 and get $\Pi'$.
 
 Lemma 3

 If $\Pi \Rightarrow_{5} \Pi'$, then $\Pi \Rightarrow^{*}_{\{1,2\}} \Pi'$.

 PROOF:

 [Figure: $\Pi$ and the intermediate program $\Pi_1$.]

 We insert $S_i$ before $t$, which is useless in all the paths that contain it.

 Then we apply T2 twice and we get $\Pi'$.
 
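 Lemma 4

 If $\Pi \Rightarrow_{8} \Pi'$, then $\Pi \Rightarrow^{*}_{\{6,7\}} \Pi'$.

 PROOF:

 We operate T6 together with T7 four times.

 [Figure: the four applications of T6 and T7.]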
 
 We will define an enumeration of the paths of a graph corresponding to a program schema. Nonexecutable paths will not be enumerated. The definition of the enumeration will be stated recursively, as follows.

 Definition of enumeration

 1) Look at the root $v_i$; if it is a leaf, add the path terminated by $v_i$ to the enumeration.

 2) Otherwise
    enumerate the left subgraph,
    enumerate the right subgraph;
    nonexecutable paths are not enumerated.
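
The recursive definition above can be sketched in Python as follows. The `Node` class and the `executable` predicate are hypothetical helpers introduced only for this illustration; the predicate stands for whatever executability check is available, and by default every path is treated as executable.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    name: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def enumerate_paths(
    node: Node,
    prefix: Tuple[str, ...] = (),
    executable: Callable[[Tuple[str, ...]], bool] = lambda path: True,
) -> List[Tuple[str, ...]]:
    """List root-to-leaf paths, left subgraph before right subgraph,
    skipping paths the predicate rejects as nonexecutable."""
    path = prefix + (node.name,)
    if node.left is None and node.right is None:      # a leaf terminates a path
        return [path] if executable(path) else []
    result: List[Tuple[str, ...]] = []
    for child in (node.left, node.right):             # left first, then right
        if child is not None:
            result.extend(enumerate_paths(child, path, executable))
    return result
```

Because the left subgraph is always visited first, two programs whose graphs have the same shape get their paths listed in the same order, which is what the path correspondence used later relies on.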
 
 
 EXAMPLE 12 
 
 The enumeration of the paths (assuming all are executable) 
 will be as follows: 
 ( VV v 2 ) 
 
 (v ,v r v 5 ) 
 
 (v ,v v v 6 ,v 10 ,v 12 ,v 9 ) 
 
 Let $\Pi$ and $\Pi'$ be two programs in which each path is executable, with the enumerations $\ell_1,\ldots,\ell_n$ and $k_1,\ldots,k_n$ respectively. The enumeration induces a 1-1 correspondence between the paths of $\Pi$ and $\Pi'$:

 $$\ell_1 \leftrightarrow k_1,\;\ldots,\;\ell_n \leftrightarrow k_n.$$

 Lemma 5

 If $\Pi$ and $\Pi'$ are two programs in which corresponding paths have the same tests appearing in the same order, then corresponding paths are consistent.
 
 PROOF:

 We will first show that the trees $T$ and $T'$ of $\Pi$ and $\Pi'$ respectively are similar, i.e. they have the same structure. By Knuth (6), similarity can be proved by showing that there is a one-to-one correspondence between the nodes of the two trees which preserves the structure, so that if nodes $u_1$ and $u_2$ in $T$ correspond respectively to nodes $u_1'$ and $u_2'$ in $T'$, then $u_1$ is in the left subtree of $u_2$ iff $u_1'$ is in the left subtree of $u_2'$, and the same holds for right subtrees.

 Let $u_1,\ldots,u_n$ and $v_1,\ldots,v_{n'}$ be the enumerations of the nodes of $T$ and $T'$ respectively (with repetitions) induced by the enumerations of the paths. Since corresponding paths have the same tests, $n = n'$. We will show by induction that $u_{i+1}$ is the left descendant of $u_i$ iff $v_{i+1}$ is the left descendant of $v_i$, and $u_{i+1}$ is the right descendant of $u_i$ iff $v_{i+1}$ is the right descendant of $v_i$.

 The case $i = 1$ is trivial.

 Assume we proved for $1,2,\ldots,i$. Then the subtrees which include the nodes $u_1,\ldots,u_i$ and $v_1,\ldots,v_i$ are similar.

 Let $u_{i+1}$ be the first node for which the theorem has not been proven. There are two possible cases:

 1) $u_{i+1}$ is the left descendant of $u_i$. Since all corresponding paths have the same length, $v_{i+1}$ has to be on the same path as $v_i$, and since the left descendant is the next node to be enumerated, $v_{i+1}$ must be the left descendant of $v_i$.

 2) $u_{i+1}$ is the right descendant of $u_i$. Since corresponding paths have the same length, $v_i$ and $v_{i+1}$ are on the same path in $T'$, thus $v_{i+1}$ is a descendant of $v_i$. It cannot be the left descendant because we assumed that the subtrees which include $u_1,\ldots,u_i$ and $v_1,\ldots,v_i$ are similar. Therefore $v_{i+1}$ is the right descendant of $v_i$.

 Since the trees have the same structure and corresponding paths have the same tests appearing in the same order, it is clear that corresponding paths are consistent.
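
The notion of similarity used in this proof (same structure, ignoring node contents) can be sketched directly. The encoding is an assumption made for the example: a binary tree is either `None` (empty) or a pair `(left, right)`.

```python
def similar(a, b):
    """Two binary trees are similar when both are empty, or both are
    nonempty with similar left subtrees and similar right subtrees."""
    if a is None or b is None:
        return a is None and b is None
    return similar(a[0], b[0]) and similar(a[1], b[1])
```

This is a direct structural check; the proof above instead derives similarity from the path enumerations, which is the stronger statement needed for the lemma.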
 
 THEOREM 4

 Let $\Pi$ and $\Pi'$ be two programs in which each path is executable, with the enumerations $k_1,\ldots,k_n$ and $k'_1,\ldots,k'_m$ respectively. Then using the transformations T1, T5, T7, T8 and T11, $\Pi$ and $\Pi'$ can be transformed to equivalent programs $\Psi$ and $\Psi'$ respectively, with the enumerations $\ell_1,\ldots,\ell_q$ and $\ell'_1,\ldots,\ell'_q$, in which each path is executable, such that corresponding paths in $\Psi$ and $\Psi'$ are consistent and have the same tests.

 PROOF:

 We will assume that the graphs of $\Pi$ and $\Pi'$ are trees. This assumption does not cause any difficulty because T11 may be used in reverse as many times as necessary to transform the graphs to trees.

 We will construct two sequences of programs $\Psi_0,\Psi_1,\ldots,\Psi_q$ and $\Psi'_0,\Psi'_1,\ldots,\Psi'_q$ such that $\Psi_0 = \Pi$, $\Psi'_0 = \Pi'$, $\Psi_i \Rightarrow^{*}_{\{1,5,7,8\}} \Psi_{i+1}$, $\Psi'_i \Rightarrow^{*}_{\{1,5,7,8\}} \Psi'_{i+1}$, and for all $i$, $1 \le i \le q$, the first $i$ corresponding paths of $\Psi_i$ and $\Psi'_i$ are consistent. Then $\Psi_q$ and $\Psi'_q$ will be the programs $\Psi$ and $\Psi'$ of the theorem.
 
 
 Throughout this proof identical tests checking different 
 values are considered as different tests. 
 
 By Lemma 5, if corresponding paths have the same tests appear- 
 ing in the same order, then corresponding paths are consistent. So in 
 each stage of the process we apply transformations T1,T5,T7,T8 so that 
 corresponding paths will have the same tests appearing in the same order. 
 Useless unconditional tests will be eliminated by T7 before applying 
 the above transformations. 
 
 $\Psi_1$ and $\Psi'_1$ will be constructed as follows:

 First we eliminate all useless unconditional tests by applying T7. Then

 1) If the tests in $k_1$ and $k'_1$ are $t_1,\ldots,t_n$ and they appear in the same order in both paths, then $\Psi_1 = \Pi$ and $\Psi'_1 = \Pi'$.

 2) If the tests in both $k_1$ and $k'_1$ appear in the same order, but there is a test $t_j$ in $k_1$ that does not appear in $k'_1$, we use T1 reversed as many times as necessary to insert statements such that $v^{k_1}(A) = v^{k'_1}(A)$, and then we use the reverse of T7, T11 to insert $t_j(A)$ in $k'_1$. We get an equivalent program.
 
 [Figure: the test $t_j$ inserted into $\Pi'$ by the reverses of T1, T7 and T11.]

 We operate the reverse of T1, T7 and T11 as many times as necessary on both $\Pi$ and $\Pi'$ so that both paths will have the same tests.
 
 3) Assume the tests appear in a different order and let $t_i$ and $t_j$ be two adjacent tests appearing in a different order in $\Pi$ and $\Pi'$.

 [Figure: the two tests in $\Pi$ and in $\Pi'$.]

 We will use the reverse of T5 as many times as necessary to move the statements $S_1,\ldots,S_n$ after $t_j$ so that T8 can be operated.

 (a) If $t_i$ appears on the right branch of $t_j$ in $\Pi'$ right after $t_j$, we will operate T8 on $\Pi'$ to have $t_i$ and $t_j$ in the same order as in $\Pi$.

 [Figure: application of T8.]

 (b) Otherwise we will operate the reverse of T7 to insert $t_i$ on the right branch of $t_j$ (if $v_{n'}$ is a STOP statement, we will insert $t_i$ before $v_{n'}$), and then use the reverse of T5 as many times as necessary to move all the assignment statements after $t_i$. Now T8 can be applied.

 (c) If $t_i$ appears on the right branch of $t_j$ but there are test statements between $t_i$ and $t_j$, we will use T7 to insert $t_i$ on the right branch of $t_j$ and then use T8. Now we apply the same procedure as in case (a).

 The above process will be applied to each pair of $t_i$ and $t_j$ that appear in a different order in the two paths.

 $$\Pi \Rightarrow^{*}_{\{1,5,7,8\}} \Psi_1, \qquad \Pi' \Rightarrow^{*}_{\{1,5,7,8\}} \Psi'_1.$$
 
 4) The same as in 3) but $t_i$ and $t_j$ are not adjacent in $\Pi'$.

 [Figure: in $\Pi'$ the tests $t_1,\ldots,t_n$ separate $t_i$ and $t_j$.]

 In this case we will use T7 to insert $t_i$ on the right branch of $t_n$ and then apply T8. We might have to use T5 to move assignment statements on both branches of $t_n$ so that T8 can be applied. We will use the same procedure as in case 3).

 We will repeat this process for all $t_n,\ldots,t_1$.
 
 [Figure: illustration of case 4) for $\Pi$ and $\Pi'$.]

 We conclude from 1)-4) that $\Pi \Rightarrow^{*}_{\{1,5,7,8\}} \Psi_1$ and $\Pi' \Rightarrow^{*}_{\{1,5,7,8\}} \Psi'_1$.

 Assume that we have constructed $\Psi_1,\ldots,\Psi_i$ and $\Psi'_1,\ldots,\Psi'_i$ such that for the first $i$ paths of $\Psi_i$ and $\Psi'_i$ corresponding paths are consistent. We would like to construct $\Psi_{i+1}$ and $\Psi'_{i+1}$.

 Let the tests of $\ell_{i+1}$ and $\ell'_{i+1}$ be $t_1,\ldots,t_r$ and $t'_1,\ldots,t'_q$ respectively. The cases 1), 2) are as before. For 3), 4) assume first that $t_j$ is on the left branch of $t_i$ in $\Psi_i$;

 [Figure: $t_j$ on the left branch of $t_i$ in $\Psi_i$.]

 then the proof is as in 3) above.
 
 If $t_j$ is on the right branch of $t_i$

 [Figure: $t_j$ on the right branch of $t_i$.]

 we observe that this case is impossible because $\ell_i$ and $\ell'_i$ include the same tests appearing in the same order.

 4) is proved in the same way.

 Remark: We observe that in $\Psi$ and $\Psi'$ no path has useless unconditional tests.
 
 THEOREM 5

 Let $\Pi$ and $\Pi'$ be two programs with the enumerations $\ell_1,\ldots,\ell_n$ and $k_1,\ldots,k_n$ respectively, in which
 i) each path is executable,
 ii) corresponding paths are consistent and have the same tests,
 iii) all the statements of the programs appear on the graph (i.e. there are no unreferenced blocks),
 iv) no path has useless unconditional tests.
 Then

 $$\forall i,\; 1 \le i \le n,\quad D_{\ell_i}(\Pi) = D_{k_i}(\Pi') \quad\text{iff}\quad \Pi \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi'.$$

 $D_i(\Pi)$ denotes the dag $D_{\ell_i}(\Pi)$, $1 \le i \le n$, and $D_i(\Pi')$ denotes the dag $D_{k_i}(\Pi')$, $1 \le i \le n$.
 i 
 
 PROOF:

 1) The "if" part is simple and was discussed when we listed the transformations.

 2) We will show that if $\ell_1,\ldots,\ell_n$ are the paths of $\Pi$, $k_1,\ldots,k_n$ are the paths of $\Pi'$, and $\forall i$, $1 \le i \le n$, $D_i(\Pi) = D_i(\Pi')$, then $\Pi \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi'$.

 If the dags are identical then the input sets of $\Pi$ and $\Pi'$ are identical because the inputs are the leaves of the dags.

 Let $S_1^{\ell_i},\ldots,S_{r_i}^{\ell_i}$ be the sequence of assignment statements of path $\ell_i$ in program $\Pi$, and let $T_1^{k_i},\ldots,T_{r_i}^{k_i}$ be the sequence of assignment statements of path $k_i$ in program $\Pi'$. The two sequences have the same length because each statement corresponds to a node of $D_i$, and we assumed that the dags are identical.
 
 Renumber the assignment statements $S_1,\ldots,S_p$ of $\Pi$ by eliminating repetitions from the sequence $S_1^{\ell_1},\ldots,S_{r_1}^{\ell_1},\ldots,S_1^{\ell_n},\ldots,S_{r_n}^{\ell_n}$. We shall define inductively a sequence of programs $\Pi_0,\Pi_1,\ldots,\Pi_p$ such that $\Pi_0 = \Pi'$, $\Pi_p = \Pi$ and $\Pi_j \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi_{j+1}$. Assume inductively that 1) $\Pi' \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi_j$, 2) corresponding paths in $\Pi$ and $\Pi_j$ are consistent and have the same tests, 3) if $U_1^j, U_2^j,\ldots,U_p^j$ are the assignment statements of $\Pi_j$ numbered without repetitions using the same method as was used to number the statements $S_1,\ldots,S_p$ of $\Pi$, then

 $$U_1^j = S_1,\; U_2^j = S_2,\;\ldots,\; U_j^j = S_j.$$
 
 
 Clearly the assumptions 1)-3) are valid for $j = 0$.

 Let $\Pi_j$ be given and assume $S_{j+1}$ is $A \leftarrow \varphi A_1 \ldots A_k$ and lies on path $\ell_i$. Because $D_i(\Pi_j) = D_i(\Pi)$ there is an assignment statement $U_v^j = B \leftarrow \varphi A_1 \ldots A_k$ that has the same node in $D_i(\Pi_j)$ that $S_{j+1}$ has in $D_i(\Pi)$. We see that $v > j$ because $U_v^j$ is not in the sequence $U_1^j,\ldots,U_j^j$. None of the $A_1,\ldots,A_k$ is defined by $U_{j+1}^j,\ldots,U_{v-1}^j$ because otherwise $D_i(\Pi)$ would not be equal to $D_i(\Pi_j)$.
 
 Let $S_r$ be the last statement in path $\ell_i$ for which the corresponding statement in $\Pi_j$ appears in the same position. Since the enumeration is without repetitions, $S_r$ may be different from $S_j$. We will distinguish between two cases:

 (a) in $\Pi$, $S_{j+1}$ is the next statement after $S_r$;

 (b) $S_{j+1}$ is the next assignment statement but there are one or more test statements between $S_r$ and $S_{j+1}$.

 $S_r$ might be null in the initial step of $\Pi_0 = \Pi'$.
 
 (a) $S_{j+1}$ follows $S_r$ in $\Pi$.

 The following cases are possible for $\Pi_j$:

 1) There is no test between $U_r^j = S_r$ and $U_v^j$ in $\Pi_j$, and there is no path that includes $U_v^j$ but not $U_r^j$.

 We can apply T4 as many times as necessary to move $U_v^j$ right after $S_r$. T4 can be applied because we proved that none of $A_1,\ldots,A_k$ is defined by $U_{j+1}^j,\ldots,U_{v-1}^j$.
 
 By T3 we change $B$ to $A$. We get a program $\Pi_{j+1}$ with $\Pi_j \Rightarrow^{*}_{\{3,4\}} \Pi_{j+1}$.

 2) In $\Pi_j$ there is one or more tests between $S_r = U_r^j$ and $U_v^j$, but there is no path that includes $U_v^j$ but not $U_r^j$.

 (i) If $U_v^j$ is on the left path of the last test that separates $U_r^j$ and $U_v^j$ (path $\ell_i$), then it must also be on the right path of this test (path $\ell_{i'}$), because the corresponding path of $\ell_{i'}$ in $\Pi$ contains $S_{j+1}$ and the dags of $\Pi$ and $\Pi_j$ are equal. By applications of T4 and T5 several times we can move $U_v^j$ to be right after $U_r^j$, using the argument of 1). By T3 we change $B$ to $A$.

 (ii) If $U_v^j$ is on the right path of the last test that separates $U_r^j$ and $U_v^j$, then by the above argument there must be a statement $U_w^j = U_v^j$ on the left path of that test. But because the left path precedes the right path in the enumeration and $U_1^j = S_1,\ldots,U_j^j = S_j$, this case is impossible.
 
 3) There is a path that contains $U_v^j$ and does not contain $S_r$, and there is no test between $S_r$ and $U_v^j$.

 (i) If $S_r$ is on the path left of the merging point, we use T4 to move $U_v^j$ to the merging point and then use the reverse of T5, and again T4. Increasing the number of statements by using the reverse of T5 does not cause any difficulty, because the number of statements along all the paths is equal in $\Pi$ and $\Pi_j$.

 (ii) The case in which $S_r$ is on the path right of the merging point is impossible because we proved already for the left path $\ell_{i'}$; therefore $\Pi$ has the form

 [Figure: two paths, each containing a statement computing $\varphi B_1 \ldots B_r$.]

 and there must be another statement $U_w^j$ in $\Pi_j$ that gives rise to a node labeled by $\varphi$ with descendants $B_1,\ldots,B_r$. This statement must be before the merging point, otherwise we apply the same argument. We use the same procedure as in 1).
 
 (b) In $\Pi$ there are one or more test statements between $S_r$ and $S_{j+1}$.

 Assume there is one test $t$ between $S_r$ and $S_{j+1}$. Because we assumed that corresponding paths in $\Pi'$ and $\Pi_j$ are consistent and have the same tests, and also corresponding paths in $\Pi'$ and $\Pi$ are consistent and have the same tests, corresponding paths in $\Pi$ and $\Pi_j$ are also consistent and have the same tests. Therefore in $\Pi_j$ there must be a test statement $t$ checking the same values, and there is no test statement between $U_r^j = S_r$ and $t$.

 We use T5 reverse as many times as necessary, so that in $\Pi_j$, $t$ will appear right after $S_r$.

 Then we proceed with similar arguments to those in case (a). We use similar arguments when there is more than one test between $S_r$ and $S_{j+1}$.

 We might conclude from (a) and (b) that the desired program $\Pi_{j+1}$ is obtained. In each stage we might have to flip blocks by T10 so that the blocks will appear in the same order in $\Pi$ and $\Pi_{j+1}$.

 $$\Pi_j \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi_{j+1}, \quad\text{thus}\quad \Pi' \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi_{j+1},$$

 and because none of these transformations operate on tests, corresponding paths in $\Pi'$ and $\Pi_{j+1}$ are consistent and have the same tests.
 DEFINITION 
 
 A program is reduced if no executable path of the program has 
 useless or redundant statements and the program has no unreferenced 
 blocks. 
 
 The dags of a reduced program have the following properties:

 1) No dag in $D(\Pi)$ has roots which are not distinguished.

 2) No two nodes in a dag have identical direct descendants.

 THEOREM 6

 Any program $\Pi$ can be transformed to an equivalent reduced program $\Pi'$ using no larger set of transformations than $\{1,2,4,5,9,11\}$.
 
 The following example shows how the transformations are used 
 to reduce a given program. 
 EXAMPLE 13

 Let $\Pi$ be $\Pi = (G, \{B,C\}, \{L,N\})$.

 [Figure: graph of $\Pi$.]

 The statement $N \leftarrow \varphi AL$ is useless in the right path. To remove the statement we first use the reverse of T5.

 [Figure: the program after applying the reverse of T5.]

 Now T1 might be used to remove the useless statement.

 [Figure: the program after applying T1.]

 The obtained program is reduced.
 
 PROOF:

 Assume $S_i^k : A \leftarrow \varphi B_1 \ldots B_r$ is a useless statement in an executable path $k$ of the program $\Pi$.

 (a) If $S_i$ is useless in all paths containing it, then we operate T1 to remove $S_i$.

 (b) If there exists at least one path $\ell$ such that $S_i$ is not useless in $\ell$, let $S_j = t(C_1,\ldots,C_q)$ be the first test statement after $S_i$.

 i) Assume there is no path that includes $t(C_1,\ldots,C_q)$ and does not include $S_i^k$. $S_i^k$ is useless in path $k$, so $A$ is not referenced by any statement between $S_i^k$ and the test. Thus we might use T4 as many times as necessary so that $S_i$ appears immediately before the test statement. Now we use the reverse of T5. We get

 [Figure: the statement $A \leftarrow \varphi B_1 \ldots B_r$ moved onto the branches of the test.]

 ii) Assume there is a path that includes $t(C_1,\ldots,C_q)$ and does not include $S_i$.

 [Figure: a path entering $t(C_1,\ldots,C_q)$ without passing through $S_i$.]

 We operate T11 in reverse and proceed as in case i). We repeat the process of i) and ii) above as many times as necessary until $S_i$ is useless in all paths that include it and T1 can be applied.

 Assume $S_i^k : A \leftarrow \varphi D_1 \ldots D_r$ and $S_j^k : B \leftarrow \varphi D_1 \ldots D_r$ are redundant in path $k$. If there is no path that includes $S_j$ and does not include $S_i$, and also there is no path $\ell$ that includes $S_i$ and $S_j$ but in which they are not redundant, we can operate T2 and remove $S_j$.

 If there is a path that includes $S_j$ and does not include $S_i$, we operate T11 in reverse

 [Figure: T11 applied in reverse.]

 and now T2 can be applied.

 Assume there is a path $\ell$ that includes $S_i$ and $S_j$ and they are not redundant in that path;

 [Figure: the paths $k$ and $\ell$.]

 then we again operate T11 in reverse

 [Figure: T11 applied in reverse.]

 and now T2 can be applied to remove $S_j$ from path $k$.

 Unreferenced blocks can be removed by applying T9. So

 $$\Pi \Rightarrow^{*}_{\{1,2,4,5,9,11\}} \Pi'.$$
 
 Lemma 6

 Let $\Pi$ be a program in which each path is executable and let $E$ be a well-formed expression over $\Theta$ and $\Sigma$ such that $E$ is in $v^{\ell}(\Pi)$ for some path $\ell$, or $E = v^{\ell}(C_i)$ where $t(C_1,\ldots,C_n)$ is a test statement in path $\ell$, and $E \notin \Sigma$. Let $E_1$ be a well-formed expression, $E_1 \notin \Sigma$, such that $E_1$ is a subexpression of $E$. Then there is a statement $A \leftarrow \varphi A_1 \ldots A_k$ in path $\ell$ such that the value of $A$ computed at that time is $E_1$.

 PROOF:

 Because of the properties of well-formed expressions, there is a unique way to write an expression $E$ as $\varphi E_1 \ldots E_k$ where $E_1,\ldots,E_k$ are subexpressions. The proof follows from the definition of the value of a variable.
 Lemma 7

 Let $\Pi$ and $\Pi'$ be two equivalent, reduced programs in which all paths are executable and corresponding paths are consistent and have the same tests, and let $\ell$ and $k$ be two corresponding paths of $\Pi$ and $\Pi'$ respectively. Then the assignment statements of $\ell$ and $k$ are in one-to-one correspondence such that if $S_j^{\ell} : A \leftarrow \varphi A_1 \ldots A_r$ and $U_q^{k} : B \leftarrow \psi B_1 \ldots B_r$ correspond, then $v_j^{\ell}(A) = v_q^{k}(B)$.

 PROOF:

 Let $S_j^{\ell} : A \leftarrow \varphi A_1 \ldots A_r$ be an assignment statement of $\ell$ such that $v_j^{\ell}(A) = E$. $\Pi$ is reduced, therefore one of the following two conditions must hold.

 1) $E$ is a subexpression of an expression in $v^{\ell}(\Pi)$. In this case, since $\ell$ and $k$ are consistent and $\Pi \equiv \Pi'$, $v^{\ell}(\Pi) = v^{k}(\Pi')$. Therefore $E$ is also a subexpression of an expression in $v^{k}(\Pi')$.

 2) $E$ is a subexpression of an expression $v^{\ell}(C_i)$ such that $t^{\ell}(C_1,\ldots,C_n)$ is a test statement in path $\ell$. In this case, since $\ell$ and $k$ are consistent, corresponding, and have the same tests, there exists a test statement $t^{k}(D_1,\ldots,D_n)$ such that $v^{k}(D_i) = v^{\ell}(C_i)$. Thus $E$ is also a subexpression of $v^{k}(D_i)$.
 
 By Lemma 6 there is at least one $q$ such that $v_q^{k}(B) = E$. We will prove that $q$ is unique. Suppose there are $q_1$ and $q_2$ for some $j$ (if there is more than one such $j$ select the first one, and if for that $j$ there are more than two $q$'s, select the first two). Let

 $$U_{q_1}^{k} : B \leftarrow \psi_1 B_1 \ldots B_r, \qquad U_{q_2}^{k} : C \leftarrow \psi_2 C_1 \ldots C_r.$$

 $v_{q_1}^{k}(B) = v_{q_2}^{k}(C)$, thus $\psi_1 = \psi_2$ and $v_{q_1}^{k}(B_i) = v_{q_2}^{k}(C_i)$, $1 \le i \le r$. Because of the selection of $j$, $q_1$ and $q_2$, $B_i = C_i$, $1 \le i \le r$. Thus we can apply T2 to $\Pi'$, and this contradicts the assumption that $\Pi'$ is reduced.
 THEOREM 7

 Let $\Pi$ and $\Pi'$ be two reduced programs in which

 1) each path is executable,

 2) corresponding paths are consistent and have the same tests.
 Then

 $$\Pi \equiv \Pi' \quad\text{iff}\quad D_i(\Pi) = D_i(\Pi') \;\; \forall i,\; 1 \le i \le n,$$

 where $\ell_1,\ldots,\ell_n$ and $k_1,\ldots,k_n$ are the enumerations of $\Pi$ and $\Pi'$ respectively.

 PROOF:

 1) If for all $i$, $1 \le i \le n$, $D_i(\Pi) = D_i(\Pi')$, then by Theorem 5 $\Pi \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi'$, and because transformations preserve equivalence, $\Pi \equiv \Pi'$.

 2) If $\Pi \equiv \Pi'$ and there is at least one $i$ for which $D_i(\Pi) \ne D_i(\Pi')$, there is a node $N$ in $D_i(\Pi)$ which is not included in $D_i(\Pi')$, but all its descendants are present (the simplest case is where the descendants are the input variables, which are the same for both programs). This contradicts Lemma 7, because we showed that the assignment statements are in one-to-one correspondence $A \leftarrow \varphi A_1 \ldots A_r \leftrightarrow B \leftarrow \psi B_1 \ldots B_r$ such that $v_j^{\ell}(A) = v_q^{k}(B)$. Thus $\varphi = \psi$ and $v_j^{\ell}(A_q) = v^{k}(B_q)$, $1 \le q \le r$. Since each node of the dag is associated with an assignment statement, the contradiction follows.
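
To make the comparison of dags concrete, the following Python sketch builds, for one path, the set of expression nodes computed by its assignment statements and compares two paths. The encoding is an assumption of this sketch, not the thesis's definition of a dag: a statement $A \leftarrow \varphi B C$ is the triple `("A", "phi", ("B", "C"))`, input variables stand for themselves, a node is a nested tuple `(operator, operand values...)`, and only nodes reachable from the output variables are collected.

```python
def path_dag(statements, outputs):
    """Return (set of expression nodes, values of the outputs) for one path."""
    env = {}                                   # variable -> current symbolic value

    def value(var):
        return env.get(var, var)               # an unassigned variable is an input

    for target, op, args in statements:
        env[target] = (op,) + tuple(value(a) for a in args)

    nodes = set()

    def collect(expr):
        if isinstance(expr, tuple) and expr not in nodes:
            nodes.add(expr)
            for child in expr[1:]:             # expr[0] is the operator name
                collect(child)

    roots = tuple(value(out) for out in outputs)
    for root in roots:
        collect(root)
    return frozenset(nodes), roots
```

Two paths whose statements differ only in the names of intermediate variables, or in a repeated computation of a value already held, produce the same result under this encoding; changing an operator produces a different one.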
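 THEOREM 8

 Let $\Pi$ and $\Pi'$ be two loop-free programs. Then

 $$\Pi \equiv \Pi' \quad\text{iff}\quad \Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\}} \Pi'.$$

 PROOF:

 1) "If" is trivial because transformations preserve equivalence.

 2) By Theorem 3

 $$(*)\qquad \Pi \Rightarrow^{*}_{\{6,11\}} \Pi_1, \qquad \Pi' \Rightarrow^{*}_{\{6,11\}} \Pi'_1,$$

 where $\Pi_1$, $\Pi'_1$ are programs in which each path is executable, and $\Pi_1 \equiv \Pi \equiv \Pi' \equiv \Pi'_1$.

 By Theorem 4

 $$\Pi_1 \Rightarrow^{*}_{\{1,5,7,8,11\}} \Pi_2, \qquad \Pi'_1 \Rightarrow^{*}_{\{1,5,7,8,11\}} \Pi'_2,$$

 such that corresponding paths in $\Pi_2$ and $\Pi'_2$ are consistent and have the same tests and each path in $\Pi_2$ and $\Pi'_2$ is executable. By Lemmas 3 and 4

 $$(**)\qquad \Pi_1 \Rightarrow^{*}_{\{1,2,6,7,11\}} \Pi_2, \qquad \Pi'_1 \Rightarrow^{*}_{\{1,2,6,7,11\}} \Pi'_2.$$

 $\Pi_2 \equiv \Pi_1$ and $\Pi'_2 \equiv \Pi'_1$, therefore $\Pi_2 \equiv \Pi'_2$.

 We transform $\Pi_2$ and $\Pi'_2$ to equivalent reduced programs $\Pi_3$ and $\Pi'_3$, respectively. By Theorem 6

 $$\Pi_2 \Rightarrow^{*}_{\{1,2,4,5,9,11\}} \Pi_3, \qquad \Pi'_2 \Rightarrow^{*}_{\{1,2,4,5,9,11\}} \Pi'_3.$$

 By Lemmas 2, 3

 $$(***)\qquad \Pi_2 \Rightarrow^{*}_{\{1,2,9,11\}} \Pi_3, \qquad \Pi'_2 \Rightarrow^{*}_{\{1,2,9,11\}} \Pi'_3.$$

 $\Pi_3 \equiv \Pi_2$ and $\Pi'_3 \equiv \Pi'_2$, thus $\Pi_3 \equiv \Pi'_3$.

 By Theorem 7, since $\Pi_3$ and $\Pi'_3$ are reduced programs in which each path is executable and corresponding paths are consistent and have the same tests,

 $$\forall i,\; 1 \le i \le n,\quad D_i(\Pi_3) = D_i(\Pi'_3),$$

 where $\ell_1,\ldots,\ell_n$ and $k_1,\ldots,k_n$ are the enumerations of $\Pi_3$ and $\Pi'_3$ respectively.

 By Theorem 5, $\Pi_3 \Rightarrow^{*}_{\{3,4,5,10,11\}} \Pi'_3$.

 By Lemmas 1-3, $\Pi_3 \Rightarrow^{*}_{\{1,2,10,11\}} \Pi'_3$.

 By using (***), $\Pi_2 \Rightarrow^{*}_{\{1,2,9,10,11\}} \Pi'_2$.

 By using (**), $\Pi_1 \Rightarrow^{*}_{\{1,2,6,7,9,10,11\}} \Pi'_1$,

 and by using (*), $\Pi \Rightarrow^{*}_{\{1,2,6,7,9,10,11\}} \Pi'$.

 This shows that $\{\mathrm{T1},\mathrm{T2},\mathrm{T6},\mathrm{T7},\mathrm{T9},\mathrm{T10},\mathrm{T11}\}$ form a complete set of transformations.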
 
 DEFINITION

 If $J$ is a complete set of transformations, but no proper subset of $J$ is complete, then $J$ is called an irredundant set of transformations.

 THEOREM 9

 $J = \{\mathrm{T1},\mathrm{T2},\mathrm{T6},\mathrm{T7},\mathrm{T9},\mathrm{T10},\mathrm{T11}\}$ forms an irredundant set of transformations.
 PROOF:

 By Theorem 8 the set $J$ is complete. We will prove the theorem by showing that each of the transformations in $J$ has to be in the irredundant set.

 Let $S$ be the set $\{\mathrm{T}j\}_{j=1}^{11}$.

 a) For showing that T1 has to be in $J$ we will use Aho and Ullman's example (1).

 $\Pi_1 = (P, \{B,C\}, \{A\})$, $\Pi'_1 = (P', \{B,C\}, \{A\})$

     P:  A ← ψBC          P':  A ← φBC
         A ← φBC               STOP
         STOP

 $\Pi_1 \equiv \Pi'_1$, but there is no sequence of transformations from $S - \{\mathrm{T1}\}$ that can get rid of the first statement. Within $S - \{\mathrm{T1}\}$, the only transformation that can remove an assignment statement from an executable path is T2, and T2 eliminates redundant statements. Here the expressions $\psi BC$ and $\varphi BC$ are different and therefore the statements are not redundant. Adding new paths to the program by T7 and T11 will not help because there will still be at least one path that contains the useless statement.
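
For a straight-line program such as $P$ above, the effect of T1 can be sketched as a backward scan. The sketch assumes that a statement is useless when the variable it assigns is neither an output nor referenced before being reassigned; the thesis's formal definition is not restated here, and the triple encoding `(target, operator, operands)` and the function name are hypothetical.

```python
def remove_useless(statements, outputs):
    """Drop statements whose assigned value is never used (straight-line code only)."""
    live = set(outputs)            # variables whose current value is still needed
    kept = []
    for target, op, args in reversed(statements):
        if target in live:
            kept.append((target, op, args))
            live.discard(target)   # the value needed was produced here
            live.update(args)      # its operands are needed before this point
        # otherwise the statement is useless and is dropped
    kept.reverse()
    return kept
```

Applied to $P$ with output $A$, the scan keeps only `A ← φBC`, which is the program $P'$.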
 
 b) We will show that T2 is in $J$ by using again an example of Ullman and Aho (1).

 $\Pi_2 = (P, \{C,D\}, \{A,B\})$, $\Pi'_2 = (P', \{C,D\}, \{A,B\})$

     P:  A ← φCD          P':  A ← φCD
         B ← ψAA               B ← ψAA
         A ← φCD               STOP
         STOP

 $\Pi_2 \equiv \Pi'_2$, but no combination of transformations from $S - \{\mathrm{T2}\}$ can get rid of the third statement. That is because none of the statements is useless, and therefore T1 cannot be applied. Also, if we introduce new paths to the program by T7 and T11 there will still be at least one path that includes the redundant statements. Thus

 $$\Pi_2 \not\Rightarrow^{*}_{S - \{\mathrm{T2}\}} \Pi'_2.$$
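
The redundancy in this example can be detected by tracking the symbolic value held by each variable. The sketch below handles only the narrow case that occurs here, a statement that recomputes a value its target already holds; it is not a general implementation of T2, which also covers redundant statements with different targets. The triple encoding `(target, operator, operands)` is an assumption of the sketch.

```python
def remove_redundant(statements):
    """Drop a statement when its target already holds the value it would compute
    (straight-line code only)."""
    env = {}                       # variable -> symbolic value; inputs stand for themselves
    kept = []
    for target, op, args in statements:
        value = (op,) + tuple(env.get(a, a) for a in args)
        if env.get(target) == value:
            continue               # recomputation of a value already in the target
        env[target] = value
        kept.append((target, op, args))
    return kept
```

On $P$ the third statement is dropped and the result is $P'$; on the program of part a) nothing is dropped, since $\psi BC$ and $\varphi BC$ are different expressions.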
 
 c)

 [Figure: the programs $\Pi_6$ and $\Pi'_6$; $\Pi_6$ contains a nonexecutable path.]

 No combination of transformations of $S - \{\mathrm{T6}\}$ can change the conditional test $t(B)\,7,8$ in $\Pi_6$ to an unconditional test and thus eliminate the nonexecutable path. That is because none of these transformations operates on transfer addresses of tests and decides whether a conditional test can be made unconditional. Introducing new paths to the program by T7 and T11 will not help, since the above path will remain nonexecutable.
 
 d) Let $\Pi_7$ and $\Pi'_7$ be

     Π_7:    t(A) 7,7          Π'_7:  B ← φCD
           7 B ← φCD                  STOP
             STOP

 with $\Pi_7 \Rightarrow_{7} \Pi'_7$.

 We will show that no combination of transformations from $S - \{\mathrm{T7}\}$ can eliminate the useless test.

 If we look at the transformations that operate on tests (T6, T8, T11), then if we use T11 in reverse we get

 [Figure: T11 in reverse turns $t(A)\,7,7$ into $t(A)\,7,8$ with two copies, labeled 7 and 8, of the block $B \leftarrow \varphi CD$; STOP.]

 and $t(A)$ cannot be eliminated from the obtained program except by using T11 again and T7. Introducing other new paths can only be done by using T7 and T11, or by introducing nonexecutable paths by T6 in reverse. If we introduce a test by T6 in reverse

 [Figure: a new test $t(A)\,3,4$ introduced by T6 in reverse.]

 we can eliminate the original $t(A)$ by using T6 again, but we now have a new test that cannot be eliminated. T8 operates on tests but cannot eliminate useless tests.
 
 e) For the case of T9, removing of unreferenced blocks, we observe that T1-T5 manipulate assignment statements and do not operate on whole blocks, T6-T8 operate on tests, T10 flips blocks, and T11 merges identical subgraphs but cannot remove unreferenced blocks because they do not appear on the graph. Therefore no sequence of transformations from $S - \{\mathrm{T9}\}$ can get rid of unreferenced blocks.
 
 f)

     Π_10:    t(A) 1,2         Π'_10:    t(A) 1,2
            1 A ← φBC                  2 A ← ψBC
              STOP                       STOP
            2 A ← ψBC                  1 A ← φBC
              STOP                       STOP

 Here $\Pi_{10} \Rightarrow_{10} \Pi'_{10}$. No sequence of transformations from $S - \{\mathrm{T10}\}$ can transform $\Pi_{10}$ to $\Pi'_{10}$. That is because no sequence of these transformations can change the relative positions of different blocks in the program. T4 changes the relative order of assignment statements but not of blocks. Also T11 cannot merge these blocks because they are not equal. Introducing new tests will not help since the transfer addresses of different tests might be changed but the blocks themselves will not be removed.
 
 
 g)

 [Figure: $\Pi_{11}$ consists of the test $t(A)\,1,2$ followed by the blocks 1 STOP and 2 STOP; T11 transforms it into $\Pi'_{11}$, in which $t(A)$ is followed by a single STOP.]

 No sequence of transformations from $S - \{\mathrm{T11}\}$ can merge these subgraphs. That is because we have a STOP statement and not an assignment statement, and therefore T5 cannot be used. Introducing new paths can only be done by T11, so this program cannot expand.

 Theorem 9 showed that the complete set of transformations found by Theorem 8 is also irredundant.
 
 5. OPTIMIZATION
 
 We would like to provide a scheme for optimization in which a 
 sequence of the transformations is applied for getting an optimal code. 
 
 DEFINITION 
 
 We say that a cost function on programs is reasonable provided 
 that 

 (1) the cost of a program decreases if statements which are never 
 executed are deleted from the program in such a way that statements 
 are not added to the program; 

 (2) the cost of a program decreases if a statement is deleted 
 from some executable sequence of the program; 

 (3) the cost of a program decreases if identical subgraphs and 
 identical statements can be merged in such a way that test statements 
 are not added to the program. 
 
 If we consider a cost function which is some combination of 
 speed and size of programs, then (1) reduces the size of the program. 
 When the graph of the program is a tree, eliminating nonexecutable paths 
 always reduces the size of the program. When the graph is not a tree, 
 T11 reverse should be used in some cases to copy subgraphs so that 
 nonexecutable paths may be eliminated (see Theorem 2), although 
 sometimes the subgraphs can be merged again by another application of 
 T11 after the nonexecutable paths have been eliminated. (1) also includes 
 removing unreferenced blocks, which always reduces the size of the 
 program. (2) increases the speed of the program and decreases the size 
 in some cases, although in other cases the size might be increased. 
 The size of the program might increase when its graph is a tree, if T5 
 is used. When the graph is not a tree, T11 must 
 
 
 be used in some cases to copy subgraphs so that statements can be deleted 
 (see Theorem 6). As in (1), in some cases subgraphs can be merged again 
 after statements have been deleted. (3) reduces the size of the program. 
 
 We may conclude that a cost function combining speed and size is 
 reasonable if the size may be increased in order that the program 
 will run faster, but is otherwise decreased whenever possible. 
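
 As an illustration only, the following sketch shows one cost function 
 that combines size and speed. The representation (a program as a list of 
 executable paths, each a tuple of statement labels), the function names 
 and the weights are assumptions made for this example and are not part 
 of the model above. 

```python
# Hypothetical model: a loop-free program is a list of executable paths.
# Each path is a tuple of statement labels; a statement shared by several
# paths appears under the same label in each of them.

def program_size(paths):
    """Number of distinct statements in the program."""
    return len({stmt for path in paths for stmt in path})

def run_length(paths):
    """Number of statements executed, summed over all paths."""
    return sum(len(path) for path in paths)

def cost(paths, w_size=1.0, w_time=1.0):
    """Weighted combination of size and speed (the weights are assumptions)."""
    return w_size * program_size(paths) + w_time * run_length(paths)

before = [("s1", "s2", "s3"), ("s1", "s4")]
# Condition (2): delete statement s2 from the first executable sequence.
after = [("s1", "s3"), ("s1", "s4")]
```

 With positive weights, deleting a statement from an executable sequence 
 lowers this cost, as condition (2) requires. 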
 
 Cost criteria other than speed and size can be introduced, provided 
 they do not conflict with (1)-(3) above. For example, reducing the 
 frequency of storing and recovering the partial results in computing 
 arithmetic expressions is a reasonable cost criterion. 
 
 An optimal program for a given program Π will be a program 
 equivalent to Π whose cost cannot be decreased. 
 
 Every optimal program is reduced (by (1) and (2) above), and if 
 the graph of the program is a tree, each path of an optimal program is 
 executable (by (1)). 
 
 The following theorem gives a general procedure for optimizing 
 programs with no loops under reasonable cost functions, assuming the 
 cost function is independent of the costs of conditional tests in the 
 program. Reducing the number of conditional tests in the program can be 
 done by reordering the tests. Reordering the tests might also minimize 
 the average time through them when more information is given on the costs 
 of different tests and the probabilities of their outcomes. This type of 
 optimization involves using T6, T7, T11 to add new tests and nonexecutable 
 paths to the program and to eliminate other tests that become useless and 
 paths that become nonexecutable. There is some difficulty in finding an 
 efficient procedure for implementing this process using the transformations. 
 Also if different sequences of tests result in executing different sequences 
 
 
 of assignment statements in the program, reordering of tests will not 
 reduce the number of tests in the program. Therefore generally this type 
 of optimization does not improve the code generated. 
 
 Although the optimization procedure of Theorem 10 does not 
 minimize the number of tests in the program, it merges conditional tests 
 appearing in identical subgraphs and it eliminates conditional branches 
 to nonexecutable paths. 
 
 THEOREM 10 
 
 Assuming the cost function is independent of the costs of 
 conditional tests in the program, there is an algorithm that finds an 
 optimal program Π' equivalent to a given program Π. The algorithm operates 
 in a series of steps so that Π is first transformed to an equivalent 
 reduced program Π_c which is independent of the specific cost function 
 used. Using T8 repeatedly, Π_c is nondeterministically transformed to a 
 program Π''. Π' is obtained by operating {T1,T3,T4,T5,T7,T11} on Π''. 
 
 PROOF : 
 
 We first assume that all nonexecutable paths can be eliminated 
 from Π in such a way that statements are not added to the program. Thus 
 in every optimal program Π' each path is executable. Also every optimal 
 program is reduced. So take Π_1 to be a reduced program equivalent to Π 
 in which each path is executable and no two statements in a path define 
 the same variable. We can obtain Π_1 by first eliminating nonexecutable 
 paths by T6 and T11 (Th. 3), then reducing the obtained program by 
 operating T1, T2, T4, T5, T9, T11 (Th. 6), and then renaming variables by 
 T3 such that no two statements in a path of Π_1 define the same variable. 
 Thus 

      Π ⇒*_{1,2,3,4,5,6,9,11} Π_1. 
 
 
 Π_1 and Π' are reduced programs in which each path is executable and 
 Π_1 ≡ Π, Π' ≡ Π, therefore Π_1 ≡ Π'. Using Th. 4 we can find two 
 programs Ψ_1 and Ψ', 

      Π_1 ⇒*_{1,5,7,8,11} Ψ_1,   Π' ⇒*_{1,5,7,8,11} Ψ', 

 such that corresponding paths in Ψ_1 and Ψ' are consistent and have the 
 same tests. Π_1 ≡ Π', therefore Ψ_1 ≡ Ψ'. By looking at the algorithm of 
 Th. 4 we observe that if Π_1 and Π' are reduced programs, Ψ_1 and Ψ' are 
 also reduced. This is because the algorithm does not add useless 
 statements or unreferenced blocks, and statements can be added to the 
 programs in such a way that there are no redundant statements. By Th. 7, 
 for all i, D_i(Ψ_1) = D_i(Ψ'), and by Th. 5 

      Ψ_1 ⇒*_{3,4,5,10,11} Ψ'. 

 Since Π_1 ⇒*_{1,5,7,8,11} Ψ_1 ⇒*_{3,4,5,10,11} Ψ' and Ψ' is related to Π' 
 by {1,5,7,8,11}, we have Π_1 ⇒*_{1,3,4,5,7,8,10,11} Π'. 
 
 Because removing useless unconditional test statements might 
 expose pairs of tests that can be flipped by T8, we must operate T7 
 in the forward direction before operating T8. Starting with the program 
 Π_1 we apply T10 to flip blocks and place after each unconditional test 
 t(A_1,...,A_r) k,k, in which k references the first statement of a block, 
 the block referenced by k. This allows us to use T7 to remove all 
 unnecessary unconditional tests. Next we apply T1 to remove any 
 assignment statement which defines a variable that is referenced only by 
 the unconditional tests just removed. The resulting program is called Π_c. 
 
      Π_1 ⇒*_{1,7,10} Π_c   and   Π_c ⇒*_{1,3,4,5,7,8,10,11} Π'. 

 All the steps in going from Π_1 to Π_c were deterministic. Π_c 
 is independent of the given cost function. 
 
 Because flipping of tests might expose identical subgraphs and 
 also might make merging of identical statements possible, we will first 
 operate T8. We will operate T8 as many times as necessary so that 
 
 
 identical statements can be merged later, and also that identical 
 subgraphs can be merged in such a way that the cost of the program is 
 not increased. We obtain a program Π'' such that Π_c ⇒*_{8} Π''. 
 
 Because applying T8 does not change the number of statements 
 of the program, Π'' can be found in a finite number of steps, although 
 a heuristic procedure might help to find Π'' quickly. Also, different 
 cost functions will cause different Π'' to be found. So an algorithm 
 (or heuristic) based on the nature of the specific cost function might 
 be useful. 
 
 [Figure: a program in which the tests t(A) and t(B) precede the statement 
 D ← φAC; each path ends in STOP.] 
 
 The statement D ← φAC might be translated into the following machine 
 instructions: 

      a → acc 
      acc + c → acc 
      acc → d 

 where a, c, d are the addresses of A, C, D respectively and φAC is 
 interpreted as +AC. t(A) might be translated into: 

      a → acc 
      t(acc) 

 Thus if we flip the tests t(A) and t(B) we can save one of the 
 instructions a → acc. Therefore flipping of tests may improve the machine 
 code obtained for some given cost functions. 
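
 The saving can be checked with a small sketch. The one-accumulator 
 machine below, the tuple encoding of statements and the function name 
 count_loads are assumptions made for this illustration; the sketch only 
 counts the load instructions "x → acc" along a single path. 

```python
def count_loads(sequence):
    """Count 'x -> acc' loads for the statements along one path.

    Each item is ("test", var) or ("assign", target, left, right).
    Assumed machine: one accumulator; a load is skipped when the
    accumulator already holds the variable that is needed.
    """
    acc = None          # variable whose value the accumulator holds
    loads = 0
    for item in sequence:
        if item[0] == "test":
            var = item[1]
            if acc != var:
                loads += 1      # var -> acc
                acc = var
            # t(acc) leaves the accumulator unchanged
        else:
            _, target, left, _right = item
            if acc != left:
                loads += 1      # left -> acc
            # acc + right -> acc, then acc -> target
            acc = target
    return loads

original = [("test", "A"), ("test", "B"), ("assign", "D", "A", "C")]
flipped = [("test", "B"), ("test", "A"), ("assign", "D", "A", "C")]
```

 Under these assumptions the original order needs three loads and the 
 flipped order needs two, because A is still in the accumulator when 
 D ← φAC is reached. 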
 
 
 Because Π_c ⇒*_{1,3,4,5,7,8,10,11} Π' and we used T8 to obtain Π'', 
 it follows that Π'' ⇒*_{1,3,4,5,7,10,11} Π'. All applications of T8 can be 
 performed before applying {T1,T3,T4,T5,T7,T10,T11}. That is because 
 T3, T4, T10, T11 and T5 in the forward direction do not expose new tests 
 that can be flipped, and all operations of T7 and T1 have been performed 
 except those resulting from merging of paths. 
 
 It is easy to see that T5 and T11 can be operated independently 
 on Π'' in the sense that commuting the transformations will result in 
 programs having the same cost. That is because if the subgraphs are 
 identical and the two identical statements to be merged are included in 
 both subgraphs, the resulting program is the same whether we first merge 
 the identical statements and later merge the identical subgraphs, or 
 first merge the identical subgraphs and then merge the identical 
 statements. If only one statement of the two identical statements to be 
 merged is included in the identical subgraphs: 
 
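 [Figure: a program in which only one of the two identical statements 
 A ← φB_1...B_r lies inside the identical subgraphs.] 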
 
 if we first merge the identical statements and then the identical subgraphs 
 we get 
 
 
 [Figure: the program obtained by applying T5 and then T11, containing 
 the merged statement A ← φB_1...B_r.] 
 
 and if we first merge the subgraphs we get 
 
 [Figure: the program obtained by applying T11 only, containing the 
 statement A ← φB_1...B_r.] 
 
 which has the same cost as the program above. By applying T5 twice we 
 can get the same result as above. 
 
 [Figure: the program obtained after two applications of T5, containing 
 the statement A ← φB_1...B_r.] 
 
 Thus we first operate T5 on Π'' as many times as necessary to 
 merge identical statements (we may have to use T3 and T4 to be able to 
 do it) and then we use T11 to merge identical subgraphs. As was shown 
 
 
 above, further applications of T5 might be used to get other programs with 
 the same cost. T11 will be operated only in those cases in which 
 identical subgraphs can be merged in such a way that tests are not added 
 to the program. 
 
 Merging identical subgraphs might expose useless unconditional 
 tests, so T7 has to be applied again. Again, useless assignment 
 statements can be removed by T1. We get a program we shall call Π_2. We 
 have so far used T1, T3, T4, T5, T7 and T11 to transform Π'' into Π_2. Next 
 we have to use T10 and again T3 and T4 to obtain the optimal program Π'. 
 
 T10 (flipping of blocks) does not change the cost of the 
 program; it does not even change the graph of the program. So we might 
 conclude that for a program Π there might be several optimal programs, 
 all having the same cost and the same graph, and all equivalent by T10 
 only. Let Π' be one of them. Further applications of T10 to Π' may be 
 used to generate other optimal programs. 
 
 Π' is obtained from Π_2 by applying T3 and T4. 
 
 The following part of the proof will be based on the arguments 
 of Ullman and Aho (1) for the straight line code case, which apply also 
 here. 
 
 Since in Π_2 no variable is defined twice, the steps used in 
 going from Π_2 to Π' can be reordered so that all applications of T4 are 
 performed before applying T3. So the program Π_3 is a reordering of 
 independent statements of Π_2, Π_2 ⇒*_{4} Π_3, and Π' is a renaming of the 
 variables of Π_3. 
 
 The step in going from Π_2 to Π_3 is nondeterministic. There are 
 only finitely many possibilities for Π_3 because the number of statements 
 
 is not changed, but to avoid an exhaustive search for the optimal Π_3, 
 some algorithm (or heuristic) based on the nature of the specific 
 cost function can be used to get Π_3 more quickly. 
 
 Since renaming does not change the cost, Π' could be taken to 
 be equal to Π_3, and further applications of T3 may again be used to 
 generate other optimal programs. 
 
 In the first part of the proof of Theorem 10, we assumed that 
 nonexecutable paths can be eliminated so that statements are not added 
 
 to the program. Therefore in the optimal program each path is executable. 
 
 Assume that not all nonexecutable paths of the program can be 
 eliminated so that statements are not added to the program. Therefore 
 take Π_1 to be a reduced program equivalent to Π so that no two statements 
 in a path define the same variable: 

      Π ⇒*_{1,2,3,4,5,9,11} Π_1. 

 Π_1 ≡ Π, Π' ≡ Π, therefore Π_1 ≡ Π'. 
 
 Let Ψ and Ψ' be two programs equivalent to Π_1 and Π' respectively, 
 so that in Ψ and Ψ' each path is executable. By Theorem 3 

      Π_1 ⇒*_{6,11} Ψ,   Π' ⇒*_{6,11} Ψ'. 

 Let Ψ_1 and Ψ'_1 be two programs equivalent to Ψ and Ψ' respectively 
 so that corresponding paths in Ψ_1 and Ψ'_1 are consistent and have the 
 same tests. By Th. 4 

      Ψ ⇒*_{1,5,7,8,11} Ψ_1,   Ψ' ⇒*_{1,5,7,8,11} Ψ'_1. 

 Ψ_1 and Ψ'_1 are reduced, therefore by Th. 7 D_i(Ψ_1) = D_i(Ψ'_1), and by 
 Th. 5 

      Ψ_1 ⇒*_{3,4,5,10,11} Ψ'_1. 
 
 Starting with the program Π_1 we operate T6 and T11 in all the 
 cases in which the number of statements added is not greater than the number 
 
 
 of statements eliminated. The rest of the algorithm is the same as in 
 the first part of the proof of Theorem 10. We get a program Π_c, 

      Π_1 ⇒*_{1,6,7,10,11} Π_c. 

 This completes the proof of Theorem 10. 
 
 We presented a program schema that models loop-free programs. 
 We found a complete set of equivalence preserving transformations and 
 showed that this set is also irredundant. In this chapter we provided 
 a procedure for optimization using the set of transformations. The 
 results obtained will be extended to the case in which algebraic laws 
 are assumed, and to the case in which the tests are Boolean combinations 
 of elemental predicates. We will also show that the results obtained 
 hold for a broader class of programs: programs which always halt. 
 
 
 6. ALGEBRAIC TRANSFORMATIONS 
 
 Algebraic laws that are known to hold among operators can often 
 be used to expose common subexpressions in the program. In the follow- 
 ing example, the commutative and the associative laws of addition are 
 used to expose a common subexpression which can later be eliminated, so 
 that the cost of the program will be reduced. 
 
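 EXAMPLE 14 

 Π = (P, {A,B,C}, {F,G}) 

 [Figure: in Π the value of F is computed as +A+BC and then tested by 
 t(F); one branch ends with G ← -AC, STOP and the other with G ← +AC, 
 STOP. Applying the associative and commutative laws of addition rewrites 
 the computation of F as ++ACB, which exposes the common subexpression 
 +AC. T2 then yields Π', which begins G ← +AC, F ← +GB, t(F).] 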
 
 The common subexpressions above could not be exposed using the 
 topological transformations only. 
 
 In this chapter we use a model of program schemata in which 
 assignment statements are of the form 
 
      A ← φB_1...B_r 
 
 
 where A, B_1,...,B_r ∈ Σ and φ is an r-ary operator from a countable set 
 of operators Φ over the domain D. 
 
 To provide an actual program for a given program we assign values 
 from the domain to the input variables and make a proper assignment of 
 predicates to the test names of the schema. 
 
 The model assumes that a set of algebraic laws Γ holds among 
 operators from Φ. 
 
 DEFINITION 
 
 An algebraic identity is a pair (α, β) where α, β are well-formed 
 expressions over Φ and Σ. 

 Examples of algebraic identities are the associative law of 
 addition (+A+BC, ++ABC) and the distributive law (*A+BC, +*AB*AC). 

 Let E_1 and E_2 be two well-formed expressions, and let γ = (α, β) 
 be an algebraic identity. E_1 is transformed to E_2 by γ iff for each 
 variable x appearing in α or β there is a well-formed expression E_x such 
 that if α' and β' are α and β with E_x substituted for each instance of x, 
 then either E_1 = δ_1 α' δ_2 and E_2 = δ_1 β' δ_2, or E_1 = δ_1 β' δ_2 and 
 E_2 = δ_1 α' δ_2. 
 
 DEFINITION 
 
 Let E_1 and E_2 be two well-formed expressions and let Γ be a set of 
 algebraic laws. E_1 and E_2 are equivalent under Γ (E_1 ≡_Γ E_2) iff E_1 
 can be transformed into E_2 by a finite sequence of laws in Γ. 
 
 
 DEFINITION 
 
 Let Π and Π' be two programs. Let ℓ and k be two executable 
 paths of Π and Π' respectively. ℓ and k are said to be consistent 
 under Γ iff for all t such that 

 (i) t is in the sequences of statements that correspond to both ℓ and k, 
 so that in path ℓ there is a statement S_i of the form 
 t(C_1,...,C_r) n_1,n_2 and in path k there is a statement S_j of the form 
 t(D_1,...,D_r) m_1,m_2, and 

 (ii) v_i^ℓ(C_q) ≡_Γ v_j^k(D_q) for all q, 1 ≤ q ≤ r, that is, the 
 expressions v_i^ℓ(C_q) and v_j^k(D_q) are equivalent under Γ for all q, 

 the statement prefixed by n_1 is included in path ℓ iff the statement 
 prefixed by m_1 is included in path k, and the statement prefixed by n_2 
 is included in path ℓ iff the statement prefixed by m_2 is included in 
 path k. 
 
 For an arbitrary set of algebraic identities Γ it is recursively 
 unsolvable to determine, given two expressions E_1 and E_2, whether 
 E_1 ≡_Γ E_2; thus it is undecidable whether two paths are consistent. 
 
 In Example 14 above the two left paths ℓ and k of Π and Π' 
 respectively are consistent under a set of algebraic identities Γ that 
 includes the associative and the commutative laws of addition. 

      v^ℓ(F) = +A+BC 

      v^k(F) = ++ACB 

      +A+BC ≡_Γ ++ACB 
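
 The equivalence above can be confirmed by a bounded search over 
 rewrites. The sketch below is an illustration only: the nested-tuple 
 encoding of prefix expressions, the function names and the step bound 
 max_steps are assumptions, and only the commutative and associative laws 
 of + are built in. Since equivalence under an arbitrary Γ is undecidable, 
 a False answer means only that no derivation was found within the bound. 

```python
from collections import deque

# Prefix expressions as nested tuples: +A+BC is ("+", "A", ("+", "B", "C")).

def rewrites_at_root(e):
    """One-step rewrites of e at its root by the two laws of +."""
    out = []
    if isinstance(e, tuple) and e[0] == "+":
        _, x, y = e
        out.append(("+", y, x))                        # +xy  -> +yx
        if isinstance(y, tuple) and y[0] == "+":
            out.append(("+", ("+", x, y[1]), y[2]))    # +x+yz -> ++xyz
        if isinstance(x, tuple) and x[0] == "+":
            out.append(("+", x[1], ("+", x[2], y)))    # ++xyz -> +x+yz
    return out

def rewrites(e):
    """All expressions obtained from e by one law applied at any subterm."""
    out = list(rewrites_at_root(e))
    if isinstance(e, tuple):
        op, *args = e
        for i, arg in enumerate(args):
            for new_arg in rewrites(arg):
                out.append((op, *args[:i], new_arg, *args[i + 1:]))
    return out

def equivalent(e1, e2, max_steps=6):
    """Bounded breadth-first search for a sequence of laws from e1 to e2."""
    seen, frontier = {e1}, deque([(e1, 0)])
    while frontier:
        e, depth = frontier.popleft()
        if e == e2:
            return True
        if depth < max_steps:
            for nxt in rewrites(e):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return False
```

 For the two values of F, equivalent(("+", "A", ("+", "B", "C")), 
 ("+", ("+", "A", "C"), "B")) succeeds after one commutative and one 
 associative step. 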
 
 DEFINITION 
 
 Two program schemata Π and Π' will be called equivalent under a 
 set of algebraic laws Γ (Π ≡_Γ Π') iff for all pairs of paths ℓ, k 
 consistent under Γ, for each expression E^ℓ in v^ℓ(Π) there is an 
 expression E^k in v^k(Π') such that E^ℓ ≡_Γ E^k, and conversely. 
 
 
 
 Because it is undecidable for an arbitrary Γ whether two expressions 
 are equivalent under Γ, the question of equivalence under arbitrary Γ for 
 two programs Π and Π' is undecidable. 
 
 Algebraic laws on expressions induce transformations on dags. 
 Thus algebraic transformations on programs can be described as transforma- 
 tions on the corresponding dags. 
 
 If the distributive law holds for expressions α, β, γ 

      *α+βγ = +*αβ*αγ 

 then the transformation on the dag is 

 [Figure: the dag D_1 of *α+βγ is transformed into the dag D_2 of +*αβ*αγ.] 

 The transformations on dags induce a natural transformation on 
 any given dag of the program: we replace each instance of D_1 with D_2. 
 
 Before replacing instances of dags that correspond to algebraic 
 identities in a dag of a given program, we must make sure that 
 
 1) No node of the dag replaced (which is not a root) is 
 distinguished (otherwise distinguished nodes associated with output 
 variables and with variables referenced by tests might be eliminated by 
 the transformation). 
 
 2) No node of the dag replaced (which is not the root or a leaf) 
 has ancestors which are not nodes of the dag replaced in all the dags that 
 include the node, (because nodes might be eliminated or their values might 
 be changed by an algebraic transformation) . 
 
 
 3) No node is not distinguished and has no ancestors (because 
 then the node is useless, and there is no use in transformations on 
 useless statements) . 
 
 4) T11 has been used so that there is no path that includes a 
 subexpression of the expression replaced, but not the expression itself. 
 
 DEFINITION 
 
 We will call an open program a program in which 
 
 a) no output variable is referenced after it is defined. 
 
 b) no variable not in I is referenced more than once (in all 
 paths that include the definition of this variable). 

 c) T1 cannot be applied. 

 d) T11 reverse cannot be applied. 

 e) no two output variables have the same value. 

 Conditions a)-d) guarantee 1)-4) above. 
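
 Conditions a) and b) can be checked mechanically on a single path. The 
 sketch below is a simplification made for illustration: it looks at one 
 path of assignment statements only, ignores tests, and does not check 
 c)-e). The encoding of a statement as a (target, operands) pair and the 
 name openness_violations are assumptions. 

```python
from collections import Counter

def openness_violations(path, inputs, outputs):
    """Report violations of conditions a) and b) along one path.

    path: list of (target, operands) pairs in execution order.
    inputs, outputs: sets of input and output variable names.
    """
    defined = set()
    refs = Counter()
    problems = []
    for target, operands in path:
        for v in operands:
            if v in outputs and v in defined:
                problems.append(f"a) output {v} referenced after definition")
            if v not in inputs:
                refs[v] += 1
                if refs[v] == 2:
                    problems.append(f"b) {v} referenced more than once")
        defined.add(target)
    return problems

# G <- +AC; F <- +GB references the output G after it is defined.
shared = [("G", ("A", "C")), ("F", ("G", "B"))]
# X <- +BC; F <- +AX references the temporary X exactly once.
plain = [("X", ("B", "C")), ("F", ("A", "X"))]
```

 With inputs {A, B, C} and outputs {F, G}, the first path violates a) and 
 the second path reports no violation. 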
 
 By Lemma 8 below, each program can be transformed to an equivalent 
 open program. Thus if I is an algebraic identity and Π and Π' are two 
 programs in which corresponding paths are consistent under I, we say 

      Π ⇒_I Π'   iff   D_i(Π_0) ⇒_I D_i(Π'_0) 

 where Π_0 and Π'_0 are open programs equivalent to Π and Π' respectively, 
 and D_i are all the dags in D(Π_0) that include the subtree corresponding 
 to the expression transformed by I. 
 
 Because by Theorem 5 dags characterize equivalence classes 
 under the transformations T3, T4, T5, T10, T11, the definition above allows 
 I to incorporate all these transformations. 
 
 
 We will be interested in algebraic transformations that operate 
 on programs in which each path is executable, because there is no use in 
 exposing common subexpressions in nonexecutable paths. For an arbitrary 
 set of identities Γ, it is undecidable whether a path is executable if 
 algebraic laws are assumed to hold among operators. For special cases of 
 algebraic identities, getting rid of nonexecutable paths can be done by a 
 procedure similar to that of Theorem 3. We apply a transformation similar 
 to T6 in which the left path is removed if v(A) = v(B). 

 LEMMA 8 
 
 If Π is a program in which each path is executable, there is an 
 open program Π_0 such that Π_0 ≡ Π and Π ⇒*_{1,2,9,11} Π_0. 

 PROOF: 

 First we reduce Π so that useless statements will be eliminated 
 and no two output variables will have the same value. We get a program 
 Π_1, Π ⇒*_{1,2,9,11} Π_1. Now we operate T11 reverse as many times as 
 necessary so that we get a tree. We rename Π_1 such that no variable will 
 be defined more than once in the same path (T3). We get Π_2, 
 Π_1 ⇒*_{3,11} Π_2, and by Lemma 1, Π_1 ⇒*_{1,2,11} Π_2. Now we use T2 
 reversed (inserting redundant statements) as many times as necessary so 
 that no output variable will be referenced after it is defined, and also 
 that no variable will be referenced more than once. We get an open 
 program Π_0, Π_2 ⇒*_{2} Π_0, and Π ⇒*_{1,2,9,11} Π_0. 
 
 
 We will prove an equivalent theorem to Th. 8 for the case that 
 algebraic identities are assumed: 
 
 THEOREM 11 
 
 Let Π and Π' be two programs in which each path is executable. 
 Let Γ be a set of algebraic identities. Then 

      Π ≡_Γ Π'   iff   Π ⇒*_{{1,2,6,7,9,10,11} ∪ Γ} Π'. 
 
 When algebraic identities are assumed to hold among operators, 
 the topological transformations that operate on tests consider identical 
 tests as tests checking values that are equivalent under Γ. When 
 equivalence under Γ of two programs is decidable, it is decidable whether 
 the values checked by tests appearing in the program are equivalent 
 under Γ. 
 
 If Π and Π' are two programs in which each path is executable 
 and Π ≡_Γ Π', then Π ⇒*_{1,5,7,8,11} Ψ and Π' ⇒*_{1,5,7,8,11} Ψ' such that 
 in Ψ and Ψ' corresponding paths are consistent under Γ. The procedure is 
 the same as in Theorem 4, only here tests that check equivalent 
 expressions under Γ are considered to be identical. 
 
 PROOF: 
 
 1) "if" is trivial because the transformations preserve 
 equivalence . 
 
 2) Let Ψ and Ψ' be two programs equivalent to Π and Π' 
 respectively, in which corresponding paths are consistent under Γ. 

      Π ⇒*_{1,5,7,8,11} Ψ,   Π' ⇒*_{1,5,7,8,11} Ψ' 

 and by Lemmas 3 and 4 

      Π ⇒*_{1,2,6,7,11} Ψ,   Π' ⇒*_{1,2,6,7,11} Ψ'. 

 Π ≡_Γ Π', therefore Ψ ≡_Γ Ψ'. 
 Let Π_0 and Π'_0 be open programs for Ψ and Ψ' respectively. 
 
 
      Ψ ⇒*_{1,2,9,11} Π_0,   Ψ' ⇒*_{1,2,9,11} Π'_0. 

 Let v(Π_0) = {{E_j^i}_{i=1}^{k_j}}_{j=1}^{q}. 

 There is a finite sequence of algebraic identities that operate 
 on each set {E_j^i}_{i=1}^{k_j}, 1 ≤ j ≤ q, and convert it to the set of 
 expressions {E'_j^i}_{i=1}^{k_j}, where 
 {{E'_j^i}_{i=1}^{k_j}}_{j=1}^{q} = v(Π'_0). So there is a sequence of 
 program values v_1,...,v_r such that v_1 = v(Π_0), v_r = v(Π'_0) and 
 v_{i+1} is created from v_i by applying one identity in Γ to one or more 
 expressions in v_i. An identity operates simultaneously on several 
 expressions that are the values of the same output variable in all the 
 paths of the program. 
 
 We will show that if P is an open program and v(P) = v_i, then 
 there is a program P' such that v(P') = v_{i+1} and P can be transformed 
 to P' by the topological transformations together with identities in Γ. 

 Let I be the identity applied in going from v_i to v_{i+1}. Since 
 P is open, I is applicable to the dags in D(P). After operating on all 
 the dags D_i(P) that include the subtree corresponding to the expression 
 transformed by I, we get a new set of dags D_1 whose value is v_{i+1}. Let 
 P' be such that D(P') = D_1. Thus P ⇒_I P'. Also, because by Theorem 5 dags 
 characterize equivalence classes under T3, T4, T5, T10, T11, the step from 
 P to P' can be carried out by the transformations of the theorem 
 (we used Lemmas 1, 2, 3). 

 Therefore Π ⇒*_{{1,2,6,7,9,10,11} ∪ Γ} Π'. 
 
 The following theorem will give a procedure for optimizing 
 programs under operator and operand preserving algebraic identities. 
 
 DEFINITION 
 
 An algebraic identity is operator preserving if the number of 
 operators on both sides of the identity is the same. 
 
 
 The identity --ABC = -A+BC is operator preserving. Under an 
 operator preserving identity the number of assignment statements is 
 preserved. 
 
 DEFINITION 
 
 An algebraic identity is operand preserving if each operand either 
 i) appears exactly once on each side of the identity, or 
 ii) all its appearances on a side of the identity follow the same instance 
 of an operator. 
 
 Under an operand preserving identity, openness of programs is 
 preserved. 
 
 The definition of operand preserving identity includes that 
 of Aho and Ullman (1). 
 
 **XXY = *Y*XX is an operand and operator preserving identity. 
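
 The operator preserving condition is easy to test on prefix strings. 
 The following sketch is an illustration under stated assumptions: every 
 symbol is a single character, the operator alphabet is the hypothetical 
 set + - * /, and every other character is an operand or constant. It does 
 not check the operand preserving condition. 

```python
OPERATORS = set("+-*/")   # assumed operator alphabet for this sketch

def operator_count(prefix_expr):
    """Number of operator symbols in a prefix expression string."""
    return sum(1 for ch in prefix_expr if ch in OPERATORS)

def is_operator_preserving(lhs, rhs):
    """True when both sides of the identity contain equally many operators."""
    return operator_count(lhs) == operator_count(rhs)
```

 For example, is_operator_preserving("--ABC", "-A+BC") and 
 is_operator_preserving("**XXY", "*Y*XX") both hold, while an identity 
 such as *X1 = X changes the number of operators. 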
 
 THEOREM 12 
 
 Let Γ be a set of operator and operand preserving identities. 

 There is an algorithm that finds for every program Π an optimal 
 program Π' such that Π ≡_Γ Π'. The algorithm operates in a series of steps 
 so that Π is first transformed to an equivalent open program Π_0. Using 
 algebraic transformations only, Π_0 is then nondeterministically 
 transformed to Π'_0. Π' is obtained by operating the topological 
 transformations {T1,T2,T3,T4,T5,T7,T8,T10,T11} on Π'_0. 
 
 PROOF : 
 
 Let Π_0 be an open program equivalent to Π in which each path is 
 executable. Π_0 can be obtained by first eliminating nonexecutable paths 
 
 
 by a procedure similar to that of Theorem 3, and then constructing an open 
 program by using T1, T2, T9, T11 (Lemma 8). 
 
 Since Π_0 is open and the algebraic identities in Γ preserve 
 openness, the algebraic transformations do not affect the relative position 
 of branch nodes of the dags that are not replaced by the algebraic 
 transformations. Thus all the algebraic transformations can be applied 
 before applying other transformations that operate on assignment 
 statements. It is also clear that algebraic transformations can be applied 
 before applying transformations that operate on tests, because they change 
 only assignment statements. Thus we can transform Π_0, by applying 
 algebraic transformations only, into a program Π'_0 such that Π'_0 ≡ Π'. 
 
 Operator and operand preserving identities preserve openness, 
 so that the algebraic transformations can be applied sequentially, without 
 adding statements for making the obtained programs open. Because the 
 number of statements is preserved, Π'_0 can be found in a finite number 
 of steps, although a heuristic procedure might help to find Π'_0 more 
 quickly. Now we reduce Π'_0 and apply T3 to rename variables, such that no 
 two statements in a path will define the same variable. We obtain a 
 program Π_1. The rest of the proof is exactly as in Theorem 10. 
 
 There are two generally recognized ways to optimize code generated 
 by a compiler: global optimization, which is concerned with the whole 
 program, and local optimization, which depends only on information in a 
 single expression or statement. Theorems 10 and 12 provided schemes for 
 global optimization without algebraic identities, and with certain types 
 of identities, respectively. 
 
 
 Some of the local optimization techniques presented in the 
 literature (Bagwell (3), Lowry and Medlock (7)) could be viewed as 
 algebraic identities that hold among operators, operands and constants. 
 Examples of algebraic identities that induce local optimization of code are 
 
      X * 2 = X + X 

      X ** 2 = X * X 

      X ** ½ = SQRT(X) 

      A ** (-C) = 1 / A ** C 

      A ** 2. = A ** 2 

      X * 1 = X 

      X + 0 = X 
 
 In these cases the object of optimizations under algebraic 
 identities is not to expose common subexpressions as before, but to 
 locally improve the machine code generated. Improvements are usually 
 done by replacing some operators with operators that are known to be 
 more efficient in these special cases. 
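
 A minimal sketch of such local improvements is given below. It applies 
 four identities (X * 2 = X + X, X ** 2 = X * X, X * 1 = X and the additive 
 identity X + 0 = X) bottom-up to an expression. The nested-tuple 
 representation (op, left, right), with variable names and numeric 
 constants as leaves, and the function name improve are assumptions made 
 for this illustration. 

```python
def improve(expr):
    """Apply a few local identities bottom-up to a binary expression tree.

    expr is a leaf (a variable name or a numeric constant) or a tuple
    (op, left, right). The encoding is an assumption of this sketch.
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = improve(left), improve(right)
    if op == "*" and right == 2:
        return ("+", left, left)      # X * 2  = X + X
    if op == "**" and right == 2:
        return ("*", left, left)      # X ** 2 = X * X
    if op == "*" and right == 1:
        return left                   # X * 1  = X
    if op == "+" and right == 0:
        return left                   # X + 0  = X
    return (op, left, right)
```

 Duplicating the left operand, as in X + X, is only an improvement when 
 that operand is a variable or a shared node of the dag, so that its value 
 is not computed twice. 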
 
 In the case of global optimization, a cost was reasonable 
 provided that the cost decreased if a statement was deleted from some 
 executable sequence of the program. In the cases of local optimization, 
 a cost criterion is called reasonable provided that the cost decreases 
 if a subsequence of statements in some executable sequence of the program 
 is replaced by another subsequence of statements which is not longer 
 and which runs faster. 
 
 We will extend the procedure in Theorem 12 to apply to certain 
 types of algebraic identities that induce local optimization of code. 
 
 
 Because these identities operate on constants in addition to 
 operands and operators, we will extend the definition of a program schema 
 to include constants. The extension does not pose any difficulties and 
 will be described informally. 
 
 Σ, Φ, T will be as before and C will be a countable set of 
 constants. 
 Assignment statements will be of the type 

      k  A ← φB_1...B_r 

      A ∈ Σ, B_1,...,B_r ∈ Σ ∪ C and ∃i, 1 ≤ i ≤ r, B_i ∈ Σ. 

 Test statements will be of the type 

      t(C_1,...,C_r) k_1,k_2 

      C_1,...,C_r ∈ Σ ∪ C and ∃i, 1 ≤ i ≤ r, C_i ∈ Σ. 
 
 A program schema will be a quadruple (P, I, U, C) where P, I, U 
 are as before, and C is a finite set of constants. 
 
 Program equivalence will be defined as before. 
 
 The graphical representation of programs as dags will be as 
 before, and a leaf will be created for each c ∈ C in all the dags of the 
 program. 
 
 The definitions of operand and operator preserving identities 
 will be extended to include constants. The restrictions on operators 
 and operands are as before but constants can appear anywhere on the two 
 sides of the identity. By the new definition the identities 
 
      X * 2 = X + X 

      X ** 2 = X * X 

      X ** ½ = SQRT(X) 

      A ** (-C) = 1 / A ** C 

      A ** 2. = A ** 2 
 
 are all operand and operator preserving. 
 
      A + X * 1 = A + X   and 

      A + X + 0 = A + X 

 are not, because they are not operator preserving. 
 
 Under the new definition the number of statements is preserved, 
 and also the openness of the programs is preserved because constants appear 
 
 as leaves in the dags of the program. Therefore the scheme for optimiza- 
 tion provided by Theorem 12 applies also to the extended case of operator 
 and operand preserving algebraic identities. The cost function is reason- 
 able also in the sense that the cost decreases if local improvements are 
 made. Thus Theorem 12 provides a scheme for global and local optimiza- 
 tion where certain types of algebraic identities are assumed. 
 
 
 7. LOGICAL TRANSFORMATIONS 
 
 We considered a program schema that contained a countable set T 
 of test names. The elements of T will be called elementary tests . 
 
 DEFINITION 
 
 A test is either 

 (1) an elementary test, or 

 (2) an expression of the form p ∨ r where p, r are tests, or 

 (3) an expression of the form p·r where p, r are tests, or 

 (4) an expression of the form ~p where p is a test. 
 
 The tests with the binary operations ∨ and · and with the set B 
 consisting of the two constants T, F can be shown to be the algebra of 
 Boolean functions of the elementary tests. 
 
 The definition of a program schema is extended to include test 
 statements that have both elementary tests and tests. 
 
 An interpretation is defined as before. 
 
 The definitions of executable and nonexecutable paths are as in 
 the case of program schemata that have elementary tests only. 
 
 Boolean manipulations on the tests may reduce the cost of the 
 program. In the following example we apply Boolean manipulations to 
 expose nonexecutable paths that can later be eliminated so that the cost 
 of the program is reduced. 
 
 
 EXAMPLE 15 
 
 
 
 
 
 
 [Figure: flowchart of the program $\Pi$. Its test statements are Boolean
 expressions in $t_1$ and $t_2$ applied to (A,B), and every path ends in STOP.]
 
 The path that includes the tests t (A,B) and the left branch 
 of ~t , (A) and the path that includes t (A,B), t (A,B) and the right branch 
 of ~t (A) are nonexecutable. By eliminating them we get 
 
 [Figure: flowchart of the resulting program $\Pi'$, with the nonexecutable
 paths removed.]
 
 The logical transformations exposed nonexecutable paths of the
 program, and tests were also eliminated from some executable sequences of
 the program. Thus logical transformations, in addition to topological and
 algebraic transformations, might reduce the cost of programs that have
 tests.
 
 Optimizing compilers that use logical transformations have been
 built (see (5)). In FORTRAN, logical transformations may simplify the
 logical expressions in IF statements. See the reference above and also
 Lowry and Medlock (7).
 
 The cost of the program can also be reduced by simplifying the 
 Boolean expressions: 
 
 [Figure: a program whose test on A is a Boolean expression in p and r, and
 the equivalent program obtained by simplifying that expression.]
 
 Two identical tests checking different values are considered to 
 be different variables in our Boolean algebra: 
 
 EXAMPLE 16 
 
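 [Figure: a program in which $A \leftarrow \varphi XY$ is followed by a test
 p(A), A is then redefined, and p(A) is tested again; one path is marked.]

 The value of the variable A checked by the first p(A) test is $\varphi XY$ and
 by the second p(A) test is $\varphi\varphi XYX$. Therefore the marked path cannot be
 eliminated.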
 
 We will define the notion of equivalence of loop-free programs 
 that have tests in addition to elementary tests. 
 
 DEFINITION 
 
 Let $\Pi$ be a loop-free program that has tests. With each executable
 path $\ell_i$ of the program we will associate a path condition $\Psi^{\ell_i}$, which is
 the condition that path $\ell_i$ is executed, and is a Boolean function of the
 tests. Identical tests checking different values are considered as
 different variables of the Boolean function.
 
 DEFINITION 
 
 Let $\Pi$ and $\Pi'$ be two programs that have tests, and let $\ell$ and $k$ be
 two executable paths of $\Pi$ and $\Pi'$ respectively. $\ell$ and $k$ are said to be
 consistent iff, with $\Psi^{\ell}$ and $\Psi^{k}$ taken as Boolean functions of the
 elementary tests $t_1, t_2, \ldots, t_m$, there is some assignment of truth values
 $a_1, \ldots, a_m$ to the variables $t_1, t_2, \ldots, t_m$ which makes both $\Psi^{\ell}$ and
 $\Psi^{k}$ true.

 Notice that the definition of consistent paths for programs
 that have elementary tests only is a special case of this definition, where
 $\Psi^{\ell}$ and $\Psi^{k}$ are products of the elementary tests appearing in the paths.
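Consistency of two paths is a joint-satisfiability question about their path conditions. The sketch below decides it by enumerating all truth assignments; the function name `consistent` and the representation of path conditions as Python callables are assumptions made for illustration only.

```python
from itertools import product


def consistent(psi_l, psi_k, names):
    """Return an assignment making both path conditions true, or None."""
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if psi_l(assignment) and psi_k(assignment):
            return assignment
    return None


# Illustrative path conditions over the elementary tests t1 and t2.
psi_or = lambda a: a["t1"] or a["t2"]                  # t1 v t2
psi_k2 = lambda a: (not a["t1"]) and a["t2"]           # (not t1) . t2
psi_k3 = lambda a: (not a["t1"]) and (not a["t2"])     # (not t1) . (not t2)

witness = consistent(psi_or, psi_k2, ["t1", "t2"])
no_witness = consistent(psi_or, psi_k3, ["t1", "t2"])
```

The enumeration is exponential in the number of elementary tests, which is acceptable for the small expressions used in the examples.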
 
 DEFINITION 
 
 Let $\Pi$ and $\Pi'$ be two loop-free programs that have tests. $\Pi$ and
 $\Pi'$ will be called equivalent iff for all interpretations In (of input
 values, elementary tests and functions) $\mathrm{Val}_{In}(\Pi) = \mathrm{Val}_{In}(\Pi')$.
 
 THEOREM 13 
 
 Let $\Pi$ and $\Pi'$ be two programs that have tests. $\Pi \equiv \Pi'$ iff for
 all consistent pairs of paths $\ell$, $k$, $v^{\ell}(\Pi) = v^{k}(\Pi')$.
 
 PROOF: 
 
 The proof is similar to that of Theorem 2.

 For 1) we take the same interpretation In of inputs and
 functions with the following interpretation for elementary tests:

 For all elementary tests t in path $j = \ell, k$

 i) determine $In(t^{j}(a_1, \ldots, a_n))$ for $(a_1, \ldots, a_n)$ such that $a_i = v^{j}(X_i)$,
 $1 \le i \le n$, and $t(X_1, \ldots, X_n)$ appears in a test statement of path $j$, in such
 a way that both $\Psi^{\ell}$ and $\Psi^{k}$ are true.

 ii) $In(t^{j}(a_1, \ldots, a_n))$ is arbitrary for other n-tuples $(a_1, \ldots, a_n)$.

 $\ell$ and $k$ are consistent, therefore there is an assignment of the
 elementary tests which makes both $\Psi^{\ell}$ and $\Psi^{k}$ true. Under this
 interpretation paths $\ell$ and $k$ are executed in $\Pi$ and $\Pi'$ respectively.

 The rest of 1) and also 2) are the same as in Theorem 2.
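The criterion of Theorem 13 can be checked mechanically once every executable path is summarized by its path condition and its value. The sketch below assumes that summary is already available; the function `equivalent`, the path lists, and the value strings "fAB" and "gA" are hypothetical and serve only to illustrate the criterion.

```python
from itertools import product


def equivalent(paths_a, paths_b, names):
    """Theorem 13 criterion: every consistent pair of paths has equal values.

    Each path is a pair (path_condition, value); conditions are callables
    taking a truth assignment, values are symbolic strings."""
    for values in product([True, False], repeat=len(names)):
        a = dict(zip(names, values))
        for cond_l, val_l in paths_a:
            for cond_k, val_k in paths_b:
                # A pair is consistent if some assignment satisfies both.
                if cond_l(a) and cond_k(a) and val_l != val_k:
                    return False
    return True


# A program with the single test t1 v t2 ...
prog = [
    (lambda a: a["t1"] or a["t2"], "fAB"),
    (lambda a: not (a["t1"] or a["t2"]), "gA"),
]
# ... and the same program with the test broken into elementary tests.
broken = [
    (lambda a: a["t1"], "fAB"),
    (lambda a: (not a["t1"]) and a["t2"], "fAB"),
    (lambda a: (not a["t1"]) and (not a["t2"]), "gA"),
]
# A variant that computes a different value on the path where t1 holds.
wrong = [(broken[0][0], "gA")] + broken[1:]
```

Here `equivalent(prog, broken, ["t1", "t2"])` holds, while replacing the value on one consistent pair, as in `wrong`, makes the check fail.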
 
 EXAMPLE 17 
 
 Let $\Pi$ and $\Pi'$ be of Example 15. The pairs of paths $(\ell_1, k_2)$,
 $(\ell_2, k_1)$, $(\ell_3, k_3)$ are consistent and their values are identical.

 $\Psi^{\ell_1} = (t_1 \lor t_2)\,\bar{t}_1 = t_2 \bar{t}_1$

 $\Psi^{k_2} = \bar{t}_1 t_2$

 $\Psi^{\ell_1} = \Psi^{k_2}$, thus there is an assignment of truth values to $t_1$
 and $t_2$ that makes both $\Psi^{\ell_1}$ and $\Psi^{k_2}$ true. Therefore $\ell_1$ and $k_2$ are
 consistent.

 $v^{\ell_1}(\Pi) = v^{k_2}(\Pi')$

 $\Psi^{\ell_2} = (t_1 \lor t_2)\,t_1 = t_1 \lor t_1 t_2 = t_1$

 $\Psi^{k_1} = t_1$

 thus $\Psi^{\ell_2} = \Psi^{k_1}$.

 $v^{\ell_2}(\Pi) = \varphi\varphi ABB = v^{k_1}(\Pi')$

 $\Psi^{\ell_3} = \bar{t}_1 \bar{t}_2$

 $\Psi^{k_3} = \bar{t}_1 \bar{t}_2$

 thus $\Psi^{\ell_3} = \Psi^{k_3}$.

 $v^{\ell_3}(\Pi) = \varphi AB = v^{k_3}(\Pi')$

 In $\Pi$ and $\Pi'$ consistent paths have the same value, therefore by
 Theorem 13 $\Pi \equiv \Pi'$.
 
 EXAMPLE 18 
 
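 The following logical transformations preserve equivalence.

 i) [Figure: transformation $L_1$, which breaks down a test of the form $t_1 \lor t_2$.]

 ii) [Figure: transformation $L_2$, which breaks down a test of the form $t_1 t_2$.]

 iii) [Figure: transformation $L_3$, which replaces a negated test.]

 In i) $\Psi^{\ell_1} = t_1 \lor t_2$, $\Psi^{k_1} = t_1$. There is an assignment of truth
 values to $t_1$ and $t_2$ which makes both $\Psi^{\ell_1}$ and $\Psi^{k_1}$ true (take $t_1 = T$,
 $t_2 = F$). Thus $\ell_1$ and $k_1$ are consistent. The values of $\ell_1$ and $k_1$ are
 identical.

 $\Psi^{k_2} = \bar{t}_1 t_2$. The assignment $t_1 = F$, $t_2 = T$ makes both $\Psi^{\ell_1}$ and
 $\Psi^{k_2}$ true, therefore $k_2$ and $\ell_1$ are consistent. The values of $k_2$ and
 $\ell_1$ are identical.

 $\Psi^{\ell_2} = \bar{t}_1 \bar{t}_2 = \Psi^{k_3}$. Thus $\ell_2$ and $k_3$ are consistent. Their values are
 identical.

 Therefore $L_1$ preserves equivalence.

 ii) $\Psi^{\ell_1} = t_1 t_2 = \Psi^{k_1}$. The values of $\ell_1$ and $k_1$ are identical.

 $\Psi^{\ell_2} = \overline{t_1 t_2}$, $\Psi^{k_2} = t_1 \bar{t}_2$. The assignment $t_1 = T$, $t_2 = F$ makes both
 $\Psi^{\ell_2}$ and $\Psi^{k_2}$ true, therefore $\ell_2$ and $k_2$ are consistent. The values of
 $\ell_2$ and $k_2$ are identical.

 $\Psi^{k_3} = \bar{t}_1$, $\Psi^{\ell_2} = \bar{t}_1 \lor \bar{t}_2$. The assignment $t_1 = F$, $t_2 = F$ makes
 $\Psi^{\ell_2}$ and $\Psi^{k_3}$ true, therefore $\ell_2$ and $k_3$ are consistent. Their values
 are identical.

 Therefore $L_2$ preserves equivalence.

 iii) The pairs $(\ell_1, k_2)$, $(\ell_2, k_1)$ are consistent and their values are the
 same, therefore $L_3$ preserves equivalence.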
 
 
 DEFINITION

 Any Boolean manipulation on the tests that preserves equivalence
 will be called a logical transformation on the program.

 We will prove a theorem analogous to Theorem 8, for the case
 that logical transformations are included:

 THEOREM 14

 Let $\Pi$ and $\Pi'$ be two loop-free programs that have tests. Then
 $\Pi \equiv \Pi'$ iff

 $\Pi \underset{\{1,2,6,7,9,10,11\} \cup L}{\Longrightarrow} \Pi'$

 where L is a set of logical transformations.
 
 PROOF : 
 
 1) "if" is trivial because logical transformations and
 topological transformations preserve equivalence of programs.

 2) For every loop-free program with tests there is an equivalent
 loop-free program with elementary tests only, which is obtained by
 applying logical transformations. We will show that by induction on the
 structure of a test.

 If a test is of the form $p \lor r$ then

 [Figure: the test $p \lor r$ with subtrees $S^l$ and $S^r$ is replaced by a test p
 followed by a test r.]

 where $S^l$ and $S^r$ are the left and right subtrees, respectively. (We
 assume that the graph of the given program is a tree. This assumption
 does not cause any difficulties since by repeated applications of T11
 in reverse any program can be transformed to an equivalent program whose
 graph is a tree.)

 If a test is of the form $p \cdot r$ then

 [Figure: the test $p \cdot r$ is replaced by a test p followed by a test r.]

 If a test is of the form $\bar{p}$ then

 [Figure: the test $\bar{p}$ is replaced by the test p with its two branches
 interchanged.]

 These transformations preserve equivalence (see Example 18).
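The induction on the structure of a test can be sketched as a recursive expansion into a decision tree that contains elementary tests only. This is an illustrative sketch, not the report's construction: tests are encoded as nested tuples, and the names `expand`, `run`, `on_true` and `on_false` are assumptions. The shared subtree passed to both branches is copied, which corresponds to the assumption that the graph is a tree.

```python
def expand(test, on_true, on_false):
    """Replace a compound test by a tree of elementary tests.

    Tests are ("elem", name), ("or", p, r), ("and", p, r) or ("not", p);
    on_true / on_false are the subtrees reached on each outcome."""
    kind = test[0]
    if kind == "elem":
        return ("test", test[1], on_true, on_false)
    if kind == "or":
        # p true decides the test; otherwise r decides it.
        return expand(test[1], on_true, expand(test[2], on_true, on_false))
    if kind == "and":
        # p false decides the test; otherwise r decides it.
        return expand(test[1], expand(test[2], on_true, on_false), on_false)
    if kind == "not":
        # Negation interchanges the two branches.
        return expand(test[1], on_false, on_true)
    raise ValueError(kind)


def run(tree, truth):
    """Follow the tree under a truth assignment and return the leaf reached."""
    while isinstance(tree, tuple) and tree[0] == "test":
        _, name, on_true, on_false = tree
        tree = on_true if truth[name] else on_false
    return tree


p, r = ("elem", "p"), ("elem", "r")
or_tree = expand(("or", p, r), "S_true", "S_false")
not_tree = expand(("not", p), "S_true", "S_false")
```

For every truth assignment, `run` on the expanded tree reaches the same leaf as evaluating the original compound test would select.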
 
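 EXAMPLE:

 [Figure: a program $\Pi$ with compound tests and the program $\Pi'$ obtained by
 breaking them down.]

 $\Pi \equiv \Pi'$ and $\Pi'$ has elementary tests only.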
 
 Let $\Pi_1$ and $\Pi_1'$ be loop-free programs that have elementary tests
 only, equivalent to $\Pi$ and $\Pi'$ respectively.

 T11 is needed to transform the graph of $\Pi$ to a tree.

 $\Pi \underset{\{11\} \cup L_1}{\Longrightarrow} \Pi_1$, $\Pi' \underset{\{11\} \cup L_1'}{\Longrightarrow} \Pi_1'$

 $L_1$ and $L_1'$ are all the logical transformations required to transform $\Pi$ to
 $\Pi_1$, and $\Pi'$ to $\Pi_1'$ respectively. It follows that $\Pi_1 \equiv \Pi_1'$. $\Pi_1$ and $\Pi_1'$
 are loop-free programs with elementary tests only. Therefore by Theorem 8

 $\Pi_1 \underset{\{1,2,6,7,9,10,11\}}{\Longrightarrow} \Pi_1'$

 $\Pi \underset{\{1,2,6,7,9,10,11\} \cup L_1}{\Longrightarrow} \Pi_1'$

 Thus $\Pi \underset{\{1,2,6,7,9,10,11\} \cup L_1 \cup L_1'}{\Longrightarrow} \Pi'$

 where $L_1'$ are all the logical transformations required to transform $\Pi'$ to
 $\Pi_1'$. Let $L = L_1 \cup L_1'$. Thus

 $\Pi \underset{\{1,2,6,7,9,10,11\} \cup L}{\Longrightarrow} \Pi'$

 where L is a set of logical transformations.
 
 
 8. OPTIMIZATION UNDER LOGICAL TRANSFORMATIONS 
 
 The object of optimization under logical transformations is 
 i) to eliminate nonexecutable paths, 
 ii) to minimize the number of tests performed. 
 
 Example 15 above showed how logical transformations may reduce 
 the cost of the program by eliminating nonexecutable paths. Minimizing 
 the number of tests performed is done by (a) simplifying the Boolean 
 expressions of the tests and (b) by breaking down the logical expressions 
 so that only the necessary minimum number of tests are performed. 
 
 EXAMPLE 19 
 
 The program $\Pi_1$ has the following test:

 [Figure: the test $t_1 \lor t_2 t_3$, with subtrees $S^l$ and $S^r$]

 where $S^l$ and $S^r$ are the left and right subtrees, respectively.

 The logical expression is completely determined if $t_1$ is true,
 and there is no need in this case to perform the tests $t_2$ and $t_3$. Also,
 if $t_1$ is false and one of $t_2$, $t_3$ is false, there is no need to perform
 the other test. Thus by transforming $\Pi_1$ to $\Pi_2$

 [Figure: the program $\Pi_2$, in which $t_1$, $t_2$ and $t_3$ are tested one after
 another]

 we reduce the number of tests performed in the following cases:
 1) $t_1$ is true
 2) $t_1$ is false and $t_2$ is false.
 In all the other cases, the cost of the program is unchanged.
 
 If $\Pi_1$ has the test

 [Figure: the test]

 we first have to simplify the Boolean expression by de Morgan's law

 [Figure: the expression after applying de Morgan's law]

 and now we apply the arguments above and break down the tests. We get

 [Figure: the resulting tree of broken-down tests]

 Compilers that break down logical expressions in order to
 accelerate the execution of logical statements have been constructed.
 See Lowry and Medlock (7) and Huskey and Wattenburg (4).
 
 The definition of a reasonable cost for programs with 
 elementary tests is changed to include (ii) above. 
 DEFINITION 
 
 A cost function on programs that have tests is reasonable 
 provided that 
 
 
 1) the cost decreases if statements which are never executed are 
 deleted from the program in such a way that statements are not added 
 to the program. 
 
 2) the cost decreases if a statement is deleted from some executable 
 sequence of the program. 
 
 3) the cost decreases if identical subgraphs and identical statements 
 are merged in such a way that tests are not added to the program. 
 
 4) the cost decreases if Boolean expressions of tests are broken down
 so that the number of elementary tests performed is reduced.
 
 Boolean Algebra theorems which are useful in simplifying 
 Boolean expressions can be used to eliminate nonexecutable paths. 
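 Examples:
 1. $X \lor XY = X$ is equivalent to

 [Figure: the corresponding transformation on trees]

 2. $XY \lor \bar{X}Y = Y$ is equivalent to the transformation

 [Figure: the corresponding transformation on trees, followed by T7, T11]

 In the following stages of the optimization the topological
 transformations T11 and T7 will transform the tree on the right to
 a tree equivalent to the expression Y.
 3. $(X \lor \bar{Y})Y = XY$ is equivalent to the transformation

 [Figure: the corresponding transformation on trees]

 4. $XY \lor \bar{X}Z \lor YZ = XY \lor \bar{X}Z$ is equivalent to

 [Figure: the corresponding transformation on trees]

 The transformations in 1) - 4) do not add statements to the
 program.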
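The four Boolean identities can be confirmed by truth tables. The sketch below writes complements with a prime (X' for the complement of X), which is an assumed reading of the overbars lost in the printed formulas; the dictionary `identities` and the function `holds` are illustrative names.

```python
from itertools import product

# Each entry maps a label to (left side, right side) as functions of x, y, z.
identities = {
    "X v XY = X": (lambda x, y, z: x or (x and y),
                   lambda x, y, z: x),
    "XY v X'Y = Y": (lambda x, y, z: (x and y) or ((not x) and y),
                     lambda x, y, z: y),
    "(X v Y')Y = XY": (lambda x, y, z: (x or (not y)) and y,
                       lambda x, y, z: x and y),
    "XY v X'Z v YZ = XY v X'Z": (
        lambda x, y, z: (x and y) or ((not x) and z) or (y and z),
        lambda x, y, z: (x and y) or ((not x) and z)),
}


def holds(lhs, rhs):
    """True if both sides agree on all eight truth assignments."""
    return all(lhs(*v) == rhs(*v) for v in product([False, True], repeat=3))


results = {label: holds(lhs, rhs) for label, (lhs, rhs) in identities.items()}
```

All four entries evaluate to true. Without the complement, the third identity would fail (for X false and Y true), which supports reading it with a complemented Y.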
 
 The following theorem will give a procedure for optimizing 
 programs that have tests under reasonable cost functions. 
 THEOREM 15

 There is an algorithm that finds an optimal program $\Pi'$ for
 a given program $\Pi$ that has tests. The algorithm operates in a series
 of steps so that $\Pi$ is first transformed by logical transformations
 to an equivalent program $\Pi_2$ that has elementary tests and negations
 only. Using logical transformations that consist of negations only,
 $\Pi_2$ is then transformed to a program $\Pi_3$. $\Pi'$ is obtained by operating
 the topological transformations on $\Pi_3$.
 PROOF:

 We first assume that all nonexecutable paths can be
 eliminated from $\Pi$ in such a way that statements are not added to the
 program. Thus in every optimal program each path is executable. Take
 $\Pi_1$ to be a program logically equivalent to $\Pi$ in which each path is
 executable.

 $\Pi \underset{L_1}{\Longrightarrow} \Pi_1$

 where $L_1$ are the logical transformations necessary to transform $\Pi$ into
 $\Pi_1$, which are similar to T6 but operate also on negations of tests.

 By the definition of reasonable cost functions every optimal
 program has elementary tests and negations of elementary tests only.
 That is because the number of tests performed is minimized if logical
 expressions are broken down into tests that do not have the $\lor$ and $\cdot$
 operations. Take $\Pi_2$ to be a program logically equivalent to $\Pi$ which
 has the minimum number of tests performed. $\Pi_2$ is obtained by breaking
 the logical expressions in $\Pi_1$ into elementary tests and negations of
 elementary tests. In some cases de Morgan's laws must be used so that
 breaking the logical functions would be possible.

 $\Pi_1 \underset{L_2}{\Longrightarrow} \Pi_2$

 where $L_2$ is the set of logical transformations necessary to transform
 $\Pi_1$ into $\Pi_2$.
 
 
 There may be several programs equivalent to $\Pi_1$ that have
 elementary tests and negations of elementary tests only. For example

 [Figure: a test of the form $t_1 \lor t_2$]

 is equivalent to

 [Figure: one tree of elementary tests]

 but also to

 [Figure: another tree of elementary tests]

 Thus the process of constructing $\Pi_2$ is nondeterministic. Only a finite
 number of $\Pi_2$ can be reached, because breaking the expressions can be
 done in a finite number of forms only.

 In the above example the two possible trees have the same
 cost provided there is no additional information on the tests. But for
 some cases if we permute the terms we might reduce the cost of the
 program. For example $t_1(t_2 \lor t_3)$ and $(t_2 \lor t_3)t_1$ are the same functions
 according to the definition of Boolean algebra but the corresponding
 trees

 [Figure: the two trees corresponding to $t_1(t_2 \lor t_3)$ and $(t_2 \lor t_3)t_1$]

 are different and their costs are not the same. (The average number of
 tests performed by the program on the right is greater.)
 
 
 In order to eliminate checking all possible programs with 
 elementary tests equivalent to the given program, we would like to 
 have an algorithm that given a Boolean expression will find the minimal 
 tree corresponding to it. This problem is a special case of the problem 
 presented in Slagle (10). That paper is concerned with finding a 
 minimum-cost tree equivalent to the given Boolean expression where the 
 cost of applying each elementary test and the probability of its 
 outcome are given. When no information is given on the tests we may 
 assume that the costs of the tests are equal and the probability of 
 each test to be true is equal to its probability to be false. Thus we 
 get a special case of the general problem. 
 
 The algorithm finds a low cost tree for the given expression 
 in the following way: 
 
 Suppose an expression S is of the form $S = S_1 \lor \ldots \lor S_n$. Each
 permutation of the $S_j$ gives a different tree. To obtain a low cost
 expression, expressions $S_j$ that have low cost and high probability of
 being true should be carried out first. We will distinguish between
 the following cases:

 a) for a variable $x_i$ the expression that gives a low cost tree is $x_i$
 itself.

 b) if an expression is of the form $S_1 \lor \ldots \lor S_n$ and none of the $S_j$ is a
 disjunction then find for each $S_j$ a low cost expression $R_j$, and the low
 cost expression for S will be a permutation of the $R_j$, $R_1 \lor \ldots \lor R_n$, such
 that

 $\frac{c_1}{p_1} \le \ldots \le \frac{c_n}{p_n}$

 where $c_j$ is the cost of $R_j$ and $p_j$ is its probability to be true.

 c) if the expression is of the form $S = S_1 \ldots S_n$ and none of the
 $S_j$ is a conjunction then find for each $S_j$ a low cost expression $R_j$ and
 then permute the $R_j$ and get the expression $R_1 \ldots R_n$ such that

 $\frac{c_1}{q_1} \le \ldots \le \frac{c_n}{q_n}$

 where $c_j$ is the cost of $R_j$ and $q_j = 1 - p_j$, $p_j$ is its probability to be
 true.
 
 Suppose we wish to find a low cost tree for $x_1 \lor x_2(x_3 \lor x_4)$ and
 $c_i$, $p_i$ are as follows: $c_i = (176, 80, 40, 42)$, $p_i = (\frac12, \frac12, \frac14, \frac12)$. We first
 find low cost expressions for $x_1$ and $x_2(x_3 \lor x_4)$. For $x_1$ we get $x_1$
 itself. For $x_2(x_3 \lor x_4)$ we find low cost expressions for $x_2$ and $x_3 \lor x_4$.

 For $x_3 \lor x_4$ we have $\frac{42}{1/2} \le \frac{40}{1/4}$, therefore the low cost expression is
 $x_4 \lor x_3$. Its cost is

 $c_4 p_4 + (c_4 + c_3) q_4 p_3 + (c_4 + c_3) q_4 q_3 = c_4 + q_4 c_3 = 62.$

 Its probability is $p_4 + q_4 p_3 = \frac58$.

 The low cost expression for $x_2(x_3 \lor x_4)$ is $x_2(x_4 \lor x_3)$ because

 $\frac{c(x_2)}{q(x_2)} = \frac{80}{1/2} \le \frac{c(x_4 \lor x_3)}{q(x_4 \lor x_3)} = \frac{62}{3/8}$

 The cost of $x_2(x_4 \lor x_3)$ is 111 and its probability is $\frac{5}{16}$.

 The low cost expression for $x_1 \lor x_2(x_3 \lor x_4)$ is $x_1 \lor x_2(x_4 \lor x_3)$
 because $\frac{176}{1/2} \le \frac{111}{5/16}$.
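The ordering rules b) and c) and the arithmetic of the worked example can be replayed in a few lines. This is a sketch under stated assumptions, not Slagle's algorithm as published: it treats the tests as independent, it takes the probabilities to be (1/2, 1/2, 1/4, 1/2), values inferred here from the intermediate results 62 and 111, and the helper names `var`, `disj` and `conj` are hypothetical. It also assumes no probability is exactly 0 or 1, since the ratios would then be undefined.

```python
from fractions import Fraction as F


def var(name, cost, p):
    """A variable as (expression text, expected cost, probability of true)."""
    return (name, F(cost), F(p))


def disj(parts):
    """Order a disjunction by cost / P(true) and combine cost and probability."""
    parts = sorted(parts, key=lambda r: r[1] / r[2])
    cost, q_all = F(0), F(1)   # q_all: probability that all earlier terms are false
    for _, c, p in parts:
        cost += q_all * c      # a term is evaluated only if the earlier ones failed
        q_all *= 1 - p
    return ("(" + " v ".join(r[0] for r in parts) + ")", cost, 1 - q_all)


def conj(parts):
    """Order a conjunction by cost / P(false) and combine cost and probability."""
    parts = sorted(parts, key=lambda r: r[1] / (1 - r[2]))
    cost, p_all = F(0), F(1)   # p_all: probability that all earlier factors are true
    for _, c, p in parts:
        cost += p_all * c      # a factor is evaluated only if the earlier ones held
        p_all *= p
    return (".".join(r[0] for r in parts), cost, p_all)


x1, x2 = var("x1", 176, F(1, 2)), var("x2", 80, F(1, 2))
x3, x4 = var("x3", 40, F(1, 4)), var("x4", 42, F(1, 2))

inner = disj([x3, x4])       # x3 v x4
middle = conj([x2, inner])   # x2 (x3 v x4)
top = disj([x1, middle])     # x1 v x2 (x3 v x4)
```

With these inputs `inner` has cost 62 and probability 5/8, `middle` has cost 111 and probability 5/16, and the resulting ordering is x1 v x2.(x4 v x3), in agreement with the example.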
 
 If no information is given on the tests, i.e. we assume
 $c_i = 1$, $p_i = \frac12$ for all i, then the low cost expression for $x_2(x_3 \lor x_4)$
 can be either $x_2(x_3 \lor x_4)$ or $x_2(x_4 \lor x_3)$ but not $(x_3 \lor x_4)x_2$ because

 $\frac{c(x_2)}{q(x_2)} < \frac{c(x_3 \lor x_4)}{q(x_3 \lor x_4)}$
 
 For the case of disjunctions of conjunctions of singly
 occurring variables and for the case of conjunctions of disjunctions
 of singly occurring variables the algorithm above is proved to find a
 minimum cost tree. Although this fact cannot be extended to the
 general case, an exhaustive search in the general case may be made
 more efficient by using the above algorithm.

 Several improvements can be made to Slagle's algorithm
 concerning extending his results to variables appearing more than once.

 i) as we showed above some cases of variables occurring more
 than once can be eliminated by getting rid of nonexecutable paths.
 This holds for

 $X_1 \ldots X_n \lor X_1 \ldots X_n Y_1 \ldots Y_m = X_1 \ldots X_n$

 where we eliminated the double occurrence of $X_1 \ldots X_n$. This is done by
 eliminating nonexecutable paths of the program. This is also true for

 $X_1 \ldots X_n (X_1 \ldots X_n \lor Y_1 \ldots Y_m) = X_1 \ldots X_n$

 $X Y_1 \ldots Y_n \lor \bar{X} Y_1 \ldots Y_n = Y_1 \ldots Y_n$
 
 
 ii) the algorithm applies also to those cases of 
 disjunctions of conjunctions of variables appearing more than once 
 that can be expressed as conjunctions of disjunctions. (e.g. 
 
 We might conclude that the algorithm above might be used to
 get $\Pi_2$ quicker, either in its general form when the cost of applying
 each elementary test and the probability of its outcome are given, or
 in the special case of the algorithm when no information is given.

 $\Pi \underset{L_1 \cup L_2}{\Longrightarrow} \Pi_2$

 We define $L = L_1 \cup L_2$.

 $\Pi_2$ and $\Pi'$ both have elementary tests and negations only, and
 are equivalent by a set of transformations composed of topological
 transformations and logical transformations that consist of negations
 only. Logical transformations that include negations might expose
 identical subgraphs. Example:

 [Figure: two programs with a test t(A), related by a transformation on
 negations]

 Also logical transformations that include negations only might make
 flipping of tests possible. Therefore we will transform $\Pi_2$ to a
 program $\Pi_3$ equivalent to $\Pi_2$ by logical transformations only, such that
 $\Pi_3$ is equivalent to $\Pi'$ by topological transformations only. If $\Pi'$ is
 not known all choices for $\Pi_3$ must be considered, thus $\Pi_3$ is constructed
 nondeterministically. There is only a finite number of possible $\Pi_3$
 but some algorithm or heuristics based on the nature of the cost
 function might get $\Pi_3$ quicker. (In some cases all $\Pi_3$ have the same
 cost, but the costs might be different for some cost functions. For
 example, the cost of the test for zero might be less than the cost of
 the test for a non-zero variable.) A good algorithm will construct
 $\Pi_3$ such that tests checking the same values will have the same logical
 value, so that applications of T8 and T11 (if needed) might be possible.

 $\Pi_3 \underset{\{1,2,3,4,5,7,8,9,10,11\}}{\Longrightarrow} \Pi'$

 The algorithm to obtain an optimal program $\Pi'$ from $\Pi_3$ by
 using topological transformations only is the same as in Theorem 10.
 
 Since we showed that breaking down the tests into elementary
 tests reduces the cost of the program, a different algorithm from that
 of Theorem 15 could be designed which first breaks down the tests so as
 to get some equivalent program with elementary tests only, and then
 applies the topological transformations in order to get an optimal
 program. The algorithm of Theorem 10 cannot be used here because it
 does not find a program with the minimum number of elementary tests.
 Obtaining an optimal program with the minimum number of tests by using 
 topological transformations only is more complicated than Slagle's 
 algorithm since it involves a process of adding new tests and new 
 nonexecutable paths to the program and eliminating other tests that 
 become useless and paths that become nonexecutable. 
 
 We showed how Boolean manipulations of the tests eliminated
 nonexecutable paths and therefore reduced the cost of the program.
 Optimizing the program by eliminating nonexecutable paths can also be
 done when we have additional information on the tests. As an example,
 we will show how the use of the test for equality may eliminate
 nonexecutable paths. If we have the program

 [Figure: the statements $A \leftarrow \varphi CD$ and $B \leftarrow \varphi CD$ followed by a test for
 equality of A and B]

 then v(A) = v(B) and the right path can never be executed.
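The observation that v(A) = v(B) can be checked by computing symbolic values along the path. The sketch below is illustrative only: values are represented as nested tuples in prefix form, and the function name `path_values` and the operator name "f" (standing for the function symbol of the example) are assumptions.

```python
def path_values(statements, inputs):
    """Symbolic value of each variable after a straight-line sequence.

    Each statement is (target, operator, operands); a value is either an
    input name or a tuple (operator, value, value, ...)."""
    v = {x: x for x in inputs}
    for target, op, operands in statements:
        v[target] = (op,) + tuple(v[x] for x in operands)
    return v


v = path_values([("A", "f", ["C", "D"]), ("B", "f", ["C", "D"])], ["C", "D"])

# The branch of an equality test on (A, B) taken when the values differ
# is nonexecutable whenever the two symbolic values coincide.
unequal_branch_nonexecutable = v["A"] == v["B"]
```

Since both variables receive the same symbolic value, the branch for unequal values can be removed without changing the program's behaviour under any interpretation.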
 
 This case is similar to the case in which algebraic 
 identities are known to hold among operators. Here the additional 
 information is on the tests. Here the model of program schemata is 
 as before, only test statements include tests instead of test names. 
 The following cases are possible: 
 
 (i)

 [Figure: the test $=(X_1, \ldots, X_n, Y_1, \ldots, Y_n)$ with one branch marked]

 $v^{\ell}(X_i) = v^{\ell}(Y_i)$, $1 \le i \le n$.

 The marked path is nonexecutable and can be eliminated.
 (ii)

 [Figure: the test $g(X_1, \ldots, X_n) = g(Y_1, \ldots, Y_n)$ with one branch marked]

 where $v^{\ell}(X_i) = v^{\ell}(Y_i)$, $1 \le i \le n$, and g is any function. The marked
 path is nonexecutable.

 (iii)

 [Figure: the tests $P(X_1, \ldots, X_n)$ and $P(Y_1, \ldots, Y_n)$]

 $v^{\ell}(X_i) = v^{\ell}(Y_i)$, $1 \le i \le n$, and P is a proposition.

 (iv)

 [Figure: the test $=(X_1, \ldots, X_n, Y_1, \ldots, Y_n)$ followed by the test
 $g(X_1, \ldots, X_n) = g(Y_1, \ldots, Y_n)$, with one branch marked]

 g is any function. Here the values of $X_i$ and $Y_i$ might be different
 for some i (i.e. the expressions are different) but the vectors $(X_i)$
 and $(Y_i)$ might be equal under some interpretation. The marked path is
 nonexecutable.
 (v)

 [Figure: the test $=(X_1, \ldots, X_n, Y_1, \ldots, Y_n)$ together with the tests
 $P(X_1, \ldots, X_n)$ and $P(Y_1, \ldots, Y_n)$]

 P is a proposition.
 
 All these transformations preserve equivalence of programs. 
 A scheme for optimization similar to that of Theorem 15 can be 
 constructed for the cases above. 
 
 
 PROGRAM SCHEMATA WHICH ALWAYS HALT 
 
 The equivalence problem for the class of program schemata 
 which always halt is decidable, although membership in this class is 
 not. (See Luckham, Park and Paterson (8)). 
 
 The decision procedure of (8) is based on the fact that for 
 each program that halts under all interpretations we can construct 
 effectively an equivalent loop-free schema. We unwind the loops, 
 discarding any paths which are nonexecutable. If this process continues 
 indefinitely, the hypotheses of Koenig's lemma are satisfied, implying 
 that there is an infinite executable path and thus the program diverges 
 under some interpretation. Therefore for programs which always halt 
 this procedure must produce an equivalent, finite, loop-free program. 
 Thus the decision procedure of Theorem 2 applies also to programs which 
 always halt. 
 EXAMPLE 20 
 
 Produce an equivalent loop-free program for a program which
 always halts.
 $\Pi$:

 [Figure: flowchart of a program with a loop, containing a test $t(L_1)$ and
 statements that apply F to $L_1$ and $L_2$]

 $\Pi'$ is an equivalent loop-free program:

 [Figure: the unwound program; successive test statements $t(L_1)$ are
 annotated with the values $v(L_1) = FL_1$, $FFL_1$, $FFFL_1$, and the STOP
 exits with the corresponding values of $L_2$; one statement is marked (*)]

 The last test statement $t(L_1)$ tests the same value as does the statement
 marked by (*). Therefore the left path is nonexecutable and the procedure
 stops. We get an equivalent loop-free program.
 
 All the transformations that operate on loop-free programs can
 be used to simplify programs which always halt. We will denote by T12
 the transformation that maps programs which always halt to loop-free
 programs. Because T12 eliminates nonexecutable paths, every program
 which always halts is transformed by T12 to a loop-free program in which
 each path is executable.
 THEOREM 16

 Let $\Pi$ and $\Pi'$ be two programs which always halt. Then $\Pi \equiv \Pi'$ iff

 $\Pi \underset{\{1,2,6,7,9,10,11,12\}}{\Longrightarrow} \Pi'$
 
 The proof is obvious by using the results of the previous 
 chapters . 
 
 By using T12 first, and then applying the procedure of 
 Theorem 10 for optimizing loop-free programs, we might optimize 
 programs which always halt, under reasonable cost functions. The 
 following transformations that operate on programs with loops only, 
 and are equivalent to sequences of transformations of T1-T12, might be 
 used by the optimizing process. 
 T13 Moving loop-independent statements out of the loop

 Let $S_i$: $A \leftarrow \varphi B_1 \ldots B_r$ be a statement in a loop such that
 $B_1, \ldots, B_r$ are not defined by any statement in the loop. Then

 (1) if A is not referenced by any statement in the loop, $S_i$ can be
 moved out of the loop either backwards or forwards.

 (2) if A is referenced by some statement $S_j$ in the loop, $S_i$ can be
 moved backwards out of the loop.

 [Figure: a loop containing the statement $S_i$: $A \leftarrow \varphi B_1 \ldots B_r$]

 PROPOSITION 1

 If $\Pi \underset{\{13\}}{\Longrightarrow} \Pi'$, then $\Pi \underset{\{1,2,4,12\}}{\Longrightarrow} \Pi'$.

 PROOF:

 [Figure: T12 applied to the loop, giving a loop-free program in which the
 statement $A \leftarrow \varphi B_1 \ldots B_r$ is repeated]
 
 By T12 we get a corresponding loop-free program.

 (1) If A is not referenced by any statement in the loop, all the
 statements except the last are useless and can be eliminated by several
 applications of T1. The last statement can be flipped forwards by
 applications of T4 so that it will be moved out of the loop.

 (2) If A is either referenced or not by statements in the loop, we
 might use T2 several times to remove the redundant definitions of A,
 so that only the first definition remains. Then we use T4 to flip
 the first statement backwards to move it out of the loop.

 The reverse of T12 is used to transfer the loop-free program
 back to a loop program.
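The applicability conditions of T13 can be sketched as a scan over the assignment statements of a loop body. This is a simplified illustration, not the report's procedure: statements are (target, operands) pairs, test statements in the loop are ignored, and the function name `classify_t13` and the example body are hypothetical.

```python
def classify_t13(loop_body):
    """Find statements that T13 allows to be moved out of a loop.

    loop_body is a list of (target, operands) assignments. The result maps
    the index of each movable statement to its permitted directions."""
    defined = {target for target, _ in loop_body}
    result = {}
    for i, (target, operands) in enumerate(loop_body):
        # T13 requires that no operand is defined by a statement in the loop.
        if any(b in defined for b in operands):
            continue
        referenced = any(
            target in ops for j, (_, ops) in enumerate(loop_body) if j != i
        )
        # Case (2): a referenced target may only be moved backwards.
        result[i] = ["backwards"] if referenced else ["backwards", "forwards"]
    return result


body = [
    ("A", ["B", "C"]),   # operands not defined in the loop; A is used below
    ("D", ["A", "E"]),   # depends on A and E, both defined in the loop
    ("G", ["H"]),        # operands not defined in the loop; G is unused
    ("E", ["E"]),        # E is redefined from itself on every iteration
]
movable = classify_t13(body)
```

In this body the first statement may be moved backwards only, the third may be moved in either direction, and the other two stay in the loop.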
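 T14 Merging of identical subgraphs which include loops

 [Figure: transformation T14, merging two identical subgraphs that contain
 a loop]

 We assume that the loop always halts.

 PROPOSITION 2

 If $\Pi \underset{\{14\}}{\Longrightarrow} \Pi'$, then $\Pi \underset{\{11,12\}}{\Longrightarrow} \Pi'$.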
 
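 PROOF:

 T12 transforms $\Pi$ to a loop-free program, then T11 merges the
 identical subgraphs and the reverse of T12 transforms the loop-free
 program back to a loop program.
 T15 Elimination of nonexecutable loops

 [Figure: a program containing a loop that cannot be reached, and the
 program without it]

 The loop can never be executed.
 PROPOSITION 3

 If $\Pi \underset{\{15\}}{\Longrightarrow} \Pi'$, then $\Pi \underset{\{12\}}{\Longrightarrow} \Pi'$.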
 
 PROOF : 
 
 The procedure that transforms a program which always halts 
 to a loop-free program, eliminates nonexecutable paths. 
 T16 Merging of loop-free code with loops

 [Figure: transformation T16, with a test $t_3(L_2)$]

 The two programs give the same loop-free program. The program on the
 right is more efficient.

 PROPOSITION 4

 If $\Pi \underset{\{16\}}{\Longrightarrow} \Pi'$, then $\Pi \underset{\{12\}}{\Longrightarrow} \Pi'$.

 PROOF:

 T12 transforms $\Pi$ to a loop-free program, which is transformed
 to $\Pi'$ by the reverse of T12.
 
 The optimizing process uses T13-T16 together with the
 transformations T1-T11.

 T13 might be used to reduce the program.

 T15 is used in the stage where nonexecutable paths are
 eliminated from the program.

 T14 is used whenever T11 is used to merge subgraphs of the
 program. In this stage we can also apply T16 to merge as many loop-free
 codes with loops as possible.
 
 
 REFERENCES 
 
 (1) A. V. Aho and J. D. Ullman, "Transformations on Straight Line
 Programs," Conf. Record Second Annual ACM Symposium on Theory
 of Computing, pp. 136-148 (May 1970).

 (2) F. E. Allen, "Program Optimization," Annual Review in Automatic
 Programming, Vol. 5 (1969).

 (3) J. T. Bagwell, "Local Optimizations," Proc. of a Symposium on
 Compiler Optimization, pp. 52-66 (July 1970).

 (4) H. D. Huskey and W. H. Wattenburg, "Compiling Techniques for
 Boolean Expressions and Conditional Statements in ALGOL 60,"
 Comm. of the ACM, Vol. 4, pp. 70-75 (1961).

 (5) IBM System/360 Operating System FORTRAN IV Programmer's Guide,
 SLR Number C28-6817.

 (6) D. E. Knuth, The Art of Computer Programming, Vol. 1,
 Addison-Wesley Publishing Co. (1969).

 (7) E. S. Lowry and C. W. Medlock, "Object Code Optimization," Comm.
 of the ACM, Vol. 12, pp. 13-22 (1969).

 (8) D. C. Luckham, D. M. R. Park, and M. S. Paterson, "On Formalized
 Computer Programs," J. Comp. and System Sciences, Vol. 4,
 pp. 220-249 (1970).

 (9) M. S. Paterson, "Program Schemata," Machine Intelligence, Vol. 3,
 pp. 19-32 (1968).

 (10) J. R. Slagle, "An Efficient Algorithm for Finding Certain
 Minimum-Cost Procedures for Making Binary Decisions," J. of the ACM,
 Vol. 11, pp. 253-264 (1964).
 
 
 VITA 
 
 The author, Nurit Bracha, was born in Tel-Aviv, Israel. She 
 received her B.A. degree in Mathematics and Statistics from the Hebrew 
 University of Jerusalem in 1965, and her M.S. degree in Computer Science 
 from the University of Illinois in 1970.

 From September 1968 to January 1972, she worked as a research
 assistant in the Computing Services Office of the University of Illinois. 
 She is currently a research associate with the Department of Computer 
 Science and the Computer-based Education Research Laboratory, University 
 of Illinois. 
 
 She is a member of Sigma Xi. 
 
BIBLIOGRAPHIC DATA SHEET

 1. Report No.: UIUCDCS-R-72-516

 4. Title and Subtitle: TRANSFORMATIONS ON LOOP-FREE PROGRAM SCHEMATA

 5. Report Date: June, 1972

 7. Author(s): Nurit Bracha

 9. Performing Organization Name and Address: Department of Computer
 Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801

 12. Sponsoring Organization Name and Address: Department of Computer
 Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801

 16. Abstracts
 
 A formal theoretical approach toward code optimization is considered. 
 A program schema that models loop-free programs is presented and a complete 
 set of equivalence preserving transformations on loop-free programs is 
 found. A scheme for optimization is provided in which a sequence of these 
 transformations is applied to get an optimal code. These results are 
 extended to the model of loop-free programs which assumes that certain 
 types of algebraic laws hold among the operators, and also to the case in 
 which the tests are Boolean functions of elementary tests. 
 
 17. Key Words and Document Analysis. 17a. Descriptors

 Program schemata, loop-free program, equivalence problem,
 transformations, optimization.

 18. Availability Statement: Unlimited

 19. Security Class (This Report): UNCLASSIFIED

 20. Security Class (This Page): UNCLASSIFIED
 