mBBSBaSmm WBSSmSm iiWB sBBSS3SSSSSS& Wmm. BBMMmjj ■ JHnHlBWBIifOidAi mm Wism HHIlllll HrafiBn HHUiHllBSSU h& WRRKflBSB&B MHwiK ■nnnHMmiin BMB B BB :iU> ' J 1 1 lulllillf OH •«y. WM Hum! r HWImiMffliMnBi ffjSJf H nlHlHIsffi Instil ISa HHrawQuOOOaMi HBnl BHI3QB 111 llllifri wmi wKm WRBERmmm LIBRARY OF THE UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN E>LQ.%4 cop. 2. Digitized by the Internet Archive in 2013 http://archive.org/details/parallelismexplo518kras 7. *L UIUCDCS-R-72-518 7?/^S£ r PARALLELISM EXPLOITATION AND SCHEDULING by Paul W. Kraska June, 1972 DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN URBANA, ILLINOIS "HE LIBRARY OE IHES UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN* DCS Report No. U I UCDCS-R-72-5 I 8 PARALLELISM EXPLOITATION AND SCHEDULING by Paul W. Kraska University of Illinois at Urbana-Champaign Urbana, I I I inois 61801 June 1972 This work was supported in part by the National Science Foundation Grant No. US NSF GJ 27446 and was submitted in partial fulfillment of the require- ments for the degree of Doctor of Philosophy in Computer Science in the Graduate Col lege of the University of I I I i nois, 1972. PARALLELISM EXPLOITATION AND SCHEDULING Paul Wi I I iam Kraska, Ph.D. Department of Computer Science University of Illinois at Urbana-Champai gn, 1972 Parsing algorithms are developed such that syntactic tree-height is minimized, or reduced, with respect to application of the associative, commutative, and distributive (but not factoring) laws of arithmetic, on arithmetic expressions composed of well-formed sequences of the symbols add, subtract, multiply, divide, scalar identifier, and assign- ment, where it is assumed that a unique weight (i.e., program execution time) may be associated with each symbol. A parsing algorithm is also developed such that syntactic tree-height is minimized, with respect to application of the associative law, on expressions composed of a con- formable sequence of matrix products, where the matrices are not nec- essari ly square, and such that the overall number of computer opera- tions is minimized if tree-height is not affected. A new non-preemptive scheduling algorithm of a weighted-node acyclic dependency graph, having n nodes, on a system of k equi potent machines is developed such that all nodes are processed in the least 2 amount of time. The assignment process requires 0(n ) computer oper- ations. A new algorithm to determine k* = LUB(k) , such that the graph may be processed in the critical time, is also presented. Finally, it . is shown how to optimally schedule a system of m special purpose machines, where there are k. equi potent machines of each type, on a graph. ACKNOWLEDGEMENTS I wish to express my deepest gratitude to my advisor, Professor David J. Kuck of the Department of Computer Science of the University of Illinois for his guidance, suggestions, imagination, and role as the devil's advocate. I also wish to thank Jay J. Wolf of the Burroughs Corporation, Paoli, Pennsylvania and Arthur B. Carroll, formerly of the Illiac-IV project at the University of Illinois, for making it possible for me to pursue an education at the University of Illinois. Finally, I'd like to thank my wife, Sally, and my children, Jenny, Becky, Matthew and Eric, for being patient. IV TABLE OF CONTENTS Page 1 . I NTRODUCT I ON I 1.1. History of Parallel Computation I 1.2. Scope 3 1.3. Reader's Guide to Chapters 2, 3, and 4 8 2. ARITHMETIC EXPRESSION PARSE TO REDUCE SYNTACTIC TREE-HEIGHT. 12 2.1. I ntroduction 12 2.2. Notation 14 2.3. Tree-height Minimization of Monolithic Sums and Products 21 2.4. Tree-height Minimization of Monolithic Expressions With Division 24 2.5. Distribution 28 2.6. Conclusion 48 3. TREE-HEIGHT REDUCTION FOR A SEQUENCE OF MATRIX PRODUCTS 50 3.1. I ntroduction 50 3.2. Scalar Matrix Product Subexpressions and Canonical Form 51 3.3. Conditions for Tree-height Minimization 54 3.4. The Sequence of Matrix Products Parsing Algorithm 62 4. SCHEDULING FOR A WEIGHTED-NODE DIRECTED GRAPH 66 4.1. I ntroduction 66 4.2. Lower Bound on the Number of Machines Required 67 4.3. Scheduling Algorithm for k Machines 71 V Page 4.4. Computational Complexity of Algorithm a 79 4.5. Least Upper Bound on the Number of Machines Requi red 84 4.6. A Fast, Non-optimal Scheduling Algorithm 85 4.7. Multiple Special Purpose Machines 90 APPEND I X — TREE HE I GHT-REDUCT I ON PROGRAM 98 LIST OF REFERENCES I 30 VITA 133 I . I NTRODUCT I ON I.I. History of Parallel Computation Although Charles Babbage first proposed devices which would achieve parallel computation nearly 150 years ago in his Analytic Engine (multiple arithmetic units and memory look-ahead (25)), until quite recently computer manufacturers have been content to build machines which run in a basically sequential manner. They still exhibit a large amount of lethargy, notably IBM, in the development of parallel processing systems even though the computer community has recognized the need to build computer systems with processing power which ex- ceeds that possible by component speed-up advances. We have progressed from roller bearings and gears to transistors and our computers are still not fast enough to satisfy our needs. However, the absence of parallelism is not to be wondered at; up until now, the expertise to exploit parallelism efficiently has simply not been available. It is expected that the contents of this paper will shed new light on the utilization of parallel computing systems. During the past decade the first steps in parallel computation have been taken by Burroughs Corporation and by Control Data Corpora- tion. Their reasons for doing so were, however, quite different. Burroughs accurately recognized that in the conventional processing system (sequential) a large percentage of the machine at any instant of time was idle and performing no useful work. In order to increase subsystem utilization, and thus increase cost-effectiveness, Burroughs proposed and built the B-5000 in I960 which operated in a multi- programming/multiprocessing mode. Control Data, on the other hand, was not concerned with system cost, but with system speed. They de- veloped a machine which could perform several arithmetic operations as well as memory operations simultaneously, with an ad hoc para I lei ism detection device controlling the functional units operating in parallel, and thus at least double the amount of work performed over that of a comparable conventional system. This design concept also increased the cost-effectiveness. The CDC 6600 was introduced about 1965. While the market place for these systems is quite different, let us place them in the "general purpose computer" class, for the design concepts have since been copied by many other manufacturers. We should also mention a very special kind of parallel computer system class which may be available for the first time this year (1972). The Control Data STAR computer, a pipe- line vector machine, and the University of Illinois ILLIAC IV (built by Burroughs), an array of uniform processors, are capable of performing a large number of identical operations (e.g., add) simultaneously. Texas Instruments is also building an Advanced Scientific Computer (ASC), which is re- markably similar to the STAR computer. While these machines perform admirably on a special class of problems, their parallelism is not easily exploited if the processing algorithms do not require many simultaneous operations of the same type, or if the input data is not well-ordered, although storage techniques have been developed to amel- iorate the latter problem (19). Despite this recent appearance of concurrent processing, none of the computer manufacturers, or users, has been successful in exploiting the existing, or proposed, machines to achieve the parallel computa- tion inherent in the machine design. It is still necessary for some clever programmer, applying some heuristic algorithms which are intu- itive only to him, to design a program which exploits the parallelism. 1.2. Scope This paper attacks two small, but not insignificant, problems in the field of parallel computation. Firstly, we show how to modify and parse an arithmetic expression such that syntactic tree-height is min- imized, given that the computer operations (e.g., fetch, add, multiply, ...) each require a specific amount of time, not necessarily equal, during program execution. Other investigators have solved the tree- height problem assuming unit-time computer operations, but this is the first solution based on weighted operations. Clearly, syntactic tree- height tells us how quickly an expression may be evaluated by a system of parallel processors, given a sufficient number of processors. But what do we mean by "sufficient number of processors" and furthermore, what is the best way to use the processors to evaluate an expression in the minimum possible time? This is the second problem solved in this paper. We show how to schedule a system of equi potent processors, or machines, on a job, which may be described not only by a weighted-node tree, but also by a weigh ted- node, acyclic directed graph, where the scheduling is not preemptive, such that the job is completed in the least amount of time. (We leave the problem of finding common sub- expressions, and thus changing a syntactic tree to a graph to inves- tigators such as Breuer (5).) The scheduling algorithm is new, but not the first algorithm to optimally schedule a sufficient, or insuffic- ient, number of machines on a weighted-node graph. The scheduling 2 algorithm requires 0(n ) computer operations, where n is the num- ber of nodes, to make the assignments. We also show, in a new way, how to determine a lower bound on the number of machines required in order to process the job in the critical time of the graph. Finally we show how to optimally schedule a system of special purpose machines, with one or more machines of each type, on a weighted-node graph. We solve the problem of tree-height minimization in the following way. All arithmetic expressions, except those including a sequence of matrix products, are analyzed by application of the distributive law of arithmetic (but distribution is made only if tree-height is reduced), and the associative and commutative laws; a sequence of matrix products, where the matrices are not necessarily square, is analyzed by application of the associative law of arithmetic. This is also the first solution to the tree-height minimization problem of the product of a sequence of non-square matrices. The term "minimize" in the context of tree-height, has a very spec- ial meaning in this paper. Within the constraints of application of the distributive, associative, and commutative laws only, "minimize" has the usual meaning. Factoring is a valid arithmetic operation (i.e., the converse of the distributive law); however, since factoring is a fairly complicated process, it is not included in our analysis pertaining to tree-height reduction. Hence, if we were to permit the application of a factoring procedure also, then what we have found to be "minimal" is in fact not. For example, Pan (30) has given a lower bound on the syntactic tree-height of an nth degree polynomial, P (x) . The lower bound is probably only achievable when P (x) is n k r x n written in factored form, viz : P (x) = (x - x.)(x - x )...(x - x ) . n l z n In the syntactic tree of this expression, n subtractions could be performed simultaneously to evaluate all factors, and then the multi- plications could be performed in log^(n-l) steps to complete the evaluation of P (x) . Muraoka (28) and Maruyana (24) have found n y methods which were redundantly found by Munro and Paterson (26), to compute polynomials such that syntactic tree-height is reduced without the necessity of finding the polynomial roots. However, these syn- tactic trees are not achievable from Horner's polynomial form by the methods described in this paper without occassionally factoring out common subexpressions. The history of tree-height reduction, and hence parallelism expos- ure, in arithmetic expressions is not slight. Baer and Bovet (2 ) and Muraoka (28) develop methods to reduce syntactic tree-height using the three laws of arithmetic by assuming that all operations require unit time for evaluation. While Muraoka and Kuck (29) recognized that a weight could be given to a matrix product, their tree-height reduction algorithm only deals with extended matrix product expressions involving square matrices and vectors. It is believed that Hao (10) was the first to study the implications to syntactic tree-height of arithmetic expressions when multiplication is a >_ I times as costly (in time) as addition. We solve the problem of optimal assignment of weighted nodes from a graph to a parallel system of k machines by judicious application of the Critical Path Method principles, as described by Kauffman ( 18). During the assignment process, the critical path of the graph is dynamic, and at any instant, nodes on the current critical path are given an assignment potential greater than nodes off the cri- tical path. A sufficient, but not necessary, condition that the algorithm is indeed optimal is that all node-weights are of the same order of magnitude. This condition appears in the scheduling algorithm as a one- level look-ahead; if the node-weight range is more than an order of magnitude, to ensure optimal ity it would be necessary to incorporate a mu I ti p le- leve I look-ahead in the algorithm. Scheduling problems are as old as the industrial revolution. By the law of entropy, our search for more efficient machine utilization has not diminished with increased technology. In recent times, this interest is demonstrated by the development of the PERT system for the U.S. Navy's POLARIS project. Johnson (]3) has studied many classic models in the field of scheduling. Conway, et a l t ( 6) have studied probabilistic systems and queuing theory. Muntz and Coffman (27) have studied pre-emptive scheduling. Hu (12) was the first to give an opti- mal scheduling algorithm for a job described by a unit-node tree. Schwartz (34) presented a non-optimal scheduling algorithm (but useful nonetheless) for a job described by a weighted-node graph. Ramamoorthy, et a I, (31) were the first to give an optimal scheduling algorithm, using a sufficient number of machines, for a job described by a weighted node graph. Their results were developed independently of this author, however. It is evident that the interest in scheduling problems is alive, and will certainly continue. Let us reflect, for a moment, on the implications of the develop- ments in tree-height reduction and scheduling, as given in this paper, on compiler and machine organization. Clearly, a source-code program must be analyzed at compile time so that arithmetic expressions may be parsed to minimize syntactic tree-height. Similarly, the scheduling algorithm must be exercised on the object-code before run-time. Whether a machine has a system of parallel processors which are governed by an ad hoc scheduler (as the CDC 6600) or by a tag-directed method (as the IBM 360/9 I ) , is immateri a I ; it wou I d be poss ib le to use the f u I I po- tential parallelism of either system if proper analysis is made. Also, since it is now known how to utilize a parallel system, we need not be afraid to build them. In other words, we can study programming prob- lems and answer questions like: "How big should the parallel system be, and how should the processors communicate with one another?" 1.3. Reader's Guide to Chapters 2, 3, and 4 In order to facilitate reading this paper and to indicate the development in each chapter, we include the following reader's guide. Chapters 2 and 3 present algorithms to minimize syntactic tree-height of scalar arithmetic expressions and a sequence of matrix product expressions, respectively. Chapter 4 presents the scheduling algo- ri thm. The notation used throughout Chapter 2 is presented in section 2.2. The important result in this section is Theorem 2.1, where it is proven that the described discrete combination method minimizes tree- height. A concise and clear example of discrete combination, using the developed notation, is given at the end of the section. Section 2.3 shows how to minimize monolithic expressions of the form: E = £t. and E = Hff. , and section 2.4 shows how to minimize mono- lithic expressions of the form E = |ff./f . Ample examples are given in each section. Section 2.5 is quite lengthy, but is logically developed. First we investigate expressions of the form E = (£t.)*f and E = (£t.)/f , where the terms are not monolithic, and show how to minimize tree- height through distribution of a monolithic factor f over the sum of terms. We also prove that we need only consider a monolithic factor f for distribution. There are two conditions for distribution to reduce tree-height; the second case is applicable for expressions of the forms E = Jt. + (Yt!)*f or E = Yt. + (Yt')/f . We then show how distribution can be used to minimize non-monolithic expressions of the forms E = £t. and E = "]~f. . (Note that "minimize" here is used without considering factoring of common subexpressions; e.g., the 5 6 7 syntactic tree-height of E = ax + bx + ex is larger than the tree- f 2 5 height of E = (a + bx + ex )x .) Finally, we give an algorithm to evaluate expressions of the form E = (J]~f . ) /(J]"f ' ) such that tree- 1 J height is reduced. Note that no claim for minimized tree-height is made in this case; the numerator and denominator may be i n so many forms that it is difficult to claim minimality. If we allow subtrac- tion to freely replace addition in these expressions, it is clear that any scalar arithmetic expression may be found among these presented, except a continued fraction. A sequence of matrix products, where the matrices are not nec- essarily square, may also be parsed such that syntactic tree-height is minimized. This is the subject of Chapter 3. In section 3.2 we show how to recognize matrix subsequences which reduce to a scalar; since scalars commute with matrices, we remove these subsequences to obtain the canonical form of matrix sequences. In section 3.3, the conditions for tree-height minimization are presented. We first show that matrix products are similar to scalar products (except that the commutative law does not hold) and we prove, in Theorem 3.1, that tree-height is minimized if and only if slack is minimized. We then show that when the sequence is a product of three transformations, i.e., 3 E = Ti , special consideration must be given to minimize tree- 1 = 1 height as well as the number of computer operations. Finally, we show how to associate the transformations to ensure that slack is minimized and to handle the special boundary conditions concerning the transfor- mations at either end of the matrix sequence. Section 3.4 is the parsing algorithm which is based on the principles developed in Section 3.3. In section 4.1 of the scheduling chapter, we lay the ground rules for the type of oriented, or directed, graph we consider for scheduling, In section 4.2, Theorem 4.1 establishes a lower bound (LB) on the number of machines required to process the weighted nodes in the criti- cal time of the graph (i.e., the longest path between the set of starting and the set of terminal nodes). We also discuss the number of computer operations required to determine the LB. In section 4.3 several lemmas establish the optimal ity of Algorithm a for the assignment of the weighted nodes from the graph to a system of k parallel, equi potent machines. In section 4.4, we show that the over- all number of computer operations required to make the node assign- 2 ments under Algorithm a is 0(n ) . Section 4.5 shows how to use Algorithm a from section 4.3, and the LB, from section 4.2, to es- tablish a least upper bound on the number of machines required 1 1 (i.e., sufficiency) to process the graph in the critical time. The last two sections discuss a non-optimal, but faster scheduling algorithm and how to use Algorithm a to optimally assign nodes to a system of nonequ i potent, parallel machines, respectively. 12 2. ARITHMETIC EXPRESSION PARSE TO REDUCE SYNTACTIC TREE-HEIGHT 2. I . I ntroduction Recognition of parallelism within an arithmetic expression having distinct operator weights in order to produce reduced height in the syntactic tree is discussed in this chapter. The time that a system of parallel processing elements requires to calculate an arithmetic expression is governed by its tree-height. Other investigators have developed parsing algorithms to achieve this end by assuming that all operations require unit time. Baer and Bovet (2) have successfully reduced tree-height of an arithmetic expression containing addition and multiplication, by using the associative and commutative laws. They used a stack mechanism to parse a + b + c + def + g + h into (((a + b) + (c + g)) + (d e) f) + h It is easily seen that four steps are required on a parallel machine. Muraoka (28) added the distributive law to the work of Baer and Bovet. Two situations permit distribution: a) when holes in the tree are present, e.g. , 13 A = a(b c d + e) is distributed as A c ' = abcd+ae | \/ | I \/ \/ \/ o * o o * * * \ \/ / V / O * O v o \ \/ \/ o + \/ * id where A is executed in four steps wh i I e A is executed in three steps on a parallel machine; b) when space in the branches are present, e.g., A = a(b c + d) + e is distributed as A =abc+ad+e \ V I / o * o c \ V / o + o \/ / * o \/ \/ 1 \/° \l 1 V \. + where the height of A is 4, while the height of A is 3. Integer sets identify the existence of free nodes on the tree; set operations are used to determine if the proper conditions exist to distribute. When the operators do not have unit weight, then the theories of Baer-Bovet and Muraoka for expression parsing such that tree-height is reduced are no longer valid. Observe, for example, when, add-weight, vi - 2 and mu I ti p I y-weight, w = 3, that one way to parse E=a+b+c+ def + g + h such that tree-height is reduced is: (c.f. above) 14 E' = (((a + b) + (c + g)) + h) + (d e)f 0- In this case, the minimum tree-height of E is eight whereas the tree- height of the Baer-Bovet parse with these operator weights is ten. Clearly, any parsing scheme which reduces tree-height is a function of the grammar as wel I as the operator weights. We have already used the terms "minimize" and "reduce", and will continue to use them throughout this chapter; the reader should not confuse their meanings. 2.2. Notation Certain mathematical tools are necessary in order to perform the expression analysis with weighted operators. The notation used here is evolutionary from that of Muraoka. Let hCEH be the minimum tree-height of E, relative to the reduction algorithm, and let e[_E^\ - 2 n LEJ be the effective length of E. Also let h CeH be the g-height of E, where a x a,m,d for add, multiply, or divide respectively, w be the a-operator h [E] a wei ght , and e [E[] = 2 be the effective a- length of E . Def i nition 2.1: h [E] ='i£S That is, no matter what E is, we can find out how many a-steps would be 15 required if E consisted of a-comb i nations of atoms (i.e., simple var- iables) alone. For example if hCEH = 8 and w = 3, then h L"ED = -2- or 2 that 2 — multiply steps would be required to achieve a tree-height of Definition 2.2: w is an integer. a There is no loss of generality with such an assumption. Lemma 2.1 : hF_El] and e[_Ej are integers. Since the a-operations are discrete, it follows that h^El] and hence eL~EH are integers, which is consistent with Definition 2.2. l/w a Lemma 2.2 : h TeI] is a rational number, and e FEU = (eCEH) is a real number. The fact that e CeH is real poses some problems. Some of the subse- a quent analysis requires an evaluation of £e CE-ll , where the E. are subexpressions of E . Since these numbers are real, their summation on a finite word-length machine will lead to errors. Furthermore, it would be nice to avoid evaluation of fractional exponential functions as much as possible. The following analysis permits a transformation of these reals to the integer domain. Let (Z,+) be the additive group of the integers Z . The residue class 0.. = {..., -2w , -w , 0, w , 2w ,...} is also an additive a a' a a a 16 group. Let qO = { . . . , -2w + q, -w + q, q, w +q, 2w +q,...} be the coset defined by q e Z . Then if X = {xlx = V2 q ' /wQt , q e qO q • * ' H i M a for some q} , it is easy to show that (X ,+) is a group which is isomorphic to (Z,+) . The q which are restricted by < q < w — a define all possible cosets (or residue classes) and only these q are used. (X ,+) is the group generated by the q coset; they are called generically, the "coset groups." Note that each e a EEH e (X ,+) for some q . To evaluate £e CE.H ordinary addition may be used to combine any two elements from the same coset-qroup. That is if e TE-Il e (X , + ) and a r a i q' e [E ; D e (X , + ) then e „CE T ] + e [E -J e (X . + ) . However, if two «* J q *-* i ot J q elements are in different coset-groups, then some other method of com- bining them must be performed. Definition 2.3 : Let T or + mean summation within coset-groups. Definition 2.4 : Let i ~j mean summation between coset-groups viz: starting with the smallest coset-group element (i.e., the smallest among the reals), add it to the next larger element as if they were equal in size performing the + summation where necessary, recursive- c ly, until a single 2's power number remains. The syntactic analysis in the following paragraphs develops trees out of subtrees by discrete combination. Suppose subexpressions E. and E„ are being combined by some operator a . Then hllE. a E H = w + max(hHE J,hf_E I]) is a discrete combination of E. ^ a z 17 and E In terms of effective length, eCE, a El = w +max(h[E,],h[Ej) max(h [E,],h [Ej 2 a ' " 2 and e CE, a E ? : = 2(2 a a 2 ) . Def- a I ^ initions 2.3 and 2.4 are used to perform this discrete combination; in Definition 2.4, a discrete summation is used whenever the terms are of unequal size. In the following analysis of discrete summation, all literals are positive integers. P., P., Def i ni tion 2.5 : The discrete summation of 2 l/r and 2 J' r is 2*max(2 ' /r ,2 j/r ) . Clearly, when p. ^ p. there is a certain amount of waste when i i P i/r P i/r the discrete summation of 2 and 2 J is performed. This waste is ca I led si ack. P / P . Definition 2.6: If p > p we let slack be s = 2 k/r - 2 J /r . P k- P j j,k Hence, the discrete summation may also be expressed in terms of Pj, P:/ r slack. If p. > p. , then the discrete summation of 2 and 2 J is 2 Pi/r + 2 P J' /r + s. . = 2 ]/r + 2 P J/ r + (2 Pj/r - 2 P ' /r ) = 2(2^) . It is desired to perform the discrete summation of more than two terms of the type 2^ . The following lemma follows directly from the def i ni tions. P i/r Lemma 2.3 : The discrete summation of terms such as 2 , for i = I ,2, . . . ,m is m p. . m- I y 2 ,/r + y s. . = 2P /r 1=1 1=1 ''J +l for some p . p/r Before attacking the problem of minimizing 2 , let us examine the following lemmas. m p n n . . Lemma 2.4 : [ c 2 ' ' r may be expressed as £ c 2 J where the n. i=l j=l J are unique. This follows directly from the definition of V and the fact that (X ,+) is isomorphic to (Z,+) for q = 0,l,...,r-l . Since the n. are unique, they may be ordered such that n. < n„ < J < n n Vr n i/r Lemma 2.5: If n < n ands =2 -2 J then j k j,k k-l s = J" s. . . j> k i=j '>' + l Lemma 2.6 : If the conditions of Lemma 2.5 are satisfied, then I /r s < s < ... < s , and s. ... < 2" s ; ,, ..«, . 1,2 2,3 n"l,n i,i+l - i+l,i+2 The proof of Lemma 2.6 depends on the fact that n. < n. - I . r i — i+l Now the following theorem may be proven. P i/r Theorem 2. I : No discrete summation of 2 , i = l,2,...,m yields m p. . a smaller 2 p/r than that found by \ c 2 l/r ~| i = l c Proof: By Lemma 2.3, the discrete summation produces a minimum 2 is minimized. It can be shown from the m Ic2 p/r m-l f and only if £ s i = l i ,1 + 1 definitions and Lemma 2.4 that m p. , v i/r n n. , n- 1 I 2 l/r + Is I , l+l In other words, the discrete summation produced by the method c includes s in the slack-sum and then eliminates part or all of 1,2 s from the remaining slack; i.e., at least 2s ? is removed z , J i , z from the remaining slack. By Lemma 2.6, s. - is the smallest slack and at each recursion of ~| the smallest of the remaining slack is added to the slack-sum. On the other hand, any other dis- crete summation not of the two lowest terms can remove at most s from the system, but more than S. is added to the slack-sum, as can be shown by Lemma 2.5, Q.E.D. There is some additional notation required for division; however, it is more convenient to include it in section 2.4. An example of the \ and discrete combination methods will clarify these functions. Let r = 2 and p. = 0, p~ = 5 , P., P,, p 3 - 2 , p 4 = I , and p = I . The set {2 l/r ,2 :>/r }C(X , + ) and r 9 P 2/r 9 P 4/r 9 P 5/r, <—,., ,. u 9 P l/r ^3/r fv ,. , {2 ,2 ,2 } L-(X , + ) . Hence 2 +2 e (X , + ) and 2 Vr + 2 P4/r + 2 P5/r B ( X| .+) , or 2 0/2 + 2 2 ' 2 e (X Q , + ) and 9 5/2 J/2 _l/2 5/2 , „3/2 2 ' +2 +2 =2 +2 e(X,+) respectively. In other words 20 v 5 p i = l It 2 0/2 ♦ 2 2/2 c (X Q ,+) , 2 5/2 + 2 3/2 g (x j (2.1) Since the (X ,+) are isomorphic to (Z,+) , we may write (in binary) - 0/2 2/2 - 11:0 in place of 2 +2 e (X , + ) , where indicates that 2/2 (X ,+) is the group, and 10 represents 2 , 01 represents ; . (Note that 2 0/2 + 2 2/2 ■* 01 + 10 ■ II.) Simi larly write )f ,5/2 All , J/2 110:1 in place of 2 5/2 + 2 3/2 e (X,,+) . (Also note that + 2 + 2 100+001 +001 = 110 .) Hence, Ic2 p i/i I I :0 I I0:T (2.2) s equivalent to equation (2.1). Eva I uation of Ic.2 Pi/. proceeds in the following way: 0/2 Using the notation of equation (2.1), 2 ' is the smallest coset- group element and 2 2/2 is the next larger. Hence 2 0/2 +2 2/2 ,4/2 , by definition, and = i p i/r 2 0/2 +2 2/2 e ( x o)+) 2 5/2 +2 3/2 e (x } 2 4/2 e(X ,+) 2 5/2 +2 3/2 e(X,,+) In the notation of equation (2.2), L2 Pi/, -c = 1 I 1:0 I I0:T 00:0 I I0:T Now 2 is smallest and 2 ' is next larger. Since 2 3/2 + 2 4/2 or, l/ ]/r i = l ,6/2 2 4/2 e(X , + ) 2 5/2 +2 3/2 e(X,,+) 100:0 I I0:T 2 6/2 e(X o , + ) 2 5/2 e(X,,+) 000:0 I00:T in the notations of equations (2.1) or (2.2) respectively. Finally, ? P Ic 2 = 1 i/r 2 6/2 e(X Q ,+) 2 5/2 e(X,,+) 2 8/2 e(X Q ,+) = 2 8/2 or Ic 2 = 1 l/r 1000:0 10000:0 = = 10000:0 100: 1 0: 1 ,8/2 2.3. Tree-height Minimization of Monolithic Sums and Products In this section, we consider arithmetic expressions which are either a sum of terms or a product of factors. The productions which specify a grammar of these expressions are the following: E+ F|T , F + a|(T) |F * F , T -> FlT + T . Production F represents a product of factors, production T represents 22 a sum of terms, and terminal a is an atom from the set of distinct variable names. Furthermore, it may be written for all factors f i and terms t. J m; I t , and J-l J J 1 = 1 The f are called monolithic if factoring among the t. is not con- i j sidered and the t. are called monolithic if distribution amonq the J f j is not considered. In other words the internal characteristics of a monolithic f. or t. are not considered, i i The theorems below follow directly from the definitions and Theorem 2.1. Theorem 2.2: If E = ) t and each t. is monolithic, then i = l ] ~w h[E] = log 2 and hCEj is minimized. i = l Ic e a CtJ Theorem 2.3: If E = i f. and each f is monolithic, then f = l ' hEEH = log. 'V m 1 - i m and hCEll is minimized 23 Let us demonstrate the use of these theorems to evaluate the ■ minimum tree-height, and at the same time the syntactic tree, of the expression E=a+b+c+ def + g + h . Let w = 2 and w = 3 , Then E = £t. , where t = a, t„ = b, t, = c, t. = def, t(- = g and t, = h . All of these terms are simple except t* which is a product of factors, i.e., t. = |"Jf. , where f = d, f = e, f = f . Suppose that a memory fetch of a variable, v , requires zero time, i.e., h£v!] = ; then e[]vll = I . For each subexpression E. , let us represent e HE.!] as a binary P i/r inteqer in (X ,+) in the followinq way. Since e CE.l - 2 , then y n q a y a r include 2 ' , where n. = p. , (i.e., integer divide) in the q coset-group, where q = p. (modulus r) . Then the analysis of t yiel ds, lc^-? +1+1 :0 0: i 0:2 11:0 ' = 0:1 I 0:2 J for the 3 coset-groups, or residue classes, 0, I, and 2 respec- tively. ieY(f\ = e Cell = e [fH =1 in coset-group 0* .) The 11:0" ' m 1 - m m a v represents 2 + 2 ' in coset-group (X ,+) . Thus, ect : = Ic^fj] " 3 1 1 :0 3 = 0:1 = c 0:2 r- 00:0 0:J_ 0:2 (2 6/ V - 2 6 and e [tj = e[tj l/2 = 2 6/ ' = 1000:0 in (X„, + ) , where w = 2 . a h 4 a Then, 24 Le a Etn = l+l+l+IOOO+l+l :0 -c a 1 -' i 101 :0 for the 2 coset-groups and I . Thus, i2 eCED = lAnt,: 101:0 2 10000:0 2 (2 8/2 ) 2 . = 2 8 0:1 0:1 and h[E] = log eCEl] = 8 . Note that, in addition to finding hCEj , the order in which the lists are combined during the £ and ] operations indicate the exact way that the terms in E are combined to achieve hQG . For an additional example see Figure 2.1. The subtraction operator has the same precedence as addition in our grammar for arithmetic expressions. If it is assumed that the operator weights are the same, i.e., w = w , then Theorem 2.2 may be ^ a applied to expressions which have a mixture of terms combined by addition and subtraction. If all the operators in the expression are n subtraction, i.e., the expression may be written as E = -£t. , then n the unary minus may be eliminated by writing E' = 0-£t. . Depending upon the nature of the expression E , the tree-height of E' may or may not be greater by w , than that of E . 2.4. Tree-height Minimization of Monolithic Expressions with Division The division operator has the same precedence as multiplication; however, on most computers w , > w . If it is desired to parse an r d — m r 25 t = cd: e[t 3 D = t = e(f+g): e[f+gD ™ 3 l + l :0 10:0 0: 1 = 0:1 0:2 c 0:2 " 2 l + l :0 10:0 0:T c 0:1 = (2 3/3 ) 3 = 2 3 = (2 2/2 ) 2 = 2 2 e[t 4 ] 3 1 :0 0:0 0: 1 = 0:1 1 :2 c 10:2 (2 5/3 ) 3 . 2 5 E = 4 e[E] 1 + 1:0 0+100:1 10:0 I0:T 0:0 000:1 - (2 7/2 ) 2 - 2 7 0- I- 2- 3- 4- 5- 6- 7- E = a + b + c^d + e*(f + g) \/ \/ \ \/ Figure 2.1: Syntactic Tree of E=a+b+cd+e* (f + g) with Coset-group Analysis. 26 expression of the type E = "ff./f such that tree-height is minimized, it is necessary to select the right factor, or product of factors, in the numerator for combination with the divisor in a non-obvious way. Def i ni tion 2.6 : I f E-fr f s /f H , then e m [E] = :ce m :fi> d e m [f d : The ...+... I symbols are interpreted as follows: Recall that p n n j /w ) e Cf;H = V„ 2 where n. < n. , , . Let us simplify nota- . L c m i . L Q i i+l K i=l i=l tion by setting q. = n./w and w' = w ,/w . Then, I ) If q , < q . , then ^| ^d 2 + J 2 d q,+w; q] =2 = 2 , and e [ED = m n q q Ic 2 + c 2 i=2 2) I f qj <_ q < q , then 2 ' + , 2 d q H +w' q' = ? d d = ? I d - z ■ z , and e [Ej = m n q f q Ic2 +c 2 i=2 (If q_ does not exist, c 2 assume that q < q ? .) 3) If q < q , < q^ and q < q + I , then if q < q,+ w' then v — n a n 3 n d 2 3 2 d o q 2 q d 2 +^ 2 d 9 q d +W d 9 q 2 =2 = 2 , and m n q, q, qj I c 2 + c 2 ' + 2 i=3 (If q, does not exist assume that q, < q, . ) d 3 27 4) Otherwise, combine the two smallest terms under the q l q 2 operation (i.e., eliminate 2 and replace 2 by q 2 +i , reordering terms if necessary) and go to step I. Theorem 2.4: If E I i th ic, then = J nvf d > m h[Ej = log, and hCEU is minimized. w and the f . , f , are mono- d ■ m id e„[f ; ] +, e[f H ] i = l mi d m a Proof : It is only necessary to show that the algorithm in Definition 2.6 produces minimal e CeH . r m Case I) q, < q.: Slack-sum is minimized when f /f is first d n i Id since any other order of combination includes some slack more than once in the slack-sum (c.f. Theorem 2.1). Case 2) q : q, < q : It is also obvious that f|/f H minimizes the s lack-sum. Case 3) q„ < q < q • a) I f q„+ I < q then combininq (f * f-0 M 2 - M d 3 H 2 - M d I z yields Case 2 which is minimal, b) q < q„ + I: If q, > q + ! + w' then no matter how f , f , f, and f d are combined, hjlf * f * V f d^ = P 3 + ' • lf q + I + w' >_ q-^ _> q ? + w l then it is best to combine (f. * f ) first since h [(f, * f /f .) * f,U = I m I 2 d 3 qd + w d + 2 1 q 2 + w d + 2 = h m D q, (and hence q , + w' >_ q-J then 28 hjcccf, »f 2 )/f d ) *^ = q 2 + w^2>h m Cf | * (f 2 /f d ) *f 3 n fq 3 + 2 if q 3 > q + w« - I q , + w' + | if q < q + w' - I Q.E.D. Figure 2.2 demonstrates the use of Definition 2.6 for an expres- 6 sion E = jf./f for two cases: a) where 1*0.1] = , lOoll = h[f H = I , h[f 4 J = h[f 5 ] = 2, h[f 6 U = 15 , and h^] = 10 ; and b) where hO^D = 17 instead of 15 and otherwise the same as case a). 6 These two syntactic trees of E are in fact of minimal height, as the reader may wish to verify. 2.5. Pi stribution In this section we consider expressions which are of the types (£+.) * f > /f , and (If.) * ($V.) . Lemmas are given which show how to determine when tree-height is reduced by distribution of a factor over a sum of terms. These lead to algorithms which reduce tree-height for expressions of the types jf. , £t. , and (TTf- ) /("[Tf ' ) when the factors and terms are not monolithic, i.e., i , J each f . = Tt Consider E = (a + bcd)(e + f) and the partially distributed form E = a(e + f) + bcd(e + f) . Then syntactic trees for E and E with w = 2 and w^ = 3 are: a 3 29 a) E' - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- I I- 12- 13- 14- 15- 16- 17- 18- 19- 20- 21- 22- 23- ((f * (f * f )) / I 2 3 ) * ((f * f c ) * f 4 5 ( b) E = - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- I I- 12- 13- 14--. 15- 16- 17- 18- 19- 20— • 21 22- 23- 24- (((f * (f * f )) * (f. * f )) / f J ) * 12 3 4 5 d 30 E = (a + b * c * d) * (e + f) , E=a*(e+f)+b*c*d*(e+f) - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- [ \l I V \ \ o v / \ \/ o + / \/ i y \ v \ In this case, E = f * f_ , and Y\\jT\ - I I > whereas E = t + t^ and hL~E H = 10 . The difference in tree-height is due to distribution of f across f , = t, , + t._ . The o's in the syntactic trees above i n- 2 I i i 1 2 dicate the presence of holes* where larger subtrees may be accommodated. Tree-holes are in one-to-one correspondence with the slack found during the use of the \ "] algorithm. Lemma 2.7: Let E p p+l = Tf , E = f * (E) and E = Tf. wh i 2 I ere f = f and all factors be monolithic. If p+l P+ 1 ^c mi < 2 * 'c m i then there exists a hole in E to accommodate f , and h[E ] < hCE.H For example, suppose E = (a + b)cd , E = ((a + b)cd)e and E = (a + b)cde . Since E has a hole which can accommodate e then hCEjD > hCE^ , viz : • c.f. Muraoka (28), chapter 3 31 E = ((a + b) * c * d) * e and E = (a + b) * c * d * e I 2 0- 1- 2- 3- \/ / 4- \ / 5- \/ r 6- \ 7- > \ 8- \ 9- where v» f = 2 and a w m / 3fc_ / o 3 . Now consider any expression E which consists of arithmetic combinations of variables v., v_,...,v . In the syntactic tree of I 2 n E , there is at least one path from the root at level h[_E~\ to level zero which contains no slack. Furthermore, if E consists of a set of monolithic subexpressions E,,E n ,...,E , there is a path from r I 2 n level h !ZeZ1 to some E. at level hCE • H which contains no slack. Def i ni tion 2.7 : Subexpression E. of E is said to be domi nant in E if there is no slack in the syntactic tree on a path from the root of the tree of E to the root of the tree of E. . P P Suppose E = ^tj or E = |f. , where the terms or factors, re- spectively are monolithic. Previous analysis has shown that p n n -/ w a )\e = Y 2 where e . represents eTt-J or e [f ] re- n. /w spectively. Let e CE.I1 = 2 a ; i.e., subexpression E. corres- ponds to the sum of terms, or product of factors, whose effective n./w a- ength is 2 32 Lemma 2.8: Let E = £e. or E = ~[~E. , where the E. are monolithic and defined as above. Subexpression E. is dominant in E if J n k /w a n. = max({n. 2 e (X ,+)}) such that Kce ; : e (X ,+) q Furthermore, if there exists an E, such that n. = n . - w , then k k j a ' E. is also dominant in E . Call these the major-dominant subexpres- sions of E . 3 3 For example, consider E = £E. and E' = ~|~E. , and let h[E 3 = 2 , h[E J = 3 , h[E 1 = 5 , w_ = 2 and w = 3 . Then the I Z 3 a m syntactic trees of E and E' are: E - E, + E 2 + E 3 E' - E, • E 2 . E 3 I — 2- 3— 4- 5- 6- 7- 8- 9- Note that E„ and E are major-dominant in E and only E is dominant in E' . The coset-group analysis is: elE] a We 1 3 10:0 I0:T 0:0 1 000:1 >7/2 and e [E'] = m RCE,] 10:0 = 0:1 = . c 1 1 :2 c 100:0 1000:0 0:1 = 0:1 10:2 r 0:2 _ o9/3 33 respectively. Since e a L~E_l e X. , then 110:1 indicates that E and E-, are the major dominant subexpressions; and since e TE'I e X 3 mo then 10:0 indicates E^ is the major-dominant subexpression. P | P2 P2 Lemma 2.9: If E = tt, +' ( Yt!) * f , then if ( Tt f .) * f is not P2 dominant in E , then distribution of f across Yt' does not reduce L J tree-height. Now the tools have been developed to determine when distribution of an expression such as E = ( ) t.) * f reduces tree-heiqht when N 1 w >_ w . Let J = {i|t. e {E.} or t. e {E.,E^} , i.e., the set of Pi major-dominant subexpressions of E} . If each t. = Tf . . has a hole j=l J to accommodate the factor f , then a substitution of E^ for E is performed where E d = I t. * f + ( I t.) * f . (2.3) ieJ ' i^J The second term of E is also in the form of E , and if it is dom- inant in E , then it is desirable to perform d i stri bution on this term also, by applying the distribution algorithm recursively, termin- ating when some term is found without a hole. The general form of E is E d = y t. * f + y t. * f +...+ y t; ^ f + ( I tj * f . (2.4) 1 \i U J . d The tree-height of E is 34 h[E°] = log. ej Icejf: J * eJf] ^V m- i ,j J-l w /w m a + c Ic ieJ .^ f i,j^c e m M w m /w a + c L ieJ e m [f. J +„ eSfl m | i c m m' a Ic e TtJ n a ' M u J. J J-l w /w a 7 m + c e mM W m / W o m a If E = (a + bcd)(e + f) , (of., p. 28 ) f the term bed is dom- inant in (a + bed) and also has a hole to accommodate the factor (e + f) . Thus, E d = bcd(c + f) + (a)(e + f) . The term (a)(e + f) is not dominant in E (and (a) has no holes) so no additional distri- bution wi I I further reduce tree-height. Sometimes, however, even though (£tl ) * f is dominant in E = £t. + (£t r .) * f and there is some dominant term t. in £t- which has no hole to accommodate f , distribution of f across £t. reduces tree-height anyway. For example, let E = a + be + (de + f)g The term (de + f)g is dominant in E and de is dominant In (de + f) , 35 but has no hole to accommodate g . Even so, E = a + be + deg + fg has a lower tree-height than E , viz: E=a+b*c+(d*e+f)*g , E=a + b*c + f*g + d*e*g, - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- \ \l \l I < o / * o / \/ \/ / + + o V \ \l \l \l I i y v where w = 2 and w = 3 . On the other hand, if a m E = a + be + (de + fg)h then this phenomenon does not occur when h is distributed across (de + fg) and the tree-height is 10 in either case. Furthermore, if E = ab + cd + (ef + g)h then hCEj > h^E ~\ , where E = ab + cd + efg + gh , viz: E=a*b+c*d+(e*f+g)*h,E=a*b+c*d+g*h+e*f*h - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- V \ O \ V / + o \/ \/ \/ \/ \/ I * * * Mr / / o / The reason for distributing in this case is quite different from hole distribution. The dominant term in E is distributed in order to 36 split it apart into several smaller terms so that the resultant "J- syntactic tree may become less unbalanced. Let E = 2,+ . + E' * f , P2 where E' * f is dominant in E and E' = > tj = ][e' , i.e., a sum 1 i of subexpressions E! which are defined by the bits in the coset- P2 n qroups in correspondence with ) e Et.'H = I e L~E!3 • Let J be the L c a i t c a 1 set of major dominant subexpressions in E' such that some term t' e J does not have a hole to accommodate f , and hence J is the i set of remaining subexpressions of E' not in J . Lemma 2.10: Given the above conditions for E , let P| I t. + £_E! * f + < J E!> * f . = 1 i eJ i eJ f e [E u ] a H l Ice a nt,:+ C Lc = 1 ieJ e [E!]+ e [f] m i c m u w /w m a ic^mi i eJ a^ i w /w w a 7 m + e m c m w„/w nr a P| I ce a rt ; > ( i = l lc%^J w a/ w m + e CfD c m m a = e [E] , a then h[E d ] < h[E] . 37 This lemma is obviously true. Basically, a check is made to see if the effective add-length of E is less than the effective add- length of E . If it is, then h[E d J < h[E] . P, P 2 Since equation (2.4) is of the form E = j t. + C J t!) * f , i=l ' i=l ' the final step in the distribution algorithm is to check for a reduc- tion in effective length by decomposition in the following way: If ( ) tj) * f is dominant in E , then if e [E ] < e EEL] , where 1=1 ' a a H P| E = I f - + L E ! * f + ( X E!) * f (2.5) 1 = 1 ' ieJ ' ieJ ' then equation (2.5) is the desired form of E . Otherwise equation (2.4) is the desired form of E . By application of lemmas 2.7, 2.9, and 2.10, one can easily show that any further distribution of a monolithic factor f across sum- mation of terms, which are products of monolithic factors, does not reduce tree-height and thus hEE 2 ' s minimized with respect to mono- lithic factors. In summary, the Multiplication Distribution Algorithm is the recursive application of Lemmas 2.7 and 2.9, followed by the application of Lemma 2.10. As an example of the above distribution algorithm, consider E = (a + b + (cde) + f + q)h)i where hCEj = 16 when w = 2 and 3 a w = 3 , viz : 38 E = - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- I I- 12- 13- 14- 15- 16- (a+b+(c*d*e+f+g)*h)*i \/ \/ / " ' ' \/° / Ct, + t- + t ) * f 1 2 3 | \ \ / ? \ / \ \ / / Since t^ is dominant in t + t + t , and cde is dominant in 3 | 2 3 d t which contains a hole to accommodate h , E is found viz : E ' =(a+b+(f+g)*h+c*d*e*h)* - 'WW M f 2 - 3 - 4 - 5 - 6 - 7 - 8 -• 9 - 10- I 1- 12- V V J \/ V I 7 / Now t is dominant in t + t- + +_ + "K and t, contains a hole to 3 I 2 3 4 3 d ? accommodate f . Hence E c is found, viz: I 39 2 _ (f+g)*h*i+(a+b+c*d*e*h)*i=t +(t'+tl+t!)*f | I 2 3 | - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- I I- 12- 13- V V V s o \ \ \ V / \ Now (t f + ti + t') * f is dominant in E , and t' is dominant I 2 3 I '3 in t' + ti + ti but t4 contains no hole to accommodate f. . Never- | 2 3 3 I d 3 d 2 d 3 theless, e [E ~\ < e [_E J and E is parsed in the following way: a a - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 10- i I- E 3 = (f + g)*h*i + (a + b)*i + (c*d*e*h)*i = t, + E' *f , + (E') * f. a | I I 2 I v v y i v i/ \/ * * \/ / / \ In this case (E*) * f is still dominant in E , but no further de- composition is possible, so h[E H is distri buti ona I ly minimal. 40 We now consider expressions of the form E = (~~t.)/f and de- termine when distribution of f over £t. , using division reduces tree-height. The methodology is essentially the same as distribution using multiplication. Lemma 2.11: Let E = \T f , , E = (E)/f and E = f *f *. . ,*f /f , | = | I I I 2 12 p all factors be monolithic, and w > w . If d — m ; r e L~f.II +, e [f] 'Q m i dm < 2 w ,/w d m Ic e CfJ _ i m i i = l then there exists a divide-hole in E to accommodate f , and hL~E U > h[E ~] . p | P 2 P2 Lemma 2. 12 : If E = j t. + ( " t|)/f , then if ( J t!)/f is not i=l ' i=l ' (=1 ' dominant in E , then distribution of f across P 2 J" t! does not reduce tree-height. i = l ' A Division Distribution Algorithm is essentially the same as the multiplication distribution algorithm, if Lemmas 2.11 and 2.12 are applied in place of Lemmas 2.7 and 2.9 respectively. n m Suppose there is an expression such as E = ( "• t.)*( ]" t') . i=l ' i=| J Under what conditions does distribution reduce tree-height and what is the maximum possible reduction? The following theorem answers these questions . i 41 Theorem 2.5: Let E = ( Y t.)*( Y +'> and E = t.t! + t t' + ... i=| ' j=l J I I I 2 +t f + ... + t t' . Let n > m . If m > 2 then h[E d ] > h[E] I m n I — — — and if m = I then h[E H > h[EU - w — a n m Proof: Let E = A * B , where A = Y t. and B = Y t f . Without 1 = 1 ' J = l J loss of generality, assume that h[7\3 > h[j3j . By definition, e a CE°] . n m Ic L e [t. * t'J Since eSf- * t'J > e [t.] , a i j - a i then e [E ] > m a — ^c a i = me TaH . Since, by assumption, A is a the larger factor of E , then e [Aj - 1/2 e CeD . Hence a a e[E]>!e[E] which imp I ies that h[E d D >_ h[E] + w ( log m- I ) . 2 a a ,lws 2 Q.E.D. Theorem 2.5 verifies that we need only consider distribution to reduce n m tree-height of expressions such as E = (£t.) * (£t T .) when either n m J n (£t.) or (£t'J is monolithic, i.e., only when E = (£t.) * f . n Let us reconsider expressions of the form E = ~J~f. . Suppose the factors, f. are not monolithic, i.e. suppose "I "2 E = ( I t, .) * ( I t ) * * ( £ t .) . Is it possible to reduce :_l m>J tree-height of E by distribution? By applying the principles of Lemma 2.9 and Theorem 2.5, it is. 42 Lemma 2.13 : Let E = [f. , where the factors, f , are not mono- i = l ' ' lithic. The tree-height of E is reducible through distribution if and only if there is exactly one dominant factor in E . n Let f, be the only dominant factor in E = Tf . Then we \ may write E = ( £ t. ,) * |fi . Distribution was previously inves- tigated for expressions of the form (£t.) * f . In this case, f is replaced by a product of factors. The following lemma is a generali- zation of Lemma 2.7. P, P, p 2 Lemma 2. 14: Let E = ]Jf . , E = ( J]f ) * ]J f. and i = l i=p+l P|+P 2 E = f . , and a I I factors be mono I i th i c . If 2 1 = 1 ' P,+P 2 i=? m ' < 2 I e Lf.J u c m i then there exist holes in E to accommodate Tf. and i h[E J < h[El . 2 I Hence, distribution is performed exactly as before with E = ! + ki * ^ f i + ( I t k .) - TTf , jeJ K ' J i^k ' j^J k ' J i^k ' n k. where J is the set of major-dominant terms in > t, . 43 The Distribution Algorithm for E = ~]~f . (non-monolithic f) is the application of Lemmas 2.13 and 2.14 followed by the application of the mu I ti p I ication distribution algorithm. 3 For example, suppose E = (a + bc(d + e)(f + gh))i(j + k) = ]~ff i The only dominant factor in E is f , and bc(d + e)(f + gh) is the dominant term in f , as may be seen in the following illustra- tion: E=(a+b*c* (d+e) * (f + g*h)) * i * (j + k) = f * f 2 * f 3 . - : \ V )' \ I y \ \ / \/ 3 - o * o 4 - 5 - o \ / + 6 - \ *. o 7 - o 8- \ 9 - o. 10- II- 12- 13- 14- Since holes exist in the dominant term of f to accommodate i*(j+k) , we may distribute and form E = bc(d+e) (f+gh) i ( j+k)+(a) i ( j+k) , and the syntactic tree is: 44 E = b * c * i * (d+e) * (j+k) * (f+g*h) + a * i * (j+k) = t. + t. - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10- I I- 12- 13- v I i' v iu v y Similarly, let us reconsider expressions of the form E = £ t. , 1 = 1 where the terms, t. , are not monolithic, i.e., suppose z m E = T f + TT f . + ...+ TT f • Let J be the set of major j=| l,J j=| 2, j j =| m,j dominant subexpressions in E . Then E may be written as i^J ieJ j=l ' 'J For each i e J , t. i s of the form t = ]T"f . . If the tree-heights of all such t. are reducible by distribution using the above algorithm, then so is the tree-height of E . P l P 2 Finally, we consider expressions of the form E = ( \\f . )/( |p f '.) . All of the tools have been developed to evaluate E such that the tree-height of E is reduced. The basis of the following algorithm is to balance the numerator expression with the denominator expression. 45 Extended Division Algorithm n m Let Q = N/D where N = Jf. and D = Jf ! , and let i=l ' i=l ' e [f J < e [fj < ... < e Cf 1 and e m [f] <... < e Cf'H during all mlm^ m n m l mm iterations of the algorithm. We assume that w > w . Furthermore, d — m we modify the division algorithm* in the following way: whenever the division algorithm selects f. to form the quotient f./f such that J J E = Tf./ f becomes E' = (f./f) * Tf. and e [f 3 < e [f ] and J tir I m m J 1 = 1 »*J (f./f) is the only dominant factor in E' , then distribute f J n . across f = ) t. if tree-height is reduced. (c.f. Definition 2.6) J k=l J> k I) If m = I , then use the modified division algorithm to evaluate n q = TTf./f • 2) If I eCfp i=l m ' I e Cfi: i=2 , then combine f f * f , I 2 ' using distribution if f' is dominant in D, to a single factor, thus producing f ",f ~, . . . ,f ", , in accordance with the £ definition. Hence m' < m and let e Cf "H < e m L! f 2^ < •••< e m CfJtH. Set m = m' , and go to step I . 3) If e CN/f'J < e [D3 , then replace N by N/f! , using the modi- m | m i fied division algorithm, D by D/f! , and set m = m- I , and relabel all f as f! . Go to step I. i i-l K 4) Otherwise, combine D to a single factor f T , using distribution, * C.f. Definition 2.6, 46 n where possible, for expressions of the form ~ff j , set m = I i = l ' and go to step I . Intuitively, the above algorithm iteratively selects factors f in the denominator which will lower the tree-height of the denominator when removed from D and raise the tree-height of the numerator when included in N as N/f . This process terminates when the numerator and denominator are balanced, within certain bounds, whereupon the modified division algorithm is used to f i nd Q . Note that passing the condition e CN/f H < e CdJ guarantees that tree-height is reduced m i m by at least one and at most w units of time. m For example, consider the expression E = a(b+cd) /(e(f+g) (h+i jk) ) The syntactic tree of E with w = 2 , w = 3 , and w j = 5 is: am d E = (a * (b+c*d))/(e * (f + g) * (h+i*j*k)) = f ] * f /(f j*f »*f • ) Since e CD/fll = e CDl] , e * (f + q) is combined into one factor, m I m 47 and E becomes E = f * f /(fj * f) . During the next iteration 12 12 ejD/f'J < e CD] , but elf* f ? /fH = 2 I3/3 > elM = 2 II/3 . m | m m I ^ I — m Hence the tree is balanced as much as possible and D is combined to a single factor, with no distribution, and thus no tree-height reduc- tion is possible. On the other hand, consider E = a(b + cd)/(e(f + gh)(i + jk)) . Then the syntactic tree is: E = (a * (b+c*d))/(e * ((f+g*h) * (i+j*k))) = f * f /(f!*f') . - 1 - 2 - 3 - 4 -- 5 - 6 -- 7 - 8 - 9 - 10- I I — 12- 13- 14- 15- 16- In this case e [D/f J < e [D] , and e Lf.*fjf\l < e [D] , so E mi m m I 2 m becomes (a(b+cd)/e)/( (f+gh) ( i+jk) ) and the syntactic tree is: 48 E = ((a/e) * (b+c*d))/((f+g*h)(i+j*k)) = f * f^/f J I 2 | - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 — 10- I I- 12- 13- V rV?V o * o ? o * \/ \/ Clearly hCEU = 13 is minimal in this case. 2.6. Cone I us ion Any arithmetic expression except a continued fraction may be trans- formed to one of the forms presented in this chapter. The algorithms given either reduce or minimize the syntactic tree-height. Expres- sions where functions appear in place of variable names are easily handled by giving a weight, which may be fixed or a function of the number or type of parameters, to the function. A PL/I program, which implements many of the algorithms presented in this chapter, appears in the Appendix. The program accepts an expression as input in the array EXP and produces MHT, the reduced tree-height of the expression, as output. The format of an expression such as E = a + (b-(c+d)e)f in EXP is: 49 19 1 2 3 4 5 6 • • • • 159 + a + -1 * f > + b - -2 * e * + c + d * The negative entries point to the row containing the appropriate fac- tor. Row zero always contains the expression E , while other rows contain subexpressions. Even though the program uses recursive procedures extensively, the computer time on an IBM 360/75 required to analyze most statements found in FORTRAN programs is on the order of one-half second. 50 3. TREE-HEIGHT REDUCTION FOR A SEQUENCE OF MATRIX PRODUCTS 3. I . I ntroduct ion In this chapter the time required to form the product of a se- quence of conformable matrices is investigated. Unlike extended scalar products where the commutative and associative laws of arith- metic may be applied to reduce syntactic tree-height (c.f. Chapter 2), only the associative law may be applied to a sequence of matrix pro- ducts. Sometimes, however, certain types of matrix products reduce to a scalar in which case the commutative law may be used as well (i.e., for a scalar z and a matrix A , zA = Az) . Muraoka and Kuck (29) have found a method to recognize when a sequence of matrix products contains subexpressions which reduce to a scalar. Furthermore, where the multiply operation of scalars requires a fixed amount of time, w , the multiply operation of matrices is a function of the matrix dimensions. If a system of parallel processors is used, then the matrix product A A , where A is dimension a x a. and A is dimension a x a , may be performed in t = w + w m a loq a y 2 I time, or units, where w is the multiply m weight and., w is the add weight. Under ideal conditions, the a^a.a^ multiplies may be performed simultaneously in time w , and I 2 K ' K m the a n a 9 elements of the product matrix may be found simultaneously in time w og 2 a 51 An algorithm is given for association of a sequence of matrix products such that the syntactic tree-height is minimized, and hence the execution time on a system of parallel processors is also minimized. Furthermore, if two choices of association give the same tree-height, but one results in fewer computer operations than the other, then it is preferred. 3.2. Scalar Matrix Product Subexpressions and Canonical Form This section summarizes the pertinent work of Muraoka and Kuck m (29). A matrix product expression E = A. is considered where i = l ' the matrix sequence is conformable and each A. is either an n x n matrix, an I x n matrix (i.e., a row vector) which is replaced by R. , an n x I matrix (i.e., a column vector) which is replaced by C. , or an I x I matrix (i.e., a scalar) which is replaced by z. . The syntax for E is defined by the set of productions shown in Table 3.1. Included in Table 3.1 are the transformation product weights as well as the number of multiply and add operations using a system of parallel processors. Matrix product expressions are regular, i.e., the grammar is reg- ular. The only well-formed instances of these expressions are of the forms 52 Number of Number of Production Para 1 lei Weight Mu 1 tipl ies Adds E «■ A | R|C | S A <- CR| w m + w | a 1 a 2 ' 2 n aa| w m + w a i°g 2 n 1 n 3 3 2 n - n zA|Az w m n 2 R + RA| w m + w ' a log 2 n | n2 2 n - n SR|Rz w m n C +■ AC| w m + w a log 2 n | n 2 2 n - n zC|CS w m n S «- RC| w m + w a ' log n | n n - 1 zS|Sz| w 1 Table 3.1: Productions and Weights for a Matrix Expression 53 (3.1) E = A*C(RA*C)*RA , E = A*C(RA*C)* , E = (RA*C)*RA* , E = AA* , or E = (RA*C)* , where * is the Kleene star, which means that zero or more instances of elements of the previous type occur. For example A* = (X U A U AA U . . . ) , where X is the empty symbol. An expression E is first scanned for instances of sub-expres- sions which are of type RA*C . It is obvious from the fact that S «- RA*C (use the production R «- RA and S + RC) , and that the scalar RA*C may commute with any element in the expression, that one may write E' = [(RA*C)*HA*CRA* , E' = :(RA*C)*I|A*C , E' = (RA*C)*RA* , or E' = [(RA*C)*:]A* . E' is said to be the canonical form of E . The square brackets in equation (3.2) indicate that the contents between the brackets must be evaluated separate from the rest, otherwise E' is not well-formed. (This is obvious since neither the production CA nor CC occur in Table 3. I.) For example, suppose E = A A C R A C R C R Q A . Since both R.ArC, and R..CL reduce to a scalar, then 4 5 6 7 8 (3.2) 54 E' = [(R 4 A C 6 )(R 7 Cg)]A A^A . The algorithm to find an instance of RA*C in an expression E is obvious. 3.3. Conditions for Tree-height Minimization Muraoka and Kuck give an algorithm to find the rules of association of matrix products in canonical form where the matrices are either n x n (square) matrices or vectors of length n (rows or columns) such that a balanced syntactic tree results. This section gives the rules of m association for matrix product expressions such as E = [Aj , where i = l each matrix A. is of dimension a. . x a. , such that syntactic i i -I i tree-height is minimized. We first obtain the canonical form of E and proceed by treating each instance of matrix sub-products of the form RA*C separately from the remaining product sequence. Since A*CRA* , A*C , and RA* are generically all of type A* , where A is not necessarily square, for the purpose of discussion, we consider the canonical form of all expressions to be of the form E' = [(RA*C)*HA* . We then show how to build balanced syntactic trees, by presenting several lemmas, such that tree-height is minimized. These lemmas are based on the technique used in Chapter 2 where it was shown that the discrete combination method minimized tree-height (for monolithic subexpressions) by minimi- zing slack, (c.f. Theorem 2.1.) Also, if either of two parsing options produce the same tree-height, then it is shown how to select the one such that fewer total computer operations (adds and multiplies) are requi red. 55 Def i ni tion 3.1 : Let n "| = 2 , where p is the smallest integer such that n <_ 2^ . (c.f. Definition 2.4 of "| . In this case, there is only one coset-group and hence, the subscript I.) Def i ni tion 3.2 : Let w be the weight of transformation T , where T is either some matrix A. from the matrix product expression, E = A. , or T is a product, or composed, of two matrices (T.T.,,) . The value of w is given by the following obvious I emma, Lemma 3.1: If T = T.T... , where T. is a t. . x t. matrix and i i+l i i-l i T. , . isa t. x t. matrix, then w T = w_ + w_ log f t. 1 , . If i + l i i + | T m a a 2' i'l T = A , then w = . i T Def i ni tion 3.3 : Let hDG be the minimum syntactic tree-height of the transformation T . The value of hD~U is given by the following obvious lemma. Lemma 3.2: If T = T.T , then h[T] = max(h[j J,hCT- .J) + w T . If i i+l i i + i T = A. , then h[Tj = . Now that the notation has been presented, we return to the prob- lem of evaluating the product of a matrix sequence, which is in canon- m ical form, i.e., we wish to calculate the matrix E = ~|~A. . This i = l ' must be performed by iteratively calculating transformations, i.e., reducing two matrices to one, until only one matrix remains. 56 Furthermore, a transformation T. may only be paired with its neighbor to the left or right, i.e., T. . or T. , respectively. Lemma 3.2 k gives the value of hCEj when E = T.T^ or E = T. . When E = T. 12 I ill i = l and k >_ 3 , the association rules, or pairing strategy, to produce hL~E3 are not immediately obvious. However, Lemma 3.2 shows that the formation of matrix products is still a discrete combination process, comparable to the process de- scribed in Chapter 2. It will be shown that, similar to Chapter 2, a combination strategy which minimizes slack for matrix product expres- sions also minimizes syntactic tree-height. Let us apply the definitions of Section 2.2 to the matrix product problem. Some simple algebraic manipulation may be used to prove the f o I lowi ng lemma. Lemma 3.3 : If E = T T and h\J \1 < h[T 1 , then w w eCE:= f t 1 a m Reca I I that s. is the s I ack between the subtrees for T and • > z - T . The significance of the expression (eJOT ,3 + e LTrJ + s. ) = 2 a K nv- | m^ 2 1 ,2 2e D" 9 H may be seen immediately from the following illustration: 57 e [Tl m s m- m | 1 ,2 m 2 Note that in Chapter 2, for a scalar expression E = f *f of monolithic factors where hCf.U <_ hCf oil , it was shown that w e[EU = (e Ef I Zl + e CfoJ + s ) , which differs only by a coefficient from the corresponding expression for a matrix product. The following lemma is obvious. Lemma 5.4: t | > I . Let us examine this coefficient more carefully. It is a function of the common dimension of every pair of matrices in the expression k E = 1~T. ; i.e., for every pair of matrices T.T. . (i = l,2,...,k-l) t. is the common dimension. No matter what transformation pairing strategy is used to evaluate E , every transformation weight r i w a + I og„ | t. , must appear somewhere in the syntactic w T = w Vi + I m '2' i '| 58 r_ w t a i n Lemma 3.3 is rea I I v I | unimportant if we wish to show that eCEj is minimized. This fact is forcefully demonstrated in the following analysis, where it is shown that eCEU is minimized only when the correct pairing strategy is used. Recall that e \J ,1 < e m [T 1 in E = T.T„ . Since T n has the m I — m 2 12 2 higher tree-height, and since T is a product of two matrices, which we ca I I Toi and T~„, let us spj_i_t T into its components, i.e., T„ = T .T 00 , and let us assume that e D" ol H < e D" 00 H . Then 2 21 22' m 21 — m 22 E = T.T-.T-- and we may write m "iw a /w £e m CT,D+ 21 w /w a m (e m [T 2| >e m [T 22 >s |2 ^ 2 ) + s^ 2 } , (3.3) if and only if e \JT ,1 > e D" H . (In other words, the proper pairing rrr I - * — "m^'22 strategy is, in fact, T (T T ) rather than (T.T 2 ,)T 22 .) in the proof by contradiction, we suppose e L~T.H < e L~T„,-J and then show ' rr m I m 22 e m L~EU given by equation (3.3) is not minimal. Let r iw /w a m and c„ . = t„, , . Then we claim that for I t w /w a' m E ' = (T , T 2, )T 22 e m :E ' ] = ^l { ^22p^l (e m^i : HP , 2i :j#9 l.2l ) *22J2| } (3.4) where T._, = T.T , , has a smaller value than e „TeD qiven by 121 I 21 ™ a 59 equation (3.3). Let us replace the slack variables by their defini- tions, expand and rearrange terms on equations (3.3) and (3.4) and write, and e CEj = c.e CT H + 2c,c (e CT 9 J) , m \ m z i z I m zz (3.5) e m [Ef: " C 2. e m CT |2l >2c l C 2. e ETiH , or ml' (3.6) respectively. Now we may compare these equations term-by-term. Since e [T 1 <_ e CT 9? H by assumption, and eLX.D < e [T ] by supposition, m 1 - 21 m 1 - 22- and since e„TT.-J = 2c„.e D~ 00 H by definition, then m 1 - 2 21 m 22 7 e [T 9 J > c 01 e ET.^,1] . Finally since c, > I , the claim is obviously m Z — ' 2 1 m 121 I — true. One could similarly show that e CE'Il < e [EH implies that ' m m r e DM1 < e DMH . Thus the followinq theorem has been proven. m I -1 m 22 3 K Theorem 3.1: Let E = T,T T and assume that h[T,], hp~ [], and 12 3 I 2 h[T_] are in fact the minimum transformation tree-heights and let h[T,] >_ h[T 3 and h[T J < h[T ] . Then hDjT^D is achievable if and only if the pairing strategy which minimizes slack is used. When E = ~|~T and k > 3 , by recursive application of Theorem 1 = 1 ' 3.1 in a top-down fashion, we find that slack is minimized only when the pair T.T. is selected for a transformation association where the i i + l 60 composite transformations have tree-heights smaller than any other pair. That is, if h. = h[Tj + h[T. ±l ] , then J J j+l h. = min{h.|j = I ,2, . . . ,k- I } . Before attacking the remaining analysis for cases where k > 3 , let us first complete the case for k = 3 . There are two cases to investigate. If h[J 1 is larger than h[T ] and h[J_] , then we may wish to investigate whether or not to split T„ into two parts, T„. and T , so that T.T and T22T, may be formed. Secondly, if all tree-heights are the same, then even though hp", (T~TO[] = hL~(T,T„)T,,[] , one may be preferred over the other because fewer total computer operations (adds and multiplies) may result. The proof of the following lemma is similar to the proof of Theorem 3.1. Lemma 3.5 : Let E = TTT . If w T T +h[T_]+w T T > w +max(h[T.H+w ,h[T ]+w ) , then split T into 12 l 2 3 '2 ' V2 2*3 l its components T~ . and T~~ and hCT.T^T^U = hC(T.T 7 , ) (T«^T,)I] . Lemma 3.6 : Let E = T "T T and hCTjH - h[J 2 ] = h[T 3 H . If t (w t„+w t.("h.-t,)) < t,(w t.+w t n (t.-t )) , then E = (T,T )T, gives 2a0ml03 3almOI2 ' I 2 3 a fewer computer operations than E' = T.(T T ) . The proof of Lemma 3.6 is easily shown by the fact that E requires t t 2 (+ l ++ 3 ) multi P lies and t (t 2 +t 3 )+t|+t 2 -2 adds and that E' re- quires t.t,(t n +t 7 ) multiplies and t-,(t~+t . )+t„+t .-2 adds. 61 k Finally, two more questions when k > 3 , E = "|~T. , are resolved. Firstly, if a string of transformations have the same tree-height, and each pair is a candidate for association, how should the correct pair be selected? Secondly, since the transformations T, and T can 1 k only be associated with J ? and T , respectively, should these pairings be given special consideration? The first question is answered by the following lemma. Lemma 3.7: Let h . = h[T ] + h[T ] , for j = I ,2, . . . ,k- I , and J J j+l J = {j|h. = mi n(h ,h~, . . . ,h .)} . If |j| = 2 , then apply Lemma 3.6 J to select either the smaller or the larqer p e J and pair T T P p+l If |j| > 2 then select the smallest p, e J and the largest p e J and make both associations T T , and T T , . P| P|+l P 2 p 2 +l Clearly, this strategy ensures that the largest number of pairings is achievable at that level, that the syntactic tree is balanced on both ends, and that slack will be minimized. Note that the applica- 9 tion of Lemma 3.7 on E = "]~A. , for example, gives us E = (A.A„)A A A A A^(A n A n ) = T ,T T^T .T^T^T^ . Now we answer the second 12 34567 89 234567 question k Lemma 3.8 : Let k + 3 and E = "Jt. . If h[T ~\ <_ h[T ] , then assoc- iate T,T . Similarly, associate T, ,T, if hD", ~\ > h[X U . 12 k- 1 k k- 1 — k Clearly, since T. can only be paired with T , the height of T, is fixed; whereas the height of T may become larger if T is 62 paired with T, . In other words, unless T.T ? are paired, s. can only grow in size. By Theorem 3.1, associating T T leads to minimum tree-height of E . m Thus, all the conditions necessary to parse E = A- such that syntactic tree-height is minimized have been established. 3.4. The Sequence of Matrix Products Parsing Algorithm In this section, an algorithm to parse a sequence of matrix pro- m ducts in a bottom-up fashion, such as given by the expression E = ~]"a. such that syntactic tree-height is minimized, is presented. The algo- rithm summarizes the results derived in Section 3.3. k m 1) Let E = T. represent the expression E = A. (i.e., i=l ' 1=1 T. = A. and k = m) , and let w T be the weight of transfor- ii T. 3 i mation T. . i 2) If k = I , then STOP. 3) If k = 3 , then 3.1) If h[T ] = h[T 2 J = h[T] , then associate either (T T ) or (T T ) depending upon which leads to the smaller amount of processor time (c.f. Lemma 3.6), set k = k- I and let k E = T. , with appropriate relabeling of the T. . Go to i = l ' ' Step 4. 3.2) If w + h[T 1 + w T T > w T + max (h[T J + w , 1 I ] 2 2 2 3 2 ' I l 2 hp",H + w ) , then split T into T_. and T , 63 associate (T T ) and (T T ) such that E = I 21 22 3 k (T T )(T T ); set k = k - I , and let E = TTt. with I 21 22 3 ' ' I appropriate relabeling of the T. . 4) If h[T.H <_ hD" 2 H , then associate M = (T.T„) , replace the two transformations T and T in the matrix sequence, with M , k set k = k- I and relabel the T. such that E = TY . Go to i ' ' i step 2. 5) If hCT._,H >_ h[T,H , then associate M = (T T ) , replace the two transformations T, and T, in the matrix sequence, k- 1 k *i » with M , and set k = k- 1 . Go to step 2. 6) Calculate h. = h[T.I! + h[T\ J for i = l,2,...,k-l , the set i i i + 1 J = {ilh. = mi n(h . ,h , . . . ,h. .)} , and determine the smallest 1 j 12 k- 1 p, e J and the largest p e J . 6.1) If |j| = I , then associate M = T T , eliminate T , 11 P p+l p+l replace t . with M in the matrix product sequence, set k k = k-l and let E = 7T T - Go + ° s+e P 3 - i = l ' 6.2) If |j| = 2 , and p ? - p,+l , then apply Lemma 3.6 to sel- ect the appropriate transformation (i.e., either T T or T T , . ) for association as in step 6.1. P, P,+l P 2 P 2 +l 6.3) Otherwise, associate M, = (T T ,.) and M„ = (T T^ ,.) I P, P|+l 2 p 2 P 2 +l replace the two transformations T and T , , in the P. Pi +I 64 matrix sequence with M, and similarly T and T I P 2 P 2 +l k with M„ , set k = k-2 , and let E = |T. with approp- i = l rlate relabeling. Go to step 4. Let the vector a = (a_, a ...... a ) represent the ordered list of 1 m v the common dimensions of A. (i.e., a. x a.) . Fiqure 3.1 demon- i i - 1 i strates an application of the parsing algorithm on a sequence of matrix products which is in the form A*CRA* . (A. is a column vector and A is a row vector.) The symbol =*= means that many multiplies occur simultaneously, at the indicated level and the symbol yf means that many additions occur simultaneously at the indicated level and each is a culmination of additions performed by a binary tree. For example, see (A.A ? ) in Figure 3.1; at level 3, since a~ = 9, a = 6, and a~ = 4, 216 multiplies are completed in w = 3 units of time, and at level 9, the 36 elements of the transformation (A A ) are completed in w loq f a. =6 additional units of time, since w =2 and a 2 1 I ' I a a, =6. Clearly, the algorithm is also applicable for extended matrix pro- duct expressions which are of the form RA*C . After each component of the canonical form E' = E(RA*C)*!]A* is determined by the parsing algorithm, the scalar products at the end are found in accordance with the rules established in Chapter 2. 65 E = ((A A ) ((A A ) A )) ((A A ) (A A )) 12 34 5 67 89 Figure 3.1: E = "[J A. , a = (9,6,4,3,1,8,15,3,6,9), w = 3, w =2. m a 66 4. SCHEDULING FOR A WEIGHTED-NODE DIRECTED GRAPH 4. I . I ntroduct ion The parsing algorithms developed in Chapter 2 produce weighted- node trees. If common subexpressions are combined, then a weighted- node graph results. Hence it is imperative that a scheduling algorithm be developed and a least upper bound (LUB) on the number of machines needed to complete the job be determined. In this chapter, algorithms are presented which provide solutions to these problems. Some of the work described here is based on a paper by Hu (12), where an optimal scheduling algorithm for unit-weighted-node trees is given. The rest is based on the critical path method used by PERT, as described by Kauffman (18). The work in this chapter was developed inde- pendently of, and simultaneously with, that done by Ramamoorthy, et al. (31). The significant difference is that a greater lower bound on the number of machines required is determined here and an optimal scheduling algorithm for any number, k , of machines is presented, which requires 2 0(n ) computer operations* to complete. Schwartz (34) also developed a scheduling algorithm for a weighted-node graph. It is similar, in concept, to Hu's algorithm in that the nodes on the longest queue are processed first. Schwartz's * We define computer operations" to be the operations add, subtract, compare, etc. which are in the instruction repertoire of most computers . 67 algorithm, however, is only sub-optimal and is presented in this chapter 2 as algorithm 3. While it is also an 0(n ) computer operations algo- rithm, it is about twice as fast as the optimal algorithm. We are concerned with scheduling atoms of work, or tasks, in a non-preemptive manner on a set of machines where any one machine is capable of performing any task. Later the problem of scheduling special purpose machines, of several types, is discussed and a scheduling solution indicated. It is also assumed that the amount of work (or time) for each node is known a priori and that information transfer among the machines requires zero time. Finally, it is assumed that the directed graph describing the job is reduced; i.e., that all cycles, or circuits, in the original job description have been reduced to a s i ng le node. Nodes in the reduced graph which represent circuits in the original graph are analyzed separately. For example, suppose that some FORTRAN program, which contains DO- loops or IF- loops, is the job to be analyzed. These loops are represented as circuits in the graph. Analysis is performed by expanding the loops until they are loop-free, in which case they may be represented by a reduced sub-graph. In other words, a circuit-free graph may be produced either by reduction or expansion. 4.2. Lower Bound on the Number of Machines Required We wish to determine the minimum number of machines that are re- quired to process some job in minimum time. The job consists of a 68 certain set of tasks, whose order dependency may be described by a directed graph, G . Each task is represented by a node of G , the dependency is represented by directed arcs between nodes, and the time to complete a task is the node-weight. It is assumed that these weights are integers. All circuits in G are reduced to a single node or expanded to obtain an acyclic graph. In all cases during the following discussion, let there be n nodes in G . Def i ni tion 4.1 ; A starting node is a node with no predecessors and a terminal node is a node with no successors. G may contain several nodes of each type, and G may be separ- able into two or more independent subgraphs. Also, if n is a pre- j decessor of n. then write n. )> n. , which is equivalent to sayinq i j i that n. is a successor of n .or that n <^ n. . 1 J i J Definition 4.2: Let G be the relaxed graph of G . G^ is defined R — L - R in the following way. All terminal nodes in G are placed in the low- est level, q , of G . All nodes in G which have successors only R in the set of terminal nodes are placed in level q- I . Repeat this process in decreasing level order j , so that all nodes in G which have successors only in the set of nodes in levels greater than j are placed in level j of G . When the unassigned nodes in G are all starting nodes, place them in the next level, thus establishing the value of q so that the levels of G are labeled l,2,...,q . R 69 Include all connecting arcs in G in accordance with those arcs in R G . Each node in G is said to be tightly connected to some terminal R node. G is, of course, isomorphic to G . R For each node, n. , i = l,2,...,n, let W. = w. + max{W . | n . < n.}, where w. is the weight of n. . Wj is the largest node-weight-sum of all paths between n. and the set of terminal nodes. Furthermore, Let D = max {W. In. e G} be the critical path value, or critical time q II of G , i.e., the least amount of time in which the job described by R G may be completed, and D is achievable when an arbitrarily large number of machines is available. Suppose we can cut G R in two pieces at some level j such that H. consists of levels I through j of G , and H' is the graph of J R j the remaining levels, j + I through q . Then D. is the critical time of H. . Let J = {kin e H.} and P. = 7 w. , i.e., the sum of J k J J ieJ ' node weights in H. . The following theorem produces a lower bound on the number of machines required to process the job in time D Theorem 4.1: If m- I < max{P./D I i < q} then at least m machines I i ' — ^ are required to process all nodes in D time. Proof: Let j be a value of i such that P./D. = max{P./D.|i < q} . J J J i i ' - H Since G D is relaxed, so is H. . At time D. the total mass removed, K J J 70 (m-l)D. , is less than P. by assumption. Since H. is relaxed, \J J J there is at least one node, n. e H. , which is not completely pro- K J cessed; let w/ > I be the unprocessed portion of n . Let D' k - k q _j be the critical time of H' . Hence the total time to process G_ J R is T > D. + w' + D' ~ J k q-j Si nee D. + D' . > D , then j q-j - q T > D + w,' > D . Therefore it is impossible to process all nodes in - q k q D time with m- I machines. q Q.E.D. Theorem 4.1 gives a lower bound on the number of machines required which is not smaller than the lower bound found by either of the methods of Hu or Ramamoorthy. However, we might ask, "Does the number of com- putations required to calculate m preclude its use?" For this reason, and because the W. are required for later parts of this chapter, algorithms are presented which find W. , D. , and P. . J J One could use the Bel Iman-Ka laba (18) algorithm to find the set 2 W = {W,,...,W } but this requires 0(kn ) operations where k is the In number of iterations required for convergence. If, on the other hand, one starts with a relaxed graph which is represented in list form (i.e., associated with each node is a list of its successor nodes), then the equation W. = w + max{W.|n. n. , then the set of D.'s require at most q0(l/2 n ) word- J J operations if B is upper triangular. Representation of B as a bit array on computers with a powerful set of binary instructions could further reduce the order of operations required. 4.3. Scheduling Algorithm for k Machines Scheduling of machine utilization, or assignment of nodes, which represent work to be done, to machines is essentially a mapping problem, That is to say, one could assign a different machine, taken from an arbitrarily large supply, to each node of the graph and thus complete the job in the critical time , C . However, it is unrealistic to consider an arbitrarily large machine resource, especially since each 72 machine assigned is idle most of the time. So, one could begin sharing the machines among certain nodes; the machine space is thus reduced, but now one must consider the dimension of time as well. One obvious mapping strategy would be to assign one machine to a I I of the nodes on a critical path, thereby still ensuring job completion in the critical time, C . Let a be a mapping procedure, a:M X T -*■ M XT (i.e., a mapping from the time-space domain of arbitrarily many machines onto the time- space range restricted to k machines) of the tasks on an oriented, reduced graph G such that the job described by G is processed in the least possible time ^(k) . In other words, a is an optimal scheduling algorithm for k machines. Let 3 be some other scheduling algorithm for k machines such that the job described by G is com- pleted in time 0,(10 . Lemma 4.1 : (k) >_fi (k) >_ C Q . Definition 4.3: L et a time-slot in M k X T of a node n. be the period of time w. that a particular machine consumes while processing node n. . Let y be any mapping procedure, y:M X T -»• M XT, and Wj to be the time, as measured in M XT, between the beginning of the time-slot of n. and the end of the maximum time-slot of the set of terminal nodes. The prescript y indicates that it is a function of the mapping procedure y . Since the largest node-sum-weight, W. , represents a lower bound 73 on the time between n. and job completion time, the following lemma fol lows di rect ly . Lemma 4.2: W. > W. for i = l,2,...,n . Y i - i The job completion times are defined as follows: G (k) = max( W |fy W 2 ,..., Y W n ) , and (4.1) «,(k) = max( W. , W ,..., W ) . (4.2) f a I a 2 an Furthermore, W. > w ; + max ( W.ln. e G ,n > n.} . (4.3) Y j - J Y l i j i The relationship (4.3) suggests that a mapping procedure, y > be accomplished in stages. As each node is assigned to some time-slot in M XT, the bounds on the time-slots which are available for its pre- decessor or successor nodes are determined. Let G. be the graph of one of these stages. Structurally, G. is isomorphic to G and consists of a subgraph G which is the set of nodes already assigned k A,! to M XT, and the remaining graph G! . See Figure 4.1. For each G. , one could determine a critical path through G. and a critical time, C. . i Since the critical path of a graph is the set of nodes which most urgently require processing when their time has come, and since G . A, i represents the space where work may be performed by the k machines, the following definition is presented. 74 Figure 4.1: G. , where G A . = M XT and G! = G. - G a i A, i i| A,i 75 Def i ni tion 4.4 : The critical path of G. passes from G| into the top and out the bottom of G . A, l This may be achieved in any way possible; for example, if n. is the last node on the critical path not in G . . , then it is neces- A, i sary to find a time-slot in G„ . which is not smaller than w. , A, i J but such that for all n, < n . W, is not greater than the time at k j' y k the end of the time-slot. Hence, the critical time C. > C. . . i — i- 1 Let d = C. - C. , . i i i-l Lemma 4.3: Let G , determined at the final iteration, f , of A,f procedure y , be isomorphic to G . Then, f (k) = C = C_ + f (k) = C = C n + V d. . f .*•. i f Lemma 4.4: (k) = ft,(k) if and only if yd. is minimized. f f ? = | ' Lemma 4.4 defines the conditions which would make a scheduling algorithm optimal; lemmas 4.2 and 4.3 and equation 4.3 indicate that an algorithm which starts with the terminal nodes would facilitate computation. The algorithm a , which is given below, performs a look-ahead at each iteration of one level in G. to see if the critical path i splits and thus reserves v machines for this purpose. If all node weights w. are of the same order of magnitude (i.e., max {w. | i <_n} < 2 * min {w.|i <_n}) one level look-ahead is suf f icient, 76 but not conversely. If more than one level look-ahead is required, then appropriate changes to a would be required at steps 5, 6 and 7. Algori thm a : Assume that an n x n incidence matrix B , where b. . = I means that node n. < n., of a reduced directed qraph G ij j i a having n nodes; a vector of weights w = (w ,w , . . . ,w ); and k are given. Then create k lists (L's) and associate with each L i a scalar, t. (i = l,2,...,k) . Also create a vector W = (W ,W ,...,W ), a vector A = (6 .8~, . . . ,& ) , and a vector Y = (Y,,Y„,...,Y ) . I 2 n 12 n 1) Set t. = (i = l,2,...,k) , C = the critical time of 6,6=0 for i = l,2,...,n , and let G' = G . Calculate the vector Y , where Y. is the largest value of the sum of nodes on all paths between node n. and the set of starting nodes. Starting with level I of G and proceeding in order to level q , if n. R i is a starting node, then Y. = 0; otherwise, Y. = max{w.+Y.|n. > n. , n. e G} . i J J 1 J i J 2) Find min (t ,t ,...,t ) and identify the p <_ k L's which sat- isfy t = t = ... = t = min (t , ,t_, . . . ,t. ) . Let n. P| P 2 P p 12k j be the last node on the list L ; for each non-empty L set column j i n B to zero. 3) Identify the set of terminal nodes in G' as Q = {q,,q 9 ,...,q }; l ^ q i.e., n. e Q if b..=0 for j = I ,2, . . . ,n. If Q = , then for i = l,2,...,p , set t = min(t ,t ,...,t ) and go Pi p p+l Pp+2 Pk to step 2. 77 4) If n is the node at the top of L for i = p+l ,p+2, . . . ,k J p i (the L's not satisfying the min condition in step 2) set the jth column in B to zero, but save the column J for later restor- ation. 5) Identify the set of terminal nodes, which have been augmented by step 4, as R = {r ,r ,...,r } . Calculate the W (i = l,2,...,r), i such that W a r. w + t i f n e Q r i P| r i w + max{ n ,W.|n > n . , n.eG} if n t Q . J 1 r-j J' j r. 6) Let M = max({ W +Y |r. e R}) , S = {n r Ir.eR and V\L +Y^ = M}) ii i ii S' = sRq , M = max(M,C) ,and d=M-C. If d>0, then if any n. e S f was previously denied assignment to G. in step 8 such that < 6. < d , then select a node n e {n.ln e S' and (6-6.) is maximal} , restore the state of m J J J G' and G A at the time of the n assignment denial, and go to step 8 at the point where n m is pushed into L , etc. marked + . 7) Set C = M . If (S - S') = , then set v = ; otherwise let u = min({W.-w.|n. e (S-S')}) , U = {n.|Wj-w. = u, n. e G'} , J J J J J J J X = {i|t. <_ u, i = l,2,...,k} , and set v = max(0,p+| u|- 1 X| ) . If p <_ v or S ' = v 78 and |S'| > , for w. ■= max({W. | n. eS' }) push n. Into L , set J J P p S' = S' - {n.} , W. = t = t + w. , p = p-l , and elimin- J J Pp Pp J ate the jth_ row from B ; G' represents the new graph with n. removed and G represents the new graph with n. assigned; A J the dependency of G. from G' , i.e., G' -*■ G. , is in accordance with the i nways of n. in G . Go to step 9. 8) Let M = max({ W + Y In. E Q}) , S' = {n |q. e Q and a q. q. |M i v q. |M i a W + Y = M} , and P = p . While p > and |S ! | > , for w. = max({w.|n. e S'}) set S' = S' - {n.} and 6 = W.-u ; if 6 < then ^ush n. into L , set t = W. , p = p- 1 J Pp Pp J and eliminate the jth_ row from B ; otherwise, if 6 <_ 6 . or 6 . = then set 6 . = 6 and save the state of G' and G R for J J A possible later restoration. If p = P then set t = min(t ,t ,...,t ) for i = l,2,...,p . Pi Pp+I Pp+2 Pk 9) For all b. . £ B . if b..=0 then go to step 10; otherwise, re- I J i J store the columns set to zero in step 4 and go to step 2. 10) Let S be the set of nodes in G' . Let K= {kin, e S} and k k M = max{t. | i = l ,2, . . . ,k} . If kM - £ t. < £ w. , then go to step 2. i=l ' jeK J Assiqn the remaininq n. e S to G n , setting t. = t. + w. and j A ' i ! J S = S - {n.} as the assignment is made, such that when S = $ then C(k) = max{t . ,t , . . . ,t, } is minimized. This can be 12 k 79 achieved using Johnson's algorithm (13) where one starts with mi n{M-t. I i=l ,2, . . . ,k} > , and looks for a node n. such that i i w. = max{w.|j e K or a set of such nodes Z such that Y w. n .eZ J is maximally less than or equal to min(M-t.) . Figure 4.2 illustrates how algorithm a iteratively maps G into G . Examine G v , G., and G . When G^ is analyzed by a , it is A 3' 4' 5 3 ii* discovered that the critical path splits at the top of G , passing A, 3 through nodes c and d . Since neither is a terminal node of G' , a finds that node h terminates the longest path in G' . However, since two machines must be reserved for nodes c and d and the time available is only 10 units while w = II , i.e., 6. =.l , node h is not assigned to G at this time and the second machine is kept A, 3 idle 10 units instead. The analysis of G determines that both c and 4 d should be assiqned to G„ „ . However, the analysis of G,_ indi- a A, 4 r 5 cates that the critical path bypasses G for d,- = 12 units of time, due to the node h . Since d^ > 6, , algorithm a returns to G^ and 5 h 3 3 assigns h to G . Thus the configuration described by G . f , is A, 3 4 ach ieved. 4.4. Computational Complexity of Algorithm a Bounds on the computational complexity of algorithm a are difficult to determine unless it is assumed that the restoration to a previous state G -*■ G as determined by steps 6, 7 and 8 never occurs. Let us A assume this case. During any iteration of algorithm a either step 7 or 80 O V <° S£i • 1 1 ** o\ I ^ CO 1 "° 31 g(30) 4 o 1 ^ CM r — *U * 1 V **v CD S CN o II .O .* (O 01 •— 1/1 •— a in a> E sz .c +- -H c *u L. O It) Ol Q. CM CD .c I © O z O m \ 1 ** 4B JZ, i "5 < O 1 p4 o w4 XL i < » I *4 3 C o o CN U J 5 6 < i < o u o *-« * o 2 < o r4 < o — 82 0» 3 { mi M i mm < • 5 § f ma 2 1 U p4 m* aZ I 9 < • i f 5 T3 3 C C o u O I (0 ^ _^ ^ , ,-, ^-^ 8 m* 8 !-• 1 u e. 9 ^ JO 8 f m* «* V «•» "■" (N ■** U 3 o o 1— • I 9 ma < 6 it £i 1 ■ha ma 83 step 8 is executed, but not both. Furthermore, it is assumed that the basic iteration of a loops from step 9 to step 2, and that word oper- ations may be used to determine that a row of B is zero. Table 4.1 lists the number of operations of each step. 1) 0(n 2 /2) 2) 0(k) + 0(p) £ 0(2k) 3) 0(n) 4) (k-p)O(n) £ (k-l)O(n) 5) 0(n) + 0(r) 6) 0(r) + 0(qr) £ 0(r 2 + r) 7) 0(p)<_0(k) 8) 0(q) + 0(p) <_ 0(r) + 0(k) 9) 0(n) 10) 0(r 2 ) Table 4.1: Number of Operations of Steps in Algorithm a The computational complexity of steps 2 through 9 is not greater 2 than 0((2 + k)n + 3k + 3r + r ) . An upper bound on the overa I I number of operations in a , #(a) , may be obtained by assuming that one node is assigned during each iteration of a. We may safely assume r< 0((2+k)n 2 /2k + ( l+7k/2+3r+r 2 ) n) = 0((2+k)n /2k) . (4.7) By equations (4.5) and (4.7), the following lemma holds: Lemma 4.5 : 0(n 2 ) >_ #(o) >_0(n 2 )/k . The assumptions made at the beginning of this paragraph are, in fact, not unrealistic. As the value of k approaches LUB(k), a d. > less often and hence fewer restorations to a previous state 2 2 must be performed. Steps I and 10 of algorithm a require 0(n /2+r ) operations, and hence the bounds given by Lemma 4.5 are valid for algorithm a from step I through step 10. 4.5. Least Upper Bound on the Number of Machines Required The open question remains, "Given a graph G with critical time C , what, is the least upper bound on the number of machines required to process G in time C n ?" In other words, what is the lowest value of k such that fi,(k) = C^? The following lemma is used to estab- f I i sh the bound on k . 85 Lemma 4.6: If Y d. = . then fiJk) - C . , i f i = l Therefore, we can use algorithm a to find a least upper bound on k in the following way: 1) Use Theorem 4.1 to find a lower bound on k . That is, set k = m . 2) Use algorithm a , given k . If any d > in step 6 of a, then set k = k + I and repeat step 2. Otherwise STOP. Since the computational complexity of algorithm a is quite ex- tensive, it's i ndiscrimi nant use could be quite costly. However, since Theorem 4.1 determines a lower bound for k which is not smaller than any other known lower bound, the number of times algorithm a is required to find k* = LUB(k) is small. In fact, we have found that a k* is determined within a few iterations. For example see Figures 4.3 and 4.4 for a relaxed graph G R mapped onto G. with k = 5 . Furthermore, if algorithm a is only being used to find k* , then it is never necessary to restore G. to some previous state G.(j < i) and hence those computations in step 8 where 6 > need not be per- formed. 4.6. A Fast, Non-optimal Scheduling Algorithm While algorithm a is optimal for k machines, it is an iterative process, terminating when all nodes have been assigned to G , and A each iteration requires extensive computation. It is desirable to have 86 P|/Oi 59/14 109/2 v(2) w(4) x(3) y(6) z(7) 131/3' Figure 4.3: A Relaxed Graph, G R . P./D. <_5, \ = \,2,..,,5 87 mach i ne I 30 mach i ne 2 mach i ne 3 mach i ne 4 mach i ne 5 21 b(9) i(5) o(8) s(4) A(2) v(2) g(3) h(3) k(4) p(7) n(5) A( I) t(3) w(4) f(4) d(5) j(4) r(4) u(7) A(3) x(3) e(3) c(5) 1(7) q(2) A(7) y(6) a(7) m( 10) A(6) 30 z(7) 23 13 Figure 4.4: G (k=5) of G in Figure 4.3, by Algorithm a A R 88 a faster algorithm which need yield results that are only slightly less than optimal. It turns out that if some k ? ^_ k* is given, then the following algorithm, which is Schwartz's algorithm, serves this pur- pose. Algorithm g : Assume that an n x n connectivity matrix B of a re- duced, directed graph, a vector of weights w = (w,,w ,...,w ) and 1 2 n k' are given. Then create k' lists (L's) and associate with each list a scalar t , ( i = !,2,...,k f ) . Also create a vector i w = (w w W ) > where W. = w. + max{W . I n . < n.} . 1) Set t. = , for i = l,2,...,k' and calculate the vector W . 2) Find the mi n(t ,t„, . . . ,t ) and identify the p <_ k' which I Z K satisfy the min condition as p. ,p Q , . . . ,p . Let n. be the 1 l P J node at the end of L ; for each p , i = l,2,...,p , P i j eliminate the row in B which corresponds to n , and W J J from W . If B is empty, then STOP. 3) Let M = max({W |W e W}) , and S = {n.lW. = M and the ith col- i i l ' ' — umn of B is zero} . While p > and Isl > , for w. = J max({w In. e S}) add n. to list p , set S = S-{n.} , p = p- 1 , i i J P J and W = . If Isl = , then set each t J Pi rninCt ,t ,...,t ) for i = l,2,...,p . Go to step 2. P p+I Pp+2 Pk Figure 4.5 illustrates the assignment of the nodes from G R of Figure 4.3 to k* = LUB(k) machines, represented by G , using a B 89 14 22 26 30 31 b(9) a(7) fC4) e(3) h(3) c(5) d(5) g(3) k(4) m(IO) 1(7) j(4) i(5) p(7) r(4) o(8) q(2) n(5) z(7) A(4) u(7) t(3) 5(4) x(3) y(6) w(4) v(2) 12 24 27 29 31 mach i ne I mach i ne 2 mach i ne 3 machi ne 4 mach i ne 5 Figure 4.5: G (k=5) of G in Figure 4.3, by Algorithm 90 algorithm 8. Note that 0(5) = 31, whereas ft (5) = 30 . One could im- prove algorithm 8 slightly by looking ahead at the time each set of starting nodes is assigned to check for a split in the critical path; however, then the computational complexity would be increased beyond that of algorithm a. 2 Algorithm 8 is still 0(n ) , even though we call it a "fast" 2 algorithm. The three steps of 8 require 0(n /2) , 0(2k) , and 0(n+k) computer operations respectively. Using the same reasoning as for algorithm a, we find that 4 0(n 2 /2) > #(8) > 0(n 2 /2k) for steps 2 and 3. However, step I requires as many operations as the upper bound of #(8) , so in this case #(8) = 0(n 2 ) no matter how many nodes are assigned each iteration of 8 . However, because of the coefficients in these computations, #(a) - 2 * M&) , under the best circumstances and #(a) - ((3+k)/2).#(0) , under the worst circumstances. 4.7. Multiple Special Purpose Machines Suppose a job described by G consists of tasks (nodes) which re- quire the use of one of m special purpose machines. Let there be k ,k ? ,...,k of these machines respectively to perform the work. Is it possible to assign the tasks to the machines such that the scheduling is optimal? Yes; an algorithm similar to a would achieve this goal. It is necessary to maintain m sets of machines (L's) and at each iteration of a , the number of available machines of each type is determined as p, p,..., p respectively. At step 7, where it is I 2 m determined that the critical path splits into v paths, it is nec- essary to break this down into ,v,.v,..., v paths for each machine I 2 m type. Then, in accordance with the strategies of steps 7 and 8, node .n. , where the prescript . denotes that the node is type J , is J J added to the end of L J P .P J It is probably easier to understand the changes outlined in the paragraph above by an example rather than rigorously modifying algo- rithm a . Suppose a job described by the graph in Figure 4.6 is given, and suppose that two different kinds of machines are required: an arithmetic unit (AU) and a memory unit (MU). Let each starting node be a "FETCH" operation requiring the use of a memory unit and each terminal node be a "STORE" operation also requiring a memory unit. Let all other nodes be operations requiring the use of an arithmetic unit. Figure 4.6 is a relaxed graph of the FORTRAN program I NT I = A-C INT2 = E-A*B INT3 = F-C*D Q = ( 1NT3- INT2)/I NT I R = (A*INT3-C*INT2)/INTI INT4 = P*(Q-D)+K*L+M*N 92 A B CD Figure 4.6: G of a FORTRAN Program R 93 INT5 = 0*(Q-B)+G*H+l*J INT6 = H*L-J*N S = ( INT4*N-INT5*J)/INT6 T = (INT5*L-INT4*H)/INT6 END . We assume that the intermediate variables I NT I , . . . , I NT6 do not re- quire use of the MU . Let the fetch-store weight w = 2 , the add- subtract weight w = 2 , the multiply weight w = 3 , and the divide a m weight w = 5 . The critical path is indicated by bold connecting arcs, and the critical time is 33. Since the maximum width of the critical path of AU-type nodes In 4 and of MU-type nodes is 4 also, let us assume that there are four All's and four MU's available to pro- cess this program. Table 4.2 is a tableau of algorithm a on the graph from Figure 4.6. The order of the nodes listed under Q , the set of terminal nodes of G| , and the order of the numbers listed under Y and W , 1 a the longest path value to the set of starting nodes and terminal nodes respectively, are the same. The column marked t is the minimum height (in terms of time) of the AU's and MU's. Neither the machine delays which are inherent in algorithm a , nor the machine availability (i.e., p and p) are indicated in Table 4.2.. It is, however, easy to observe these facets by examining Figure 4.7 while following the tableau in Table 4.2. Figure 4.7 indicates the node assignments to the AU's and MU's by 94 vo CD 1_ 3 CO E .c O CD 3 ro CN (D J2 CO c CN < < < — CN •<3- CN ■^J- CN CN TT 3 * •» * 3 3 3 3 3 3 3 3 s: — — m < < < < < < S < * 3 3 3 * «. •* ft — n — ro m 3 m -f- 3 3 3 3 3 3 3 S rn ** o 3 O O < < < < < S < 3 3 -t- < +- -1- >- 3 O S s: •> O O O O O O < +■ -C O T3 3 01 3 Cl cr ^ 0. ~ > -f- •k Q) 4- o -Q ■» * * ft ft ^ CO o cc (D ro CO (0 > 1_ -H c O -C CD — O) CO ro ft N ro X3 ^ >. CO c E ft •^ CO ■* J= "O CD X N ID ro CO •V n ft •. 1- * •> * 3 0) 3 Q- cr -* Q_ •—5 ^ en M- o _Q * * « •» « ft ft •« CO o cr ro IU (D CO > [_ +- C O -c O O) CN in r* CT> O — — — ^~ CN •ft *. •% •» •« CN in r^ ON CN — — — CN CN * •i ^ ft •ft •ft CN •^t in VO 00 o o CN CN O CN CN CN ^ ^ •i •^ •ft *■ "ft •ft *. CN ^ ■ in r- CO o o O CN CN Ch CN CN CN CN •» ■t ^ ^ •* •» •l •t •• •ft •v CN r- On CN in in VD r- cn o O CN •v * * — CN CN CN CN CN r-- o> C7\ «* * •K •i « « ft ft •* ft ^ ^ * ^ CN CN in in r~- r^ CT\ o O M- CN CN CN r*» r*» CTv CA CN CN •% CN •» » CN CN ^ — •^ * ft — CN CN CN * CN tr\ ft * ■\ o •ft ft in — G\ m •* CN CN — * CN — •k CN ft ft en vO "vl- ft •* CN vD CN *CN CN CN * CN CN — CA ■k — •. CN ■ ft ft CN CN ft •> CN — CN •% CN ftCN CN CN — — vO «3" * •. ^ m ft O ■> * •ft * ■* CN CN IPl CN CN CN ft in ^CN CN CN Sf <* CN * * ft ^ * « VD ft •^ -s ft ft — r- r- LP\ in CN CN CN CN — in — CN CN ON o - O o CN CN r* r^ o\ CN CN , ^- in r-- r- as N *! (0 ro •* co CD >^ CO ro ^ CO CO •«, ft ft ft •ft ro •V X N N D_ N (0 JZ T3 •* * ro ft CO * (0 •ft •* 1- fD (0 5 (/) ro cr CO —3 (0 E N * •^ * •ft ft « ft ft ft ft ft ft en CD O Q) > L. N a. N — N — E * CO (0 ro cc OC ■i •v •. 3 3 3 o cr O E — ) — * * M- M- JD -Q o o ol (0 ro (0 ro -1- CO +- +- c O sz — ' — CO CN K\ <* LH VO r- CO ON o — CN rn •<^- - 95 "vT ZD •» S CN St •. ZD ZD ro 2 2 ZD * •* 'sT ST CN CN CN — ro zd ID * ZD ZD ZD ZD ZD < s CN < +- +- +- +- "O -O L_ CD •v •* ro * •V ■1 •k •v — E O ro i^ X M- 2 CD 2 ro LU < fD (D s ■0 Q * ro •1. "O •* c •. N ro O c _J TD ro O •* *. ro * a * E N T3 - D ro TD JD U_ CD •* •> C ro •^ * * * •. ■■ - E ro ro M X 4- 2 CD -z. ro LU < ro CN «h CN to LT\ — ro CN CN CN ro ro •^ •^ •» «\ •1 cn ro in MD ro CN CN CN CN ro ro •s. •^ •V •t * •s CN CN in in vO CO — ro CN CN CN CN CN CN ro ro •K. *i •K •* •v •>. ■t ■t CN CN st m v£> VD r- 00 ro CN CN CN CN CN CN CN CN ro ro CN •v CN CN O CN ^ *. «* #\ *t CN CN O O O O ■t ^ •» •V •, •* CN O CN O r^ to CN * ^ ^ *. ^ ■t •, •t CN o O O CN CN in O CJ> o CN ro st st in vo CO — CN CN CN CN CN CN CN CN ro ro ro (D •> ro N 2 -O Q •v ^ * *s ^ N E _J 2 L_ O •«. * ro •v *. «\ E ro —> H- X> ro CO * * •* •^ •* ^ •^ *, CD i^ X CD (D 2 O LU < VD r^ 00 o\ O CN ro st in — — CN CN CN CN CN CN 96 AU AU, AU, AU, 3 4 33 31 28 26 24 19 17 14 12 9 7 A(2) A(2) A(IO) A( 10) a(3) b(3) c(2) d(2) f(2) e(2) g(5) aa(2) m(3) z(3) 1(3) i (3) j(3) h(2) k(2) n(3) p(3) o(2) q(2) t(3) u(3) r(2) s(2) v(3) w(3) x(3) y(3) ac(2) ad(2) ab(2) ae(2) ag(5) ah(5) af(5) A(6) A(2) A(2) A(2) 23 20 15 33 31 30 MU MU 2 MU MU, 4 A(2) B(2) C(2) D(2) A( 1) A( 1) A(6) A(6) E(2) f(2) 28 A(2) A(l) N(2) 26 M(2) 24 H(2) J(2) L(2) K(2) 22 A( 1) A(2I) A(2I) G(2) 1 (2) 20 A( 1) A( 1) 1 9 0(2) P(2) \J 2 A( 15) A( 15) S(2) T(2) Q(2) R(2) Figure 4.7: Node Assignments to the AU and MU 97 algorithm a . The numbers in parenthesis indicate the length of the time-slot, and the literal indicates the node which has been assigned to that time-slot. A A-node symbolizes an idle condition, i.e., a period of time when no useful work is performed by the machine. Note that for all nodes n ; on the critical path W. = W. and hence 1 i a i four AU's and four Mil's are sufficient to process the graph in critical time 33. 98 APPENDIX TREE-HEIGHT REDUCTION PROGRAM 99 I* HTEXP */ STMT LEVEL NFST 2 "? *\ 6 7 B 9 to 11 I? 13 I* 15 16 17 10 19 20 71 22 23 24 29 31 32 33 34 35 36 37 38 2 39 3 40 3 /• HTEXP */ (SUBSCRIPTRANGEI: HTEXP:PPOC: OCL STRNG(200I FIXED BIN«151 EXTERNAL; DCL(0USTNG,INSTNG)(2C0> BIN FIXED (l5)_EXTj DCL ZERO BIN FIXED! 15) INITIAL (0001 EXTERNAL;" DCL OPAND BIN F I XED( 15 UNIT I AL (5001 EXTERNAL; OCL EQU BIN FIXED! 15) INITIAL !501) EXTERNAL; DCL SEC BIN FIXED!15» INITIAL <502) EXTERNAL; OCL LFFTP BIN F I XFD! 1 5) INIT I AL <503) EXTERNAL; OCL RIGHP BIN FIXED! 15HNITIAL (5041 EXTERNAL; DCL PLUS BIN FIXEDI15) INITIAL (505) EXTERNAL:" DCL MINUS BIN FIXFDJ 151 INITIAL (5061 FXTERNAH OCL MULT BIN F I XED! 15 ) INIT I AL 1507) EXTERNAL; DCL DIV BIN FIXEDI 151 INITIAL (508) FXTFRNAL; OCL COMMA BIN FI XFO( 15 ) INIT I AL (509) EXTERNAL; OCL EXPO BIN FIXFDI 15) INITIAL (510) EXTERNAL; DCL (SfMHTI BIN FIXE0U5I EXTERNAL; OCL ADR BIN FIXED(15)INITIAL (511) EXTERNAL; DCL (BKR(1W(30),BKCOL(30) » BIN FIXE0(15); CCL (SC,AD,SU,MU,DVI BIN FIXE0U5I; DCL WGHT(0:4) *IN FIXE0I15) INITIAL 12,2.3.5,1); DCL WO BIN FIXEDI15I : DCL W2 BIN FIXE0I15) ; DCL ARRAVIO: 19,0:1591 BIN FIXEDM5) EXTERNAL, EXP (0U9, 0:159) BIN FIXE0I15) DEFINED ARRAY; SC-SFC-EOU; AD-PLUS-EOU; SU-MINUS-EQU; MU-MULT-EQU: DV-DIV-EOU; W0>WGHT(0); W?«WGHT(2I; ARRAY-O; DCL TABLESI20) FIXED BINI15) EXT INIT((20)0); TABLES-C; BEGIN; DECLARE ( I.J.JP) BIN FIXE0I15), HAN BIN FIXFDI15) DEFINED S; DFCLARF COPVLINF COPYGRPS CRFATFGP AODGRPS 7GRP FVAL FORM MOCMP FX2 CSGRPHT DECLARE B19 BIN B20 BIN FIXFD!15) INITIAL(20I, BPT BIN FIXE0I15) INITI AL( 157) , LIM BIN FIXE0I15) INIT(AL(159); rr)PYLINE:PROCFDURE (I.IP.J.JP); DECLARE (I,IP,J,JP,K) BIN FIXE0U5) DO K«0B BY IB TO EXP(1,0B)-J; FNTRV(BIN FIXED! 15), 8IN FIXEpil5I.BIN FI XED( 15) .BIN FlxeD(l5)> ENTRV(BIN FIXE0(15),BIN ENTRVIBIN FIXFD( 151 .BIN ENTRYIBIN FIXED( 15) .BIN FNTRVIBIN FIXE0(15),BIN FNTRYIBIN F I XED< 15) • BIN ENTRY(B!N FIXEDd5I.BIN ENTRYIBIN F I XFD( 15) .LABEL ) , ENTRY(BIN FIXEDI15II RETURNS ENTRY(BIN FIXEDU5I) RETURNS F1XED(15) INITIAL(19), FIXE0(15),BIN FIXED(15),BIN FIXEOdBI) FIXEDd5I.BIN FIXE0I15I.BIN FIXE0(15I> FIXED! 15), BIN F I XFDI 15) , BIN FIXEDI15M FIXE0I15),BIN FIXE0I15I.BIN FIXFD<15)> FIXE0(15))RETURNS(BIN FIXED! 151), FIXE0I15)), (BIN FIXEDI15JI, (BIN FIXE0(15)); 100 /• HTEXP •/ STMT LEVFL NEST 41 3 1 43 3 44 3 45 3 46 2 47 3 46 3 49 3 50 3 51 3 53 3 1 54 3 1 55 3 5ft 3 1 57 3 1 5<» 3 60 7 61 3 6? 3 63 ■x 64 3 65 3 1 67 3 6S 3 6«» 3 70 2 71 3 72 3 73 3 74 3 1 76 3 77 3 78 3 7<» ? BO 3 81 3 82 3 83 3 84 3 1 86 3 87 3 88 2 8; /*T YPE-WGHTIO I 1 I 2 I 3) JPT-LOC ISTGP« ( DECLARE I I ,PT,TYPEtHT,K,GR> BIN FIXE0I15); EXP( I .PT-1RI-TVPE: gp«modiht,type» ; do k-ob hy ib to type-ib; fxpj i,k*pti«cb;END; fxpi i ,k*pt»«ht; EXP( I,GR*PT)-EX2(HT/TYPEI; END CRE»TEGP; AODGRPSt PROCEDURE II.IP.J.PTI; • /«J BY NAHE»/ DECLARE (ItlPtJtPTfKI BIN FIXE0I15); J«J*10B; 00 K«OB TO FXPI IP.PT-1B1-1B; EXPI IP,K*PT>«EXP( IP,K*PT1*EXPII ,K*J»I END; FXP( IP,K*PT)--1B? J-J*K*19; /*J-1ST LOC AFTER OLD STRING*/ FNO ADDGRPS: ZGRP: PROCEOURE I I , PT, OP, WGHT > ; DECLARE « I,PT,OP,WGHT,KI Bl N F IXEDI 1 5 I ; FXPI I,PT-10B)«OP; EXPI I ,PT-1BI»WGHT; DO K*PT TD PT*WGHT-IB; EXPII.KI-OB; END; EXPI I ,KI«-10B; ENO ZGRP; EVAL: PROCEOURE (I.JI RETURNS IBIN F1XEDI15I); DECLARE II ,J,M,FIN,CLS,K,K1,TMP<0:9) ,CNTI B IN F I XEDI 15 t I '• HTEXP ♦/ 101 STMT LFVI 90 91 93 94 9* 97 9B 99 100 101 103 104 105 106 107 109 110 111 11? n* 116 117 IIP 119 120 t NEST IB; CNT-EXPI I.EXPII, J-1BKJU IF CNT>-06 THEN RETURN(CNT); 00 K-OB BY IB TO EXP(I,J-1BJ- TMPOB THEN OOj TMP(K1»TMP(K»*CLS; CLS-OB; ENO; F ik»f in*tmpM»M{ CNT-CNT»1B; CLS-CLS*CLS; GO TO EVALl; fno eval; 121 12? 123 124 125 126 127 129 130 131 132 133 134 135 136 137 138 139 140 141 14? 143 FORM: PROCEDURE {TYPE, II: OECLARE SK I Pj RETURN; ENO; ELSE IF I EXPCB19, J*lB» ,8 IN F j X EOI 1 51 , 8 IN FIXE0I15II, OVFVAL ENTRY (BIN FlXEDI15)li 1*7 3 COPYGRPT:PROCEDURE II.IP.J.JPl! /*J,JP BY NAME*/ 148 4 OECLARE (I.IP.J.JP) BIN FIXEOI15U 149 4 FXP! IP,JP)-FXPIl,JI J 150 4 C XP< IP,JP»1B)-EXP(I,J*IB» J 151 4 JP-JP*10Bl 152 4 IF FXPI I rJ+lBKOB THEN 00; 154 4 1 J»J*10B: 155 4 I TFRMS-TFRMS*1B; 156 4 1 TLOC(TERMSI»JP-lB; 157 4 1 END; 158 4 ELSE DO J»J*10B BY IB TO EXPI I , J*l B) ♦J*10B t 159 4 1 EXP(IP,JP)«EXP!I ,JI : 160 4 1 JP«JP*1B; END; 162 4 END COPYGRPT? 163 3 MH0LEDIS:PROCEDURE (II.L1.L2) RECURSIVE; 164 4 nECLARE NDISFLG BIT! 11} 165 4 OECLARE I1I.CLSI BIN FIXE0I15H 166 4 OECLARE (L1.L2I LABEL! 167 4 CALL M0CMP(U,FRR0R»1 168 4 NDIS-BPT-W0-1B! 169 4 FXPt II,LIM|«CB! 170 4 FXP( I I,LIM-lBI«OBi 171 4 EXP< 1,MP*W2)«-1B; 17? 4 H-U 173 4 TFRMl: FIK-O; 174 4 DC K*OB BY IB TO H2-1B; 175 4 1 CLS«EXP(I,K>MPICM; 176 4 1 EXPU ,K*MP)«EXP(I ,K*MP1-CLS; 177 4 1 FIN-FIN+EXPI I.K+MPIi 17B 4 1 IF CLS>0 THEN 00! 180 4 2 NOISFLG-IBi 181 4 2 JJ*1B; 182 4 2 JJP-1R; 1«3 4 2 CALL ZGRP(II,NDIS,EXP( II, II, WO); 184 4 2 OISTRIB: IF EXP! I I , J J >«SC THEN IF NDISFLG THEN 00; /»IE NEVER OISTRIB*/ 187 4 3 DO JJ-K*11B BY W2*11B TO EXPI II, OBI; 188 4 4 FXPU I,JJ)-EXP!II,JJI-CLS! END! 103 /* HTFXP */ STMT LFVEL KEST 19C 4 3 191 4 3 192 4 3 19? 4 . 3 19? 4 3 196 4 3 197 4 3 198 4 3 199 4 2 20C 4 3 201 4 3 20? 4 3 203 4 3 204 4 3 205 4 3 206 4 3 207 4 3 ?CR 4 1 209 4 3 DISTRIB1 210 4 3 211 4 3 212 4 3 213 4 2 214 4 2 215 4 2 216 4 ? 21« 4 3 219 4 3 220 4 3 221 4 2 22? 4 3 223 4 3 22* 4 3 226 4 3 227 4 3 22R 4 3 229 4 2 210 i 2 ?32 4 TFRMEND: 234 4 1 235 4 1 716 4 1 239 4 7 24C 4 2 241 4 ? 2*2 4 1 243 4 1 244 4 1 245 4 EXP(I,K*MP)«CLS*EXP(I,K>MP); FXPUI.HMI-EVALI ll.NOIS); FXP(II,LIM-lB»«lBl IF TERHS«1BCD P>0B THEN GO TO LU_ r.fUMriO(EXPm,L!M>,W2M EXP«EXP» RETURN; END! ELSE DO; IF EXPU!,BPT-1BJ — 10S THEN GO TO OISTPIBl; " /*IE ALWAYS OISTRIB*/ FXP(II,JJP»-EXPUt f N0IS-21; /"SIGN*/ JJP-JJP*2; CALL CREATEGPnitJJPtW2,EVAL< U.NDISI I ; /*m-F0RM*/ EXP(II,K»JJP1-EXP(II,K*JJP»*CLS; JJP»JJP*lB*M2j FXPU I.JJP-IBI — IB; FXPH I ,Lf M-1RJ-10R) EXPIIl.JJPI-SC; FXPU ttOBI-JJP; GO TO TERMEND; END; HT-EVALJ II,JJ*2U EXP( II,K*JJ*2I-EXP< II,K*JJ»2»*CLS; FXP( II, JJ*10B*W2> — IB; IF EVALI II,JJ*10BKHT*W2 THEN 00; CALL COPYGRPSIII.II.JJ.JJP); /*DI STR I BUTEO*/ NDISFLG-OB; FND; ELSE DO; GR«M0D1HT,W0I ; /• NO OISTRIB*/ FXP(U,GR*NDIS)-EXP( 1 1 ,GR*NDI S ) *EX2 ( HT/WO) ; IF EXPIII, JJI—FXP1 It,NDIS-2) THEN EXP III ,NDl S-21-A0; EXPUI.BPT-lBl — IB; jj«jj*1 1H*W2; FND; GO TO OISTRIB: END; END; IF F1N«0 THEN DO; FXP( !,MP-10BI«EXPII ,MP-10B)*B?0; MP--1B; IF TERMS-IP. THEN IF 0P<0B THEN DO; CALL COPYLINEI I I. I.1B.EXPI I.OBII ; RFTURN; END: ELSF GO TO LI; GO TO L2; END; M«M*M; 04 /• NTEXP •/ STPT LFVEL NFST 2*6 2*7 Gn TO TERH1; END MHOLEOISl 2*8 3 FINDBIT: PROC EDUREI CLS f K, CNT I } __ 2*9 * OECLARF (CLS.K.CNTI BIN FIX 10(15)1 ?<50 * CNT-OBt 251 * AGAIN! FIN-OB; 252 * DO K-OB BY IB TO W2-IB; 253 * I CLS-TMPIKICM; 25* * 1 IF f.LS>OB THEN 00 J 256 * 2 TMPIKI-TMPIK1-M; 25? * 2 RETURN; 258 * 2 END; 250 * 1 FIN«FIN4THP(KI; END; 261 * IF FIN-OB THEN RFTURN; 263 * CNT-CNT*IB; 26* * M-M*M; 265 * GO TO AGAIN; 266 * FNO FINCBIT; 267 268 260 270 271 272 27* 276 277 278 270 2B2 DVEVAL FTNOHT: IF END FINtHT; PROCEDURE(DH» RECURSIVE; OECLARF IDH,GR,NH(0:2l,CtS{0:2,0:2M B IN F IXEDt 151 , FINDHT ENTRYIBIN FJ XFDU 5) , L ABEL I ; PROCEDUREtN.L J; OFCLARF N BIN FIXEHIIS), I LABEL; CALL FIN0eiT(CLS(N,0l,CLS(N,lt,CLS(N,2)l; IF CLSIN,0)-OB THEN DO; IF TFRHS>0B THEN GO T(5 Lj NHIN)«DH*1«S END; FLSF NHIN»-W2*CLS(N,2»*CLSIN,1); TER"S>OB THEN IF NH(N I >-EXP( I I, L I«l THEN GO TO Li /•JAK NEXT S •/ 2B3 2B5 286 287 288 200 201 202 203 20* 205 296 297 298 300 IF TERHS>OB THEN 00; ii-abs(exp( i.Tincmn ; HT-CSCRPHTUI1; FND; IF MP<0B THEN DO; DIVDIST: /*DIVOIST*/; CALL CREATEGPUTtlOBtW2>EXP(1I ( lfM|*WGHT(3l It FXP(IliO>»W2*100Bt EXP( II.FXPI II.OII-SC: RETURN; FND; GR«H0D(DH,W2 JJ ITFRATE: 00 K-OB TO U2-1B; TMP(K»-EXP(I,K»MP1; END; PUT OATAITMPl; /* HTFXP •/ STMT LEVFL NFST 105 301 302 303 304 307 309 310 311 312 313 314 315 316 317 319 320 321 322 323 324 325 32ft 327 320 330 331 332 333 335 336 337 338 339 340 343 344 345 346 347 350 351 DVEVAL1 : CASFA: CASE*: DVEVAL2 : CASFC: 0VEVAL3 END DVEV M* IB * call'findbiticls(o,oi,clsio.ii,clsio,2)I; NHIO|=w2»CLSI0,2»*CLSI0,ll; IF TFRMS>OB THEN IF NH101 >-EXPI I I .H HI THEN 00; IF DH THEN GO~TO DIVDIStV'' K-MOOIEXPI ILL IM» ,W2> ; PUT LIST! 'OVEVALIM; EXPIl,K+MPt-EXPII,K*MP)»EX2(EXPIIl,LlN)/W2U EXPI I.TLOCIU »— IB; EXP( I, TL0CIl)-l»«EXP ; IF P0INTB20 THEN FXPI I , J J I «P0 INT J ELSE FXPI! ,JJt-POINT-B20; J«JJ; _ _ GO TO fOCMPl; IF TERMS>OB THEN DOj II-ABS(EXP(I.TLOC(TERHSI)l; HT-CSGRPHTIII l| 00 K-TERMS-1B BY -IB TO IB; II-ABS(EXP( t.TLOCIKIII; IF CSGRPHTIMUHT THEN 00; N-TLOC(TERMSI; TLOC ( TERNS (-TLOC IK 1 ; TLOCIKI-NJ HT-EXPIII,LIM); END; FNO; 1 I ■ ABS I EXP I I . TLOC I TERMS ) ) I ; IF MP0B THEN GO TO AODll GO TO M0CMP2; FND; IF DP>OB THEN CKDVl: 00; N=EXPIDPtO); 00 K»W2*111B BY 108 TO N5 IF CSGRPHTI ABS(EXP(DP,K) I )>CSGRPHTI ABS(EXPI0P,K-19B1)» THEN 00; M«EXPII,K); EXPH ,K)-FXP( I .K-10RI ; EXP ( I.K-108I-M; END oroer; N»N-10B; IF N>W2*110B THEN GO TO ORDER; IF FXPIDPtO I«100B«-W2 THEN GO TO CK0V3; /* HTEXP */ STHT LEVFL KEST 107 423 3 1 425 3 2 426 3 2 42 7 3 ' 2 42« 3 2 429 3 1 431 3 2 432 3 2 434 3 2 435 3 3 436 3 4 439 3 3 43-10B; EXPlDP.EXP(f)P t O)l-SC| END? IF EXPT END; M-1BJ CALL FINOBITIN.K.C); HT»W2«C*K; IF EXP(OP,0»»100B^W2 THEN GO TO CKOV*J 1I>ABS(EXP(DP,EXP(DP,0)-1B)I ; IF CSG*PHTim<«HT THEN 00; GR-MOD(EXP( II,LIMI,W2I; EXP(0P,GR*llBJ-EXP — 1BJ GO TO CKDV3A; END; CALL COPVLINEIOP,HAN*lB,0Bt0B»; EXPIHANM S,K*UB»-EXP; IF CSGRPHT(H»N*10B»-CSGRPHT THEN DO; EXP(OP,K*llB|wFXPIDP,K*llBI-N: CALL FlNOntT(N,K,C» ; IF EXP(DP,0»«100B»W2 THFN GO TO CK0V6; IF CSGRPHTI I I )>W?*C*K THEN DO; EXP(OP,K*llB»»EXPinP,K*UB»*N; GO TO CKDV7; ENO; call creategpchan*1b,ub,w2,ht); exp(han»ib,ii>ad; exp(han*1b,w2+100b)-mu; expchan«-18,w2*101bi»-ii; exp(han*1b,w2«110b)-sc; fxp(han*18,0»»w2*110b; EXP(HAN*lB,LIM)-OB; HT*CSGRPHT(HAN*1B»; CALL COPYLINEIHAN*lB f II,0B,0BI; GO TO CKDVl; END; JJP-1B; IF MP>OR THEN DO; JJ«HP-10B; /* HTEXP */ 08 STMT LEVFL NEST 482 4 483 4 484 3 486 4 487 4 488 4 489 4 490 3 491 3 492 ? 3 49 3 3 3 494 3 3 496 3 4 49? 3 4 49 8 3 4 499 3 5 501 3 4 503 3 4 504 3 4 505 3 4 506 3 3 507 3 3 50 8 3 3 509 3 3 510 3 3 512 3 1 514 3 1 515 3 I 516 3 1 517 3 1 518 3 I 52C 3 2 521 3 2 522 3 2 523 3 3 524 3 3 52 5 ? 3 526 3 3 527 3 2 528 3 2 529 3 1 MDCMP2 530 3 1 531 3 I 532 3 1 533 3 I 534 3 535 3 536 3 53 7 .3 CALL COPYGRPSII.HANMOB, JJ.JJPI J END! IF TERMS>0B THEN 00; JJ-TL0CU> -1B> CALL C0PYGRPSII,HAN*10B, JJ.JJPll EXP(H»N*108,JJP-10BI-Mlj; END; EXPIHAN+10B, ll-AD; EXPIHANMOB, JJP1-SC; EXPtHAN*lOB,01«JJP: EXPIHAN^IOB.LIMI-OB; IF EXP(HAN+lBtL!M|>CSGP.PHT(HAN»10BI*MGHTm THEN CALL OVEVAL100B*W2 THEN GO TO CK0V4; EXP(DP,LIM)»EVALIDP,11BH CALL DVEVALIEXPIDP.LIMII; EXP( I,DVPT1-DV*B20; EXP(I,DVPT*1BI*-1B; IF TERMS*1B THEN DO; IF MP"SC; FND; ELSE CALL AODGRPSI 1 ,0P, J, I IB ) ; END; GO TO M0CMP1 ; SMC: EXPI I ,JPI»SC; FXPI I,C)»JP; IF TFRMS*OBEOP-*0l IF FXP( I , JI-AOIEXPl [ , JI«SU THEN CALL ADDGRPS I I . 1 1 J, API J ELSE CO TO ERROR: GO TO ASCMP; PUT EDITC ERROR AT ROW • , 1 II A( 141 , FI ? I I SKIP ; GO TO ASCMPl; 6?1 3 FNO CSGPPHT; /•BEGIN BEGIN BLOCK STMTS*/ /• E.G. WEIGHTS: ADD WT«SUft WT; MUL WT
I THEN PMSMIRI-IB; ARYAIR, CI-INSTNGIP); GO TO CLR; END; IF INSTNGIPI-MINUS THEN 00; IF C > 1 THEN PMSWIR1-1B; ARYAIR.CI-INSTNGIP); GO TO CLR; END;" IF INSTNGIPI-RIGHP THEN 00; ARYA IR.C l-SEC; C-BKCOURl; R-BKROW(R): GO TO CLR; END; IF INSTNGIP1 — SEC THEN 00; ARYAIR, Cl-INSTNGIP); GO TO CLR; END; ARYAIR, CI-SEC; INSTNGIPI-ZERO; GO TO ASEM; CLR: INSTNGI PI-ZERO; GO TO LPl: ASEM: P-3; R=l; C»l: LOC-3; INSTNGI31-PLUS; TEXT: IF ARYAIR, Cf-SEC THEN DO; DIVSWIR1-0B; MISWIR1-0B; IF R-l THEN DO; PMSWID-OB; INSTNGI P l-SEC; DO 1 = 1 TO P; STRNGI I ) = INSTNGI I I ; ENO; GO TO OASEM; ENO; IF PMSWIR) THEN DO; PMSWIR1-0B; INSTNGIPI-RIGHP; P-P*l; END; C=BKCOLIR»; R-BKROWIR); GO TO TFXT; END; IF MISWIRI THEN IF ARYAIR, Cl-PLUS THEN INSTNGIP l-NINUS: FLSE IF ARYAIR, O-MINUS THEN INSTNG I PI-PLUS; ELSE GO TO TRANS; ELSE IF 0IVSWIR1 THEN IF ARYAIR, CI-MULT THEN INSTNGIP l-OIV ; ELSE INSTNGIP1-MULT; ELSE GO TO TRANS; P«P«-1; LP?: IF ARYBIR.CI > ZERO THEN GO TO POS ; IF ARYAIR, Cl-MULT | ARY AIR , C )»0 I V THEN GO TO MO; FLSE IF ARYAIR, CI-EXPO THEN GO TO FXP; PM: T«C*1; IF ARYAIR, Tl-PLUS I ARY AIR , T»«M INUS I ARYAI R,T 1 -SEC THEN DO; R--ARYBIR.CI; C«ll IF PMSWIR) THEN DO: IF INSTNGIP-11-MINUS THEN DO; PMSWIRI-OR; MISWIR1-1B; P-P-l ; END; IF INSTNGIP-1 l«PLUS THEN DO; pmsmiri-ob: p-p-i; ENO; GO TO text: end: P-P-l: GO TO text; END; /• HTEXP */ STMT LEVFL NFST I 12 772 3 77* 3 777 3 1 779 3 1 780 3 1 781 3 785 3 786 3 788 3 790 3 791 3 79* 3 1 795 3 1 797 3 1 798 3 1 800 3 801 3 803 3 1 806 3 ? 809 3 1 810 3 1 812 3 ? 818 3 2 819 3 2 825 3 2 8?7 3 2 828 3 2 830 3 2 831 3 1 833 3 1 835 3 1 836 3 837 3 839 3 8*1 3 845 3 846 4 847 4 84P 4 854 4 855 4 856 4 859 4 1 86 3 4 864 4 867 4 1 870 4 I 874 4 1 875 4 1 C4SEM: ELSE IF ARVA!R,T)-EXPO THEN GO TO EXP; ELSE DO; R--ARYBIR.O; C-l; IF -PMSWIRI THEN P-P-lBl GO TO TRANSl _ END: EXPt R— ARVB4R.CI; C-l; PMSW1 G PMSHIRI THEN DO; INSTNG(P»«LEFTP; P-P*l; INSTNG(PI-PLUS; LOOP: END; IF ARVAIR.O-MINUS THEN DO: IF INSTNGI PI-MINUS THEN 00: INSTNGI PI -PLUS; LOOP; P«P*l; GO TO LP2; END; IF INSTNGIPI-PLUS THEN DO; INSTNGIPI-MINUS; LOOP; P*P*1; GO TO LP2; END; IF INSTNGI LOO -PLUS THEN INSTNGI LOO-MINUS: ELSE INSTNGILOCI.-PLUS: P»P»l; GO TO LP2; END; IF INSTNGIP1-PLUS I INSTNGI P (-MINUS THEN LOOP? P-P+l; GO TO LP2; fnd; instngipi-aryair f o ; if aryair.o-plus i arv air ,c»«=minus then loop; P-P*l; GO TO L»2; POS: INSTNGIPI-ARYBI R,CI ; P-P*l; OOl; GO TO TEXT; BEGIN; DCL II, J, OP) BIN FIXE0I15I; OfL S BIN FIXEDI15) EXTERNAL; P-2; 1=0; J-o; G0R0W--1; op-plus; S-0; PLl: P«P«-l; IF INSTNGIP) < OPANO THEN DO; ARRAY! I.JI-OP; J»J*1; ARRAY! I,JI-INSTNG(P| : J-J-H ; GO TO PLl; END; IF INSTNGIPI-LEFTP THEN DO; ARRAYII ,J)«OP; J-J*l; ARRAY! I.Jt-GOROW; T-I ; I--GOROW; bkpowi i+il-T; bkcoli 1*1 »»J*l; J«0; op-plus; GOROW=GOROH-l; s=s*l ; GO to PLl: end; /* HTEXP */ STMT LEVEL NEST 878 4 IF INSTNG(P)»RIGHP THEN 87Q 4 no; ARRAYII,J1«SEC; J-BKCOLM + 1); 88? 4 1 l«BKROW(I«-l); GO TO PL 1 5 END; 885 4 IF INSTNG1 PI — SEC IHgN. 886 4 DO: OP-INSTNGtP); GO TO PL1; END; 890 4 APRAVd, JI-SEC; 891 4 FNO DASEM; 89? 3 END STAND; 891 ? PUT EDIT! «HAN« • .HANI I A, F( 5 » I ; 894 2 DO I«0 BY 1 TO 19: _ _ 895 ? 1 PUT SKIP EOIT(I,ARRAVU,0>,( ARRAY(I.J) DO"j-f BV 1 WHILE! ARRAY! I,Jl-«0 E ARRAY (I, J) — SEC M ,SEC) ( 1 20 I F ( 4 M I ; 896 ? 1 END: 897 2 DC l»HAN BY -IB TO OB: 898 ? I DP J«OB BY IB WHILE ( ARR AY( I f J I-.-SEC » { 899 ? 2 EXPIR19, J)»ARRAY( I,J» ; END; 901 2 1 FXPIB19, JI=SEC: 90? 2 1 EXP( I.LIMI-OB: 903 2 1 J'OB; 904 ? 1 JP=1B; 905 2 1 NXTOP: EXP< I ,JP1-EXP(B19,J»-E0U: 906 ? 1 IF FXPIB19, JI — SEC THEN 00; 908 2 2 J'J+IB; 909 2 2 JP«JP*1B; 910 ? 7 IF EXP(B19tJX0B THEN EXPI I , JPt-EXPI B19, J ) ; 912 2 2 ELSE 00; 913 ? 3 EXPII,JP)»WGHT(100B»: 914 2 3 EXPI I, JP*1B)»10B: 915 2 3 00 JP»JP*10B BY IB TO EXPI I , JP )*JP; 916 2 4 EXP(I,JP)-OB; END: 918 2 3 EXPI 1 ,JP)»wr,HT(100BI; 919 2 3 FNO; 920 2 2 J*J+1B; 921 2 ? JPxJP*lR; 92? 2 2 GO TO NXTOP; 923 2 ? ENO: 924 2 1 FXPU.OBI-JP; END; 926 2 MHT=CSGRPHTIOB); 927 ? FXP(0B,LIM-1B1«0B; 928 2 FXP(0B,LIM-1CB»-0R; 929 2 PUT EDIT!" HFIOHT = • ,MHT I I A ( 9» , F (3 )> SK I P; 930 2 DO 1*0 BY 1 TO HAN; 931 2 1 PUT EOITII t IFXP( !,J) DO J» 1 BY I TO EXP(I,0B»), 932 2 1 (EXP(I.J) DO J»LIM-2 TO L IM ) |( FI 3 » , X 1 2 1 , ( 160JF ( 51 1 SKIP: END; 933 ? FND: 934 1 FND HTEXP; 14 z ■* Z » I I eo V ci _j ♦ »- in => r» — ft- OC *• uj • z it UJ 00 ► IT IT .— — .O 5 * UJ — X -o u. ■* • -o iv m )- oe » z < oo < Z «> ft- — IT vi m\» ft- u. or — m m • o -r jv ir • m at* it n < cc — - Z » — fg ~iir O i« CO- UJ ft- -00 X - O » —oo uj ir u. o Z » • P- U'OD > » M. a it • K -I IT 4 ff m U. 4- • CD >• » or m • 00 o o Uj IT z • o o — 0" < ft -I ft 00 O -0 ft- — <»• z ft- * LU CO < it i or X <\J UJ U"N C — ir • •00 •Ift- olir i — « «0 i— l H- » «0 ft- M 3 * eo o >c o -o o o o in m it in * — ■C ft- '*• « ! I I I ! ° i© m ir If » • if- |Z Z — an ° o 1 >m !«-> u (uj - -i _i \* * U! UJ O IX 00 00 — wrf ■« < -1 ft -J _l < * | • in 'H- K O -* Z Z '••ft • l u UJ |hK| X X l« |co '• * ft- Jv| 15 •c ^- • * «D r- , c\ m eo irv « f>- «« »u m: 0D IT ■r o O « a m • f- •* •I- IT IT.' ^ r>- 00 » » ^ o o V * r» P4 1 o o r- * *■ UJ IT IT. UJ * r- oc ' K ~< K H «« ftft o 0- IC r- o u. O D u. ft •0 •* «! ft • UJ UJ ft u-\ •e *- 00 If > X X >■ —J » » ft ^ or a. r- in ►- •9 IL u. « o « ■+ m a Z ■ Z UJ •e ►- ec UJ V > ftai X » * • X • 00 -C Of or 03: •e m «i 1 ■* « < ft li- •O -T CV li- i uo _j ir. z Z _)' ft <0 K ■ ft UJ « «; >- *i > »- 1 h- ►- ^» ►- K ►- K >- H u «M OP as _l CD rvj ftft . or IVJ 00 o or Z 1 z z Z z Z z Z Z Z z >- o* « 00 ft 00 UJ ft* »•* O CD o OS ft* i z •a r- ac z *- ►- t- i ft> h- K •- H K ft- a Z UJ ft UJ Zj KO l/l «J1 l/> •n wn <"i UJ r- Z -- z o 03 ^ * O: ■3 z z z z Z Z X Z ;z z u. ft •o o r- o f- *i » in m O » o o o o ° 1 o o O o o UJ o m zj c oe ° o C ft <\l ft <\J 191 •is irvo t»l O -C uj ! UJ UJ i UJ UJ | UJ UJ UJ UJ UJI z ft* m O r» o r- — I ,ftft ■o r- ►-■ ftft * eo i 00 03 03 00 ! 03 00 09 109 09' < -1 » • ■ _i -1 » • *' -1 * < < < < < « •1 « :< «' u-l ft .0 « eo < » • • -0 f- r- 1 o ■r K ►- >- *- K ►- ,K ►-, |l- ►- K ft D ■i o » ftftj * ftft * ft » ^- » z z Z 1 Z Z z z z Z Z O t- 1^ K * ft- IT rj-'o ft- O CO 00 t- s UJ 1 UJ UJ UJ h> UJ UJ UJ UJ UJ UJ or < — 3 <0 3 -0 « O < ir o oo 4 X X X X o> X X X X X X H X C\ < -0 « >0 X ►- h- K t~ » K ►- ►- ¥• K ¥■ >- >- -* o m o m f- 00 ►- M «n ir ►- a « — « * « » « -0 « J3 < ■# « IM * -< « ►- •«' »- * k ►- ►- t- o K »M K IT ►- — K * < < ^J — -c — •c « '-< «J •c r- r- « ■r u) m l/) (O crt in «/) m Irt * t/i m «» * >«• ul IT t« * I 16 1 2 1 1 — cn •- ■» | o • o ■ • cn UJ * i Ul o i K r~ •- IT * u- O cn » t* c IT U> t« -| IT U% ■D 1 IT ■# or » » — < —* x« M » 1 » » « CM m w — ~- •• «* *x -Ji c*v cn ■c i z o ■»• O o O 001 a o «! 1 O CM IT »« IT m UJ Ul UJ —l UJ UJ — . IT IT -♦• CC * • X X X —1 M X t- » p * • ru m — ■* *" ^ — - \ w UJ r- UJ O i/> iUJ cn pn » KN * m ' M » •" -J in !_j -J « -J u> * 1 til P- 1- p- !z > > > • V -ll >-' >" - C5 • ;u O • o • » \w- 2 Z z Z 4) 19 on o at K IT a: - m « r- Oi z ■r- ;z z ir Z P>- O K- Z < < « a •- P~ m < « O < Ol m « M UJ -•ncS "• IT "• ^J CM « UJ p- p- i- «/> ♦ t/> * tjRl ► l< » 1 o - ► - . — vo UJ Z z z — • CO •c 00 1 00 O OS CO r- H>'«\ lo »- m «- IT MJ ^4 IS u. o o O o- iyr"n » 1 . o >«• •■ ■ M ■Ic «• « « tr < CM O «M lO UJ o o u <« I— IN cn o OH o o o o •* « O tr ° O * O* UJ « |u a. !•" » UJ j UJ » ui H UJ UJ ■l •i -J • _i -J » -J • X ► _J i-j -1 <*■ « CM .o z 1 Z 00 * H Z; w Z IT all !_j o LU UJ UJ CO Is r~ pa o O CO O «N O -0 o x< « ► as i • -o » cr u. •J 'UJ z 09 !■ CO « lO eg ci kH 1 *- —t IM Ol t— IM MM x* z -j -r ,-j -J cn -J cn « -r i* o < < < » •" * m -J -J, - - 1 H -J » -J ■ et « » !< < • « • J> « ,« -J -j _j r- P "" IA < *• «'C0 « H < m < r~ UJ X CM !X X (0 X — ee -t> i_J t/> r- !« r- ro * ir ► p*- • f«V ► *M CM UJ UJ, • Ui UJ • UJ • H p IZ 3 UJ UJ uj o • i ■— u-* >- "i l- -0 :SJ - O c* OO o mi O 0> o * ■ m lut eo X I X K O r- rn UJ in < P» m Ul « ♦- u * »-4 • • * »f" • o c* p~ ,X M UJ UJ UJ -o » PM m X N X •" X ©' X . If! > CM > «M >• cn Uj a t- K P- » f\J » » « J o ► O «' < » o » 1- OC » Of M ec - ec - or p K 1- < — < IT < ec u^ oc eo »- IT K - oc OD K « • H tn H- -c p- * *~ OD t- C f- ir O P- 1" « #1 3 r- 3 ^)i < •r 30 h» z -« z C- Z CM Z m Z >* UJ IA UJ -. UJ «< UJ CM t« ^ IP) h- UJ a a a z 19 »- or or ptf UJ X < 19 (9 _J p- 0. X X > V > < ec UJ i/i '« o U o u U u u u <_> u u U O « * • • * • * • * • • • • • • • # # « • « • * • * * * * # * • * • * • ' * * * * • * • # # # # # • • * # * * * * • 1 • * * * • • • • 00 • 0> • * * • « i i * CM' L o * « er >»•, o- ■cj" * |* ec iO 0» .* CM CM -* CO «M 00 "* *: \ cn * IT i" 117 o o » * mm «n O en UJ (•> X » — o u_ en «/) • en UJ > • o oc e- Z < •• ff- uj a n H- K u. » 2 3 t- -c uj CO U* *- X k> X CM UJ ec « » h- k- ot f- « » »- < ^s ►- cm < a. cm i/> cm i/> m a CM * <»■ * in CB CM * ■* •* IT CB O en K • N* IT •» * mm m » — -«'f- O IM ■* UJ * -# •I » • ©!*■ IM X IT CB IT IT «J» » * f- m IT • • IM I*. in If » » ir co mm K ir in •* CO -I !«■ m m * «4 -« r- W IT (M (D — JJJ m in ff- CB o * in in CB N o * in in o c m m m m cr o m m m • » •c m * in a. < Men * m « — t- o (/I CM I/) 1^ CM ■•CO O ' ►- © CO . 1/1 en — I ■ in, en mj* cmcm mien CMI* »i • CI,-* f> en «o in 4- m o •* ■c m »-| » o in cb « • m • om a: ► to z a IT < CM UJ 4- r- »- t~ >■ K * Z Z Z ac ! Z| o <, < « < z ►- z i/)' i/t t/1 «a 3 z; Z z CO , z ° a o • i o \mi i u u lUJ i u h. j i J i -1 is | -1 _i UJ| Uj I UJ UJ u CD 1 CB co 1*" CO z <'' < < -J « i« -J: mi -I o mi i™ H ¥- 5 ,u m e- k Z z ■- M Z UJ UJ UJ O UJ iK r- uj!h- • X 1 Z in X « —< X CM O UJ uj en UJ X m uj m »-, K ► K o * k! « K < « * o << a h- -o < > >j > >■ l- i/i >• > > > UJ UJ UJ UJ o. IM IM »-. mm •-< ft. >• > > > >. > o o a • * • • « # • * a o • « » • # * : i • a! *l * • • ♦ i • * *: a o 1 a Qi 1 i | D ft ft ft ft ft ft ft ft ft a e>; o e- m i r~ *l m r»l in o *■ en 9- * <* o>! ■0 ° en •*\ -t CM m ** CM •o !•* -i CM -1 iin en \m* IM < CM » I- •c z IM UJ -J o ► o» a. m X — x -*! UJ ■£, I- ► < £i I- c 00 IT O ft" in o -* • ft in * — > * — o o — UJ (SI X ► • CM « o Z IM (B ft" » » - «0 O «> IM ■C\" -f o — o o o "- ft. •••■•(••(••(••» " ft-m o r> 'Co ft* C •* ce «r i m o CO — *J — -f'-C © « o» © • MNNin i*'* *!* * «n urvc *- •' ft •»•!••••,••■ • - r- o e> — >o — * 3 o« co -r-i* ifl CM * in fil«" « O f» O MIM »" ►. o fi -< « -< ft « o •* c!o ft .-cm (Mm mi* C* tr m injc c .c im ■#■ f. sic ft • cm C -r — — ►- o m — »r> o i"«oir»oiii — im im m inU -# .» it m ft* c » IT O -C IM C0|-< IM CD »««\»M C IT CJ) C fl I r- O »M M ■# O «K\ i «o n — -O IT. ft" f-;0 C •* •*■ m •* #|0 fM m C IT CO IT CO • _ r- c cm -4 •- fM IM » »- — CM r^ fi * ♦ •* * IT m 43 C .0 o » » l«04« ■4 evo r- * fl * C lf> f- -0 pri r- C CM — t <*.0 N^»«\N "< fM IT fM •O C «0 •» _ fllC f» O CM ft C * * in i-4 -O 0» IM ft" *|C NUMjIflr- ft* fM • « — ft" fM m fllf * * + lMf *J 1° — !o UJ ,X C •» c c * « in cm a m ft ►!.- in C o >-' » — — fl c -o « fM O fl'C fM f> fllf - f" a. -r < « Z (M ftft fM CO "i z - ui e\ z ir ir -* O fl fM fl fl f- * CO O fl »'-< -" C CMICO CM CM m ft C CM fl O C CM CM CM fl CO -< CM CM O C CMll> fM CM ft'fl cmI»- i CM - »* •♦ * * m . • » » » > in'f- 0'in -« 4 f- »« i <#'* * in •'ft * • * -A CI* "*' * #ftt # •* * in •c fi jj f» -< ft< IfJJJ- m — in r-1- - m*-o c ». » •; fti'C * f-,o - inj< c *io — rn.cc ec,< — JJ:0 •- m -c o- ft! ft ft 03 -0 — >o!o - mi« c H ft ft f-im o •e o - U-, JJ (J. Z • UJ f « CO IA P>- >. c — >■• ft tt c « tn O "^ UI • O fl ft« CM I— — -I ft -J ft* I uj ft ^; ft j"» ft- I . IMi Ift- f» It- ft I- p- x— x — K w O » O—' < K, »- m iP » tw #■ 3 K '3 f>» » t- IX at ft. Z c o _1 -J a UJ at or at «s < CM a X Q a ' a. at > > X X UJ |M UJ UJ UJ UJ UJ UJ UJ K X # * « * * * •» * * ' * V. a z -j o- in c * # * • * * * * « * * • * » • L) CO * ft" eo a « fi O in ^« « CO o- ft ,CM * * » '• 19 h- 1 4> ; p | ■ •« • ■>> 1 ft" <• * 1 i •o » ft, i ft IM O "4» ■ 1 l« ■0 «n ft- J •m; •* e •«■ i 1 «!. » • ft I t ft 1 »n r- f- «< i> o> -O i ^ft * oo * | •0 • • ft *m ■r o i 0>l r. e> * I o- © •* 00 * 1 oc *■ • • • j ft PK © IM i k * 1 CD © * m <» i 00 1 «B » » * i ft j «M ■- ir\ M 1 K K lf> rr, | OC O JJ- U> ■* 00 ) nO MM ft ft * o — — IM M" (^ •* in IT CD in. in -c in V ► • ft* •* m>\ •" — * » r» ft* — f- ^ et ao o» e* M ft» o ** j ^y in w O <© a o ■* o <«» O 00 o O r- O' Q O- a UJ CD Ul Ul •*• Ul z <«■ * UJ -4 x "0 ".Set •C o> o UJ UJ UJ U. IT u. u. •*• U. u. ft- U. Ul u. «o u,, u. 6 u. IS> •J _i _J • 00 •1 « * o •*.+ ft tr\ • i'-J » ao ft! •e UJ 19 O- o h-i > » Jvi >■ • Jftj UJ ft • > » » 49 >- » >' IV » V VJ Z z z Z ac ao ec a ei or z -c * oc m ec 'z ■ ao oc 0C 06 9- at o> z »rt « < .0 •*i < IM « «»» < ft- < (MM « •c <\ < o» — « in UJ Id l/> to ►-' Z <« Z' Z M Z[ "-•«••«■ Z m z iiA z OS z; Z m « z - Of «/> -J ft ft » ft" fti ft UJ K ft-: ft- Z| ec oo oo! eo m 00 < *'f\ ao 0 ft ft 0> « ft in UJ O o: a o o «■ •- « M « CO ix « 0» < K < tn'tn «© 1/1 ft- l«1 MM MM • «0 ft ■ • *» ft «M ft * • m ft v" ft a — • in UJ u * u u h- o -o <-> O ~* Ol o -r N, ac ft l«J o CD ojr- ac mi *m\ mm an o> ft" » Ul f- ^ M> ft •— 1 •> UJ » « UJ ft Z> O (V O 0> o o UJ 1- ft< ft- K * l-i I- * o k -e i» «» 1° V *- i- r- 1- •" UJ ■*■ ft* > m >- n > •o Ul x «o X x — x : CD X •#■ * X :>• X CD x!»» X in ac o ft o, * 4 »' » « ft t- i- — t- * ft- m < IM 1- «M t- - 1- » K!<0 ac IM <4> ac r» ft- Z f- z UJ rr. *» c»fM ►- ' • 00 CI o> no 43 (SI r-1 «■> in it f> 43 C -* r*- CI CI 0- IM 43 fM m e IT. IT ft • ft 9 m ft* — * ci ct et (SI -c fM m r^ IT IT * IT c »- -* ci m CD _4 * (Si pi 1" IT in * (SI OB — in m (SI CI e in IT •—I o ■c e* -1- r«- in © r- o o o cr (Si OD OD IT » m • » ft m n (^ * in IT —» IT m in _■ *> » » • « p*. n ft* m4 ft* ** ft » » ft » V •>. >- >- ft" a OWN •*■ in a. _4 or ac oc < (f (V 43 — •*■ < r»! «* in 3 3 I «* ci ci'«r in — 43 -. .UJ .3 O — ■ PI CI * BMfi Ul • > ft z -a * IT o f~ — 43 — ,♦ OC H-nri 4iMn in o» 4)- OD 0D Ol OS » • mi > o Oft*; o o Ul »: Ul ft* Z » Z • O fM O 4T — -*l — 0» ft. 001 — »- in < r» X -« Q » ►- -* i f-1 «-ll-« -* >0|O i«> -n;*» m od'o ft* in c pi «v*t m m © -* mo f\ o * ! Z ! I z 03 i ft o Ul I 'z O 0D ft. «| ift? 0,<£ ui in u in a* in i- i j ! o e> o o* ft-o in OD >* oo o o- Ul OD U. 0> > • • ■he —I I* 0> oa. in -« »o> m O «D 6-1 Ul' ft ft 1 z in - o e> l»>: -- oo a r«**i o * ;mn 182 if i-* nj %i , » » nj in o> — -ft fM • ft ■•* -» e» -* ft* M -* •* e> •* *-l fM o • o . 00 -* m — fM ft*; ft ft OH O Ull>* fM XI » • ft».*> 00 U. 00 o «.» fM >\ • * 0. fl •>■ » < 00 OlWI Zlftft MJM go \p\ m'ftS vie on Q|_. (SI (M Ul • ft ft S'O mm Ifft O fM ft. •* fM fM mi 5* 2^ o o CI •# o * Pl-t o ml*j •o.-)- m ci * — ci * o o- 1 * Ul (SI ■* x •; • ft. ci'ci u. o- -r ftfM * SfC «»o of . 0> 0> O fM CI Ul ft! • Z-I-* o clo- ft. fMlCI o o ft m m »ft, ■ft* o o Ul Ul X X z o -J £1 ft-O o> OP m in o pi -t 00 m m 43 Ul fM Ul ft X -* X o < •: *I V 0£ ft*' oc » < is < OC a. ft* a. oo « 00; » OC » 0C Ul fM Ul 00| K; Ul Ul : -*l X * < 001 « ►- ot »| at ft < »; *i O o. *- : a. ►- ■1 ui m< I- *3 1 •%4 Soi 0L 43' »iin ot » ui 0> K 4> Ull • I'O <'4J- sc » < 43 o- -r or Ul ft*l ft- *l ui •( 'X © * *>! oc ■! « ao! a. c • 0> CI uef ft. » •> »*• in o «.ff ci ISC O! • • 3 C> fM « as C* ft eoieo o fM \pi . 43 O IN or — fM fM Ul' ft ft ft ►"jf- 4j- -r >- in -* Ul 43 O fM •* OD 00 X — ■ fM fM X fM'CI .< - ► .» O ►' • OClCI fM CI ►- ©,43 < 43 O fM 0. — fM fM 3 eo|r- < fMif" O 4} *- 4J ! < CI X 43 gin* 3 Ci < 43 J 1 43 ft- ft »-•-> ««!C r -o Oi ► ft- 43 3 Ci *X.43 X * •li- * ft * * * • » • # o z -J m u -r lO *4 • • * • • * ' • » * • * * * # • • « * • * * * « * • « * # * • • • • * • • • * • * * • • • * « • * • • * * * * * * * * • • # » • » • • * # • • # • (M -- fM a o ft* ft* l>~ f> m 00 00 r- 43 *l C CI « ?! *: 43 • I ft. 5 « «-l 1 m j f- CO co ^ | ! to 1 ►- o i j m* r- mm <\J OC •«• i f» 00 1 0D ir> to m or o ir mm mm 1 OD -«■ ! . f- 00 CO IT i 1 r- CI (SJ -^ o -P f- * r» 03 00 m pp m m top ^-t © © o prf f» •# J P ►- «- 1 00 IA • m » * p to 0> »3 K K 0* .0 pp o- O 00 ^5 ir\ f> o UJ * p K m >» m -0 CO mm o pp a o 00 •c IT m •#■ u. >c 00 » 1 CO ••* in — < to V in r- oc © o © o o o o O 00 o o © © © o © O a 9 cr f- to • « » m ■x> p to m to 1*1 to ■♦ p « -o f~ CO IT in ir> 'IT \r\ a> li-> IT -^ ir\ m m mm m u\ m m z ■ p to •-J pp ^^ P<4 mM m pp mm to pp to M p (-4 top pp ■c to^ -0 .r rr PP ,p> ^ 'p> w IT pp mm 00 pp tr top © pp pp pp in pp O 0- o •* © o o © >0 ►- OD UJ UJ UJ UJ UJ CO UJ UJ mm UJ m UJ --t UJ UJ UJ m ID O • m p X X K X X p X X to X to X to X X X m *t UJ r-> m Cf ■— u-» p- ptt ■ m* o top top 00 H 03 o» pp 09 pp top pp p mm Z eo a m U- U. u. u. u. •0 LL u. IT u. CO S3 u. m u_ u. u. m u. 1/1 13 <6 k eo p p p , p ' m 00 m p »•* p (TV IT p mm p » » m to UJ pp p M • >• J> > ,>- K \> p V > to >- to *] V p > J> > p >- u -1 0- ir.| z p- 1* p. z z p Z a _J l/l r* p. to" m UJ •J •c M — CD ec OB lee z jaD a co in » ••> 00 00 CO eejm CD to ■a p ■i » m to ■« ►"! to en » © •> ; » UJ a (9 prf O u> O <« f^| O'Ci * e|»* !° n o! z UJ *o ►- 00 M — * !•* r- »-* CO pm CO »■ >o top PH top n ".: pp m4 1— 0> top *- m < p p » p -J p mi m _i _) < :_i p -J to mi P ml p _l P _l to -J to '_j p -1: U O o •— i < o '< •* < ^> •t ^ > _J « 00 «t CI * ©' « m m •X -t •I fl •t CD o •c or -H o «i m o mm or to ec to or ac h- < m » • u. p 'UJ •* UJ 0> UJ ir\ Z p mm to UJ P to- to to- to- p UJ o UJ UJ — 3 M K ■c o — 1 »- (T '►- r- ►- * ■ >- ■t UJ r-- IP r» K c\ K 0- ►- oo ►-! ►- f- t- o ^ r- ! >- •♦■ to- •*■ 00 in Ifl m m ri UJ •t UJ m UJ p UJ m X n *t u-v < © UJ * < IT >0 •t rj ui to UJ p UJ to UJ • pp m pp •■ H * e ■* m f- or ►- oc m oc p ie< p < IT ►- © h- mm or r- h- in <»: K IT or • ec p or p or » i- « IM m r<\ im IM IT » •e * O * o IM pH « o- IM M m • m 00 Cr H c a* O •* » «M rg m c» — im * » — — — im •* zt — cf UJ CD or co Z CO — .«• m O — «M uj r\j in X • » — >r * U. mt «M » IM in > » » at m p- d -i co Z rvj * o o » im >■ • ec m < o itTllM — o o ■ ► a. u. e>;r» »««m > «J • a k m < cd r- in u. * • ■*■ Z m IM IM m 0> r M » »o o m • m o- m im in IM in ** IT m m m m m o> in m m im •» <•> in in — o o — UJ . !S8 CO O iuj m ;z o !° — . -I o ,' '« ^ r -r o e» t- » CD CD — a co m o cc Cr; k. m — < o- m Iff o> O •■ * t~ m o 3 a n. «I (T 9> 00 CD ^ * od co o — •*■ uj ► » Z ec im IS OD CC M -. * _l ► » < r- — - cd ar o — * I- -J CO < CO «0 T — l»i C ► « ►- -» r» 3 0D >C < -- »^ OD'-f o>nj H-IMN r im m«4 — m o m < o < o 0* oo » r-.l » m ft in. o H or ► • » _ rJ, a. a o< .* k e> i- * 0C 0D < fMi UJ • mm UJ . z!» o o ►» 0> •J * « 0> «o O 0> < o z ,0> Oi • 3 O UJ - r « oo ►0>i u in; M - I- 09| < 9' z mi O «j 3 O < m~ co op oc r- »m » r- ON O — UJ « UJ » 5* z 0|IM ► r- im o ^lm "" «i » >- m.— < *- CM Z -ll« O M . I- « - 3 M-* < — Im uj m r m ►4 O o m 00 -r ► ,(»! ► ► in ► m M m -4 t~ IT m - MM C If 00 * Q -Mifn « M • UJ o •* r~ mM X -- O MM (Ti (M m in in m in 0B • ao m in OO m «m m — m4 MM mM MM 1 ft V.IM - MM M MM ft MM _l Of « J- — <•• o o o a o ' < ««- o r\j O -0 a UJ LL UJ UJ UJ MM Z o m UJ oo; UJ - > V V. V 1 ft <|- . >■ >; K K >• ft >■ K l_> Of of at of DC o — ! ► m ec ►- 1 z Z O a. tn ec z z < < •» < < UJ l-iin C* •« Ol •t •X * < in < « •t u. z z Z Z z j z — ie> * z oo ►- ►- oo Z r» Z MM ft> QC 1 C9 Z|— ► VI ft - > t/v U. 09 CD £0 OS' 00 1 MM — »m |0D M z z * 09 in las > z LL ft » • * ft -1 ..(M r- ' ► ON o O CI ► * > . mJ o UJ c O K o o o i < o 0> •* o r-l u O 00 O PJ o — u or UJ UJ K - UJ UJ UJ I M uj U . 'UJ M. ► UJ ft UJ ft* z Z » z Z z _l _l _i _i Z -l^i PMt z » 0> o r- h-' it- 00 0|»M <_> o F-l t- MM •C — •* M- P- MM ; MM MM UJ j * UJ M UJ UJ ft MM |MM m M MM m z Z * MM ft MM MM Z! 3 »- (C ►-,*- y- * Of CP of m Of "1 < o H •-'. r^ ^M nii •» PJ « CO K IM - ft < <0 K 3 in 3 i»i 3 ■* 3 O 3 c < ■o < o < « K -c 3 ! ^ a N 3 i«i 1- o >. 6> 3 ►- 3 00 ►- * < 4 00 < P- < -c V0 IT — mm • *>: 01 O m «SI o o • as • ^ ft m • IA l»l| g- t — go « *l «• — ro — CO fV ^ g> O CD mm » •' o ID • eo * l«l Ui (r mm X o ft «*> * X in o — rg mm (M CI mM mt ft U. 00 —* » • u. i lf\ ft ft mm ■» -o ft O OS — >• — 19 m tn V 00 in w a — Z rg in PC rg gj- o < CO — mm « •' mm •t ID z • ac O ■£ -C o z O «o °.< o m X h a f- » FV • r- » rgi ft- 00 O i/i \r — !«• 1/1 cc in rg \K * on rg f» a. . . CO » mm — ' ft; ftg » — . •* -* ft' «■. » -J ► •J mm ftft ^ •ft CO ~ o- — -4- OB' fr- < -0 < * a o- <*i]m c O r- o m a rg • ee .«■<► — * Ui _• MVI » M » ID ft* uj rg ui -r ui rg. «» < 'f- r- f- ••• X X f-' X ft X ft x J ftg Z 1 » M O ml>- ftft — 0B ►• o» — in' 'Z ■*• Z (VJ u. o mi* u. IT z u. r- u. - u. rg O 1/1 CD — er •— |z (»> • — «n IT . ft K ftftl » rg ft * • IM. Z Ui «* 'f- f- ftft •k ! »r- • a * >■ » - • > ir\ ft- : ft > • J» ft >• •< £ O z z O a O • o »■ » a. o o- gf n »: Z| o 0C vO a. co ac tn. ■c •* •t » ID |UI *' Ui ■m- •-< « o> r«i ■«• ■ * •* « ID « r» « -• « rg i- ID m h- — m Z !z m z iU M z mi m IT z gy f— z. Z rg x * Zrg i/i a. *g 1/1 l/> ^.i o .IS r- o; s ro ^- * « » ,MM IT VII Mft ft — » *• ►! id ■v z ci z w — ' •— < ^ — « 03 r- 0> rj IE • Z: - eo rg eo eo •9 -r, o u. c O CO o O _J , -J o _l 111. 0- p» rg -r ft) K> Oi _i .p» . -. . o Ul in ai ,0 T u ID « ;< ITli < o o — in l/N O CO *->! "5 o rg o ••*■ O rg. Ui mi z a. X p X ft ! »r-. z Z fl ID » • ft III * J Ui • id: » 19 ft- ,_l «0 _l »— mi ft. .: 31 — s z ■C fl CD z ft -j' z — z o Z rgl o u. ID * ID ID 1 H * M 1 1 -c Oh- (M r-i o o- ai < IT. C r- Oft* O O -I ' z ft eo in S3 • z zm *-> if- » kM ^4 ffi IT »« * ei 2 * ft> rg ft- •* -Ml * « > * • < > a a f- *" !-; « IT !3«m: » r- rg ri » a K- © • r- ft o •CI » -* ui Z ;ft> IT f- Z X !x ov z r- a) im o mm m| IT. u CO f- X * a rg o + O- O M f- ft* ;z - z — ID 'id r», oi • ' m rg *— ft ftj-ft ftft c Z| m .ft. Ul ► ■M' ft i~r,1 ft- » 3 CD ■£> ID (M ID CO • o i • • K l - |U •• ►- it rg in •f- ft Ui ft ft f- ** f- o f-|in CD »» Ix 1" z ft O rg u c- d!<« "— iM < r- rg rg 1* *i X 1 U) OD Ul r- < ce < c < a> _~ >■ IT ID IT Ui >- — o- Ift-r- «r> \tc a X — f< m . m\ ft HB.IO X 49 ID •- CI X rg SXifl X - X - a oc » If- • f- a f- » If- JO ^ » Ui mm o o e> f- 1 f- r- ae cr O ft O ft Bo f- f- r- .< 0> <« «"\ f- r- < •* < ft o'~ 'z m- f- f- » t\ « * 1- CO f- 0? — Z « f- g9 f- O z o- f- (M f- -IN 3 r- f-l ^ r- «« « 3r- 3 gs 309 < ID ft* 1/1 r-l 1/1 C ID d 1/1 c •1/1 ! r- h- •!• 1*1 00 pp — IT CB'fM p 4 O <0 -«'•*> . r- CI r- r>- od:oo ! «\j p ■ • >! » 00 p4 » n* in •-< p * in O •o r- oc oe (Si p p p p p 00 o in o inia to -f i o 4) — i'i«i 00 m «JD r- OOlCD P-J 00 c*i a «c ^ m p c« •to — m ei o * UJ p 00 o M X mm 00 * -* -1 x r- • «» — m to* d * h- 00 00 >> tr — f- o » u. w p p » p U. fto •to p • CM • X 0«N|."1 P to -• pn m cm >■ •— ►- -r o *" ,>■ m 15 00 — m a IX ^ P. CD OD 'oc o> z r- w • MM mm < •> •to • p p p < r^ to- p pp o — o o z >- o -o o o r- oo oc 00 00 nO •/) ►- in _ *. to. ■4 l* Z •to p p p p p p h- to p pp u. o to. toy -J to* to* eg o> in • CM a B '•* CQ O f*- r** O'*^ 00 ,•« O « o O 00 V 1*1 UJ UJ to* *> UJ O P- 00 00 00 — in pp •>• uj in at * X X K Q X p p p p p k r- p- p X p < o to. ci to* ' to* UJ to* o in m,-< 00 1 to. P pp 00 -r- z cm u. u. ,z Z u. ■O fl 0;»>1 r> z in Z in u. -o — I»l p CO p tPP o to •e p- oeloo 00 ■pp m — r» • o UJ cc » t- >■ l«n V . p p- >* p p p' p p p- p»- p- to p > » k o • r- z Of ,_J or iO z -J at o < UJ o ► to- p u. at O in z a r- !m OD tto* Z < 0D -JJ" m (MOO 0" z to. P z p« in eo -r z u. — o o • oo '»- • !-> O z p in esj o»:»\i m o 1-""' o J *■! pi r- mN z - UJ P ► o _i z C" MM z -j to. t- z m m 0300 in -J 00 -I p -J 3 "*i Z CM _. o o o UJ O f- P 5 < UJ X U IT (M M«m in UJ 00 < fSJ UJ » CM Ci* UJ z p- CI CO PM 00 .(_) to* z eo UJ kri •o r>- h- oo CO CO OC ■ ■z. r- 00 O M p-lfl oc < K • '< -1 p -J a «« -J p p p! p p < p ec _J k » to- p m f\j k esj m oo !p> 0> r- « •- • r UJ -o 1 •* •• » •« c o p- r\j if> rjx — * 00 < — * < p r- « o in ►- oo < * >- CJ f- ►- pB m ' — ! Ol <• >o * * » ft m 0" . • • ft n t— i ■* ■fl • ft ■ O .UJ 00 rs ir\ o *^ o- -^ IT o !« r- 'X ; •0 ' ft »» rv \t\ e X Bft — a X m mm .in * '» •e »• ►» OD — o o o — o> o O > mm mm ml > o mm' »n 00 ^ « r- r- CI ac mm ftft 0> oc 0> U. l«9 r- oo mm mb mm M ft • • ft « o O — ■* •1 oo » mm BB » » O o o O 0* n IM J- z UJ UJ O » z ft > o ■■0 o in -/ * ft • *N1P o- »- X X » mm — m ec '••> 00 < m in IT m .0 l«- r^ ft- CD ft- ft- m K 00 CD < in i«B m •> ft- ftft° mm —j .— , » * » B ft li- u. ft- 4- ft OD Z — 4 I • BC » ►- ^ mm bm .- « mm IV l«1 _l ft ft «- » -J BB «ft ■Ifl BB ftj ftft o o o Q < « > > o » < O 03. a M o r- Z UJ UJ UJ uj; >■' >■ >■ ft » * » ft • ft > -r B •ft K ft- BM| >- > » > » |0 u a at. oc OC •* bb o oo a a o 0C ft o ft z; z mil ec 0> ac o Z ■< < 4> r« f- ►» z z z z » z mm i- 1 ft- »l Z I 1 " 1 z t~ l« a. bm .— tea. ft » » ft 19, o c ft- esi o t~ to l/l _!■ «« ► ' — » 1 . w cc cc CO Hie* o O ec ft* ft- ft- CO ft — z z o ■ « IO - m m 1 • ,«/> • (B »'►- ft - ft- r* «Ja> ft- n BM. ft m Of o w • ■c !u UJ OC » a - cc * o •« h- r^ ►- X CD X X °°J U M X a ft- K K u ^« uj m c i-j ■c Ih- UJ OJ UJ 4- ai it ft-! ft * » » UJ ft UJ UJ ftw ft4 UJ o z Z k r» ►- >0 i- o 00 in •< ft; * » ft in: H » ft •i UJ UJ ft-: h- » X — ©» UJ tB «SJ k * < cc UJ • UJ » UJ ft <'•* a * co u! tvj u u •ft •jir- u o ►- r^ K ft- ft- OS Z : «o Ujj » ft a ft [ft- ►- cc * oc » ac ft ft-'CM m in m* <. -0 < < mi < ft < .< O'K ►- » _jio m •- (M ;° K < ff < o < c 3'C1 0> .a- 00 K CO ft- K in 3* ft- * ft- ft- nj — o •« ftftlfM » 3 m IfM l^ «« ; r- < fSI U. -- OD < ■C l«ft X c K ft- Ui bft a 0. t~ X * • * * * « ♦ V. • * • • * ♦ » # # » • • * o z -J l_> o ft- ,o eg f- 0. O C5 X z Z o o u < oc M t_> UJ X. K K 3 ac i/i to to to to l/l to to * • * # • » ft * • * * • ft ft * • * * # ft ft » * » * # « ft * * « • » »• ft * • * • • ft ft ♦ * * » * ft ft » * * * • ft ft • * * • * ft ft > m0> i *H i *! 'ft 'ft 'ft ft .'• :• Irvj 127 U_ C U UJ z z uj 12 OT — UJ -J Cl o c r*. oo c in •* 2 u- ec ci m cr"r- — i ci •ci m o ** °° » O •- m CI m — m ■* oe oo UJ M-f X » » u.r- o • cm * SVi • • oe * cm -J •* UJ ■* CO OD — . CI © » in r» UJ r- X CI U. 00 • r- V ci a » < in Cm •as » • CM O h- UJ ci !? * 'O en ► CI m ci ec • O! o o 1 UJ ci z - O 00 c- cv _l IM. l_) UV «- cm! U. > — a — « » z a — o CO -■ o e- UJ o z — o « »~ r- -j o l_> in •- o O »ci o , o »* UJ » -J «f Z CD UJ CO O CI co r»: ~ «* > >« or or .. ■* « o z z o — — ,0 CD CD •» ► I »o ■o I e cm UJ UJ «M XX- CJ CO, O (M r» m o c> * in M. • • • \v CO Oi* * <\i » - »'m (M ,■•■« o cotisj r- r~:* * fM. • • •io» 0> IT I* -»■ cm; • « » © r- f- — -o CM* <«• fsi 1 » •> • CM e m m ws uj o X fM ci ■* m C» r>- 0» ci * m m U. 4- • O SV IMi • • «» 0£ » CI >0 CD < W3 in Z o> ci ■♦•m CO • • in o er> UJ -< ;z • lo f- — 00 «M * CI ci •# in -OKI «M *!fM ci *m o > 00 or — « 0" z • I- CI OB — »0» -J • < * «- e> »- » Z 00 Z'wr OKI -J O < CI ► CI (J . »«'0 o o- bO i— c UJ 3 i i- « • D — od ce co — \r\ sc or — • »- ••« »- o a < — m < o (/I O"- ci ►- m oo < in c r ■-« ci o. •■ » ►- m o < -- ci < a < cm m X — m O Ci m ir- » » 3 in o < co ci i— cm m lo> » » -.. gj Ci 'o in oi < uv o» > • • Cl! O m — CM < » z m O o ►- f- 3 ► < c M O c — o «» ja « ■* r- r-' ex -* Z »l UJ • UJ -1 r- CI z o: or -* ►» c> < CM ui r^ O. -< ► gjl «ci or » u co UJ Cl; — — 1 K -OI t- » uj ►! « oo X CM X en 0> « <-" O CI o Z r- ci * m o -, « ► • »- »ICI N> 3 OO CI O < ci ci * m or uj o It- 00 Z CM < 00 H » O «M « • — O * CI ♦ • * o 0» — CM u z UJ UJ »- ►-' a a a < 0. o. X X _i X Z or > > o CM o u C- p- «- ►- r- K •x X ■ ■ » * « * * • * • » * • * • « • * • « # « * * • • « • « * • « * • « • • • « • • * * • * « • * * * » • * * • • * » * * • • » • • * * « » • • « • • • • « • » • * * • m in O CM 1 * »r o> o CM rMfl p4 CM 1° o —4 "■' 00 00 i -0 CM i CM CO : i ~ 28 o m O UJ X ec z ec Z UJ un »— -J LU «» o O O z Z UJ ** UJ z «/> a o ^*- UJ •-« h- U» —1 < ai < o ec » -J -J u. a < m z z _J «a a < UJ <*1 X -- UJ X ^« o K UJ » UJ 3 » — i C 00 ec o O » U"i ■— ■— r- > tf> ot h- » QC •■ ►- >c U,' — a. u. z r a o. UJ a. ec X c UJ o u_ 1— f*sj ^> ►- X * « * >» * * • * o z -J o cy c .* r- /* HTEXP */ 29 AGGREGATE LENGTH TABLE STATEMENT NO IDENTIFIER 23 ARRAY 645 ARYA 645 ARYB 18 RKCOL 18 BKROW 268 CLS 647 OIVSW 23 EXP 3 INSTNG 647 MISW ?68 NH 145 HP 3 OLSTNG 647 PMSW 2 ST&NG ■*2 TABLES 5 93 TFMP 145 Tl nc 89 TMP 145 TVP 20 WGHT LENGTH IN BYTES 6400 4000 4000 60 60 19 4 6 40C 400 4 6 120 400 4 400 40 320 20 20 20 10 130 LIST OF REFERENCES (1) Baer, J.L., "Graph Models of Computations in Computer Science," Ph.D. Dissertation, Department of Computer Science, Univ- ersity of California, Los Angeles, California, Report No. 68-46 ( 1968). (2) Baer, J.L., and Bovet, D.P., "Compilation of Arithmetic Expres- sions for Parallel Computations," Proc. of IFIP Congress , pp. 340-346 ( 1968). (3) Barnes, G.H., Brown, R.M., Kato, M., Kuck, D.J., Slotnick, D.L., and Stokes, R.A., "The I Iliac IV Computer," IEEE Trans . on Computers , vol. C-17, pp. 746-757 (1968). (4) Bingham, H.W., Riegel, W.E., and Fisher, D.A., Control Mechanisms for Parallelism in Programs , Burroughs Corporation, Paoli, Pennsylvania, Report No. ECOM-02463-7 (1968). (5) Breuer, M.A., "Generation of Optimal Code for Expressions via Factorization," Comm. ACM , Vol. 12, pp. 333-340 (1969). (6) Conway, R.W., Maxwell, W.L., and Mi I I er,. L.W. , Theory of Sched- ul ing , Addison-Wes ley Co., New York (1967). (7) Dennis, J.B., "Modular, Asynchronous Control Structures for a High Performance Processor," ACM Record of the Project MAC Conference on Concurrent Systems and Parallel Computation , pp. 55-80 (1970). (8) Flynn, M.J., "Very High-Speed Computing Systems," Proceed i ngs of the IEEE , Vol. 54, pp. 1901-1909 (1966). (9) Han, J.C., "Tree Height Reduction for Parallel Processing of Blocks of FORTRAN Assignment Statements," Report No. UIUCDCS-R-72-493, Department of Computer Science, University of I I linois, I I linois ( 1972). (10) Hao, Nguen Pa, "Algorithms for Planning Parallel Operation of Installations of a Computing System," U.S.S.R. Comput. Math . and Math. Phys ., Vol. 6, pp. 187-202 (1966). (11) Hellerman, H., "Some Principles of Time-Sharing Scheduler Strategies," IBM Systems Journal , Vol. 8, pp. 94-117 (1969). 31 (12) Hu, T.C., "Parallel Sequencing and Assembly Line Problems," Operations Research , Vol. 9, pp. 841-848 (1961). (13) Johnson, S.M., "Optimal Two- and Three-Stage Production Schedules with Set-Up Time Included," ed . by J.F. Muth , et al., I ndus- tria I Schedu I i ng , Prentice-Hall, Inc., New Jersey, pp. 13-20 (1963). (14) Johnson, S.M., and Arrow, K.J., "A Feasibility Algorithm for One-Way Substitution in Process Analysis," ed. by K.J. Arrow, et a I . , Studies in Linear and Non-Linear Programming , Stanford Univ. Press, California, pp. 198-202 (1958). (15) Johnson, S.M., and Dantzig, G., "A Production Smoothing Problem," ed. by H.A. Antosiewicz, Proc. of the Second Symp. in Linear Programming , USAF, Vol. I, pp. 151-176 (1955). (16) Karp, R.M. and Miller, R.E., "Properties of a Model for Parallel Computations," SIAM Journal of Appl. Math ., Vol. 14, pp. 1390-141 I ( 1966). (17) Karp, R.M. and Miller, R.E., "Parallel Program Schemata," J. of Computer and System Sciences , Vol. 3, pp. 147-195 (1969). (18) Kauffman, A., and Desbazeille, G., The Critical Path Method , Gordon and Breach, New York (1969). (19) Kraska, P.W. , "Array Storage Allocation," Department of Computer Science, University of Illinois, Illinois, Report No. 344 ( 1969). (20) Kuck, D.J., "Illiac-IV Software and Application Programming," IEEE Trans, on Computers , Vol. C- 17, pp. 758-770 (1968). (21) Kuck, D.J., Muraoka, Y, and Chen, S.C., "On the Number of Opera- tions Simultaneously Executable in FORTRAN-Like Programs and Their Resulting Speed-up," submitted for publication (1972). (22) Lehman, M. , "A Survey of Problems and Preliminary Results Concerning Parallel Processing and Parallel Processors," Proc. of the IEEE , Vol. 54, pp. 1889-1900 (1966). (23) Martin, D.F. and Estrin, G., "Models of Computational Systems," IEEE Trans ., Vol. EC- 16, pp. 70-79 (1967). (24) Maruyama, Kiyoshi, "Parallel Methods and Bounds of Evaluating Polynomials," Department of Computer Science, University of Illinois, Illinois, Report No. 437 (March, 1971). 132 (25) Morrison, P. and Morrison, Emily, Charles Babbage and His Calcu- lating Engines , Dover Pub I i cations, Inc. , New York ( 196 I ) . (26) Munro, I., and Paterson, M., "Optimal Algorithms for Parallel Polynomial Evaluation," IBM Research , RC 3497 (August, 1971). (27) Muntz, R.R., Coffman, E.G., Jr., "Optimal Preemptive Scheduling on Two-Processor Systems," IEEE Trans, on Computers , Vol. C-13, pp. 1014-1026 ( 1969). (28) Muraoka, Y. , "Parallelism Exposure and Exploitation in Programs," Ph.D. Dissertation, Department of Computer Science, Univer- sity of Illinois, Illinois, Report No. 424 (1971). (29) Muraoka, Y. and Kuck, D.J., "On the Time Required for a Sequence of Matrix Products," submitted for publication (1972). (30) Pan, V. Ya., "Methods of Computing Values of Polynomials," Russian Mathematical Surveys (English), Vol. 21, pp. 105-136 (31) Ramamoorthy, C.V., Chanely, K.M. and Gonzalez, M.J., Jr., "Optimal Scheduling Strategies in a Multiprocessor System," IEEE Trans, on Computers , Vol. C-21, pp. 137-146 (1972). (32) Rodriguez, J.E., "A Graph Model for Parajlel Computation," Ph.D. Dissertation, Department of Electrical Engineering, M. I .T. Report MAC-TR-64 (1969). (33) Russel, E.C., Automatic Program Analyses , University of California, Los Angeles, California, Report No. 69-72 (1969). (34) Schwartz, E.S., "An Automatic Sequencing Procedure with Application to Parallel Programming," J ACM , Vol. 8, pp. 513-537 (1961). (35) Stone, H.S., "One-Pass Compilation of Arithmetic Expressions for a Parallel Processor," Comm. ACM , Vol. 10, pp. 220-223 (1967). (36) Thompson, R.M. and Wilkinson, J. A., "The D825 Automatic Operating and Scheduling Problem," ed. by S. Rosen, Programming Systems and Languages , McGraw-Hill, New York, pp. 647-660 (1967). (37) Tomasulo, R.M., "An Efficient Algorithm for Exploiting Multiple Arithmetic Units," IBM J. of R. and P ., Vol. II, pp. 25-33 (1967). 33 VITA The author, Paul William Kraska was born in Rochester, New York on 21 April, 1938. He received his AB in Mathematics from the University of Rochester in 1959. He was employed by the Burroughs Corporation, Paoli, Pennsylvania from 1959 to June, 1970, with an education leave of absence from September, 1964, to June, 1965 in order to attend the Moore School of Electrical Engineering of the University of Pennsylvania. He received his MSE in Computer, Information Science from the University of Pennsyl vania in 1969. In June, 1968 he began working with the Illiac-IV pro- ject at the University of Illinois, and attended the University of Illinois as a candidate for the Ph.D. degree. He has been a graduate research assistant since June 1970 in the Department of Computer Science. He is a member of the Association of Computing Machinery and of the honorary fraternity, Sigma Xi . )GRAPHIC DATA 1. Report No. UIUCDCS-R-72-518 j and Subtitle rallelism Exploitation and Scheduling 3. Recipient's Accession No. 5. Report Date May 2k, 1972 6. ior(s) ill W. Kraska 8. Performing Organization Rept. No - UIUCDCS-R-72-518 orming Organization Name and Address iversity of Illinois at Urbana-Champaign partment of Computer Science bana, Illinois 61801 10. Project/Task/Work Unit No. 11. Contract /Grant No. US NSF GJ 27^6 insoring Organization Name and Address tional Science Foundation shington, D.C. 13. Type of Report & Period Covered Doctoral - 1972 14. .plementary Notes tracts parsing algorithms are developed such that syntactic tree -height is minimized, educed, with respect to application of the associative, commutative, and distri- ve (hut not factoring) laws of arithmetic, on arithmetic expressions composed of -formed sequences of the symbols add, subtract, multiply, divide, scaler identi- , and assignment, where it is assumed that a unique weight (i.e., program execu- ting) may be associated with each symbol. A parsing algorithm is also developed : that syntactic tree-height is minimized, with respect to application of the ::iative law, on expressions composed of a conformable sequence of matrix products, 2 the matrices are not necessarily square, and such that the overall number of iter operations is minimized if tree-height is not affected. A new non-preemptive iluling algorithm of a weighted-node acyclic dependency graph, having n nodes, on a \im of k equipotent machines is developed such that all nodes are processed in the 1; amount of time. The assignmerrt process requires 0(n 2 ) computer operations. A nigorithm to determine k* = LUB(k), such that the graph may be processed in the . Leal time, is also presented. Words and Document Analysis. 17a. Descriptors : tactic tree-height fghted operators *rix Product Sequence cipment Parallel Processors ( eduling, non-preemptive eghted-node Dependency Graph lntifiers/Open-Ended Terms Parsing Algorithms Arithmetic Expressions Critical Path Method Critical Time C'.ATI Field/Group viability Statement RELEASE UNLIMITED N S-35 (tO-70) 19.. Security Class (This Report) iSIFIED UNCLASSIFIgr cunty Class (Tr 20. Security Class (This Page UNCLASSIFIED 21. No. of Pages 133 22. Price USCOMM-DC 40329-P7 1 SEP 2 1 tare BB H to ,0^ ■ SSSSS5S5 ■ Ml I i * I * 1 ■ 1* , ' I I h ■ ■1 ■ ■ i lis ■ ■ bbbhkj mnu ■■■■ HmxI Kit air $ 7 ■' ■ t I *'M+\ I mffl m m EH B I ■ ■ &«■■■■ H SHI H