on LIBRARY OF THE UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAICN no. 2>2>l-33& cop. 2 The person charging this material is re- sponsible for its return to the library from which it was withdrawn on or before the Latest Date stamped below. Theft, mutilation, and underlining of books are reasons for disciplinary action and may result in dismissal from the University. UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN L161 Digitized by the Internet Archive in 2013 http://archive.org/details/automaticintegra336mcca 4-33 1 Report No. 336 /fidgety AN AUTOMATIC INTEGRATOR FOR ORDINARY DIFFERENTIAL EQUATIONS FOR ILLIAC IV by Thomas McCarthy- June 1, I969 THE LIBRARY OF THE j69 Report No. 336 AN AUTOMATIC INTEGRATOR FOR ORDINARY DIFFERENTIAL EQUATIONS FOR ILLIAC IV by Thomas Edmund McCarthy- June 1, I969 Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 6l801 * This work was supported in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center under Contract No. US AF 30(602)klM and submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science, June, I969. 11 ABSTRACT This report describes the design and implementation of an automatic integrator for systems of ordinary differential equations for use on the ILLIAC IV computer system. This integrator accepts a simple statement of the differential equations and their initial values and produces an ILLIAC IV Assembly Language code which can be used to solve them. The intent is to produce a code which will be efficient on a large parallel computer. Ill ACKNOWLEDGMENT The author would like to thank Professor D. J. Kuck for his advice and council on this thesis. Special words of appreciation should be given to Mr. Jacques LaFrance and Mr. Nelson Machado for their assistance in using the Translator Writing System. The author would also like to express his appreciation for the financial support of ILLIAC IV and the Department of Computer Science for this work. Finally, the author would also like to express his appre- ciation to Mrs. Kay Flessner for her speedy typing of this thesis. IV TABLE OF CONTENTS Page 1. INTRODUCTION 1 2. USE OF THE INTEGRATOR SYSTEM 6 3. IMPLEMENTATION 12 3*1 An Overview of the System. 12 3«1«1 The Recognizer 12 3«1«2 The Semantic Routines 12 3*2 Methods for Improving Parallelism 13 3*2.1 Possible Instructions and Operands Ik 3« 2. 2 Doing Similar Instructions in Parallel 14 3-2.3 Term by Term Analysis 16 3«2.U Inserting Dummy Instructions 17 3-2.5 Sharing Terms 18 3»3 The Optimization Routine . 19 3*^- Methods of Integration 28 k. EVALUATION AND SUGGESTIONS FOR FURTHER DEVELOPMENT OF THE INTEGRATOR 31 APPENDIX A. SAMPLE INPUT PROBLEM FOR THE INTEGRATOR -TRANSLATOR SYSTEM (OUTPUT LISTING) 36 B. ILLIAC IV ASSEMBLY LANGUAGE OUTPUT PRODUCED BY THE INTEGRATOR FOR THIS SAMPLE PROBLEM 1+2 C. BNF INPUT SPECIFICATIONS FOR THE INTEGRATOR !+9 LIST OF REFERENCES 51 V LIST OF FIGURES Page Figure 1. Schematic Diagram of integrator System 5 2. Input Language Description in BWF 7 3- Sample Input 9 k. Optimization Routine 25 5. Comparing Compstk and Codestk BLOCK A in PREVIOUS FLOWCHART 26 6. Routine to Compare COMPSTK with Another Stack 27 1. INTRODUCTION Standard methods of integrating ordinary differential equations consist essentially of incrementing the independent variable by some small amount and then, by using past values and derivatives of the de- pendent variable, predicting the new value of that dependent variable. Thus the integration proceeds from the given initial value of the de- pendent variable to some user specified stopping point, building on itself as it goes. The same method is applied to systems of ordinary differential equations; but, of course, one cannot integrate one equa- tion completely and then the next, since in general, the derivative of each dependent variable will depend on the other dependent variables of the system. Each equation must be integrated at a given value of the independent variable before going to the next step. For this reason, to perform an integration of a system on a large parallel computer like ILLIAC IV, it is desirable to perform the integration on all of the equations simultaneously. Difficulties arise when the equations are not identical in form; and, of course, in general they are not. Integration methods consist of one or more func- tion evaluations; that is, substituting the present value of the depen- dent variable into the equation to find its derivative at the given point. This is the part that is most difficult to do simultaneously with dissimilar equations, and it is the part which usually accounts for much of the computation time, even on a serial machine. The following example will illustrate some of the problems to be solved: y l = a y i y 2 + b y i + C y l y l/ y 2 y 2 = d y 2 + e y l y 2 + f y i y 2 The logical way to proceed would seem to be to try to minimize the number of nonparallel arithmetic operations which must be done. For an example of how this might be done, consider a two processing element machine with each processor working on a single equation. Assume that by allocating storage properly and by using some indexing (to be explained later), it is possible to access constants and different dependent vari- ables simultaneously. The operation sequence would then be as follows (parentheses indicate memory access, I indicates processor idle, arith- metic operations indicated by their symbols). PI (a) xx (b) x I + (c) x x - + P2 (d) x I (e) x x + (f) x x I + If the terms can be rearranged so that the first term of the first equation and the second term of the second equation can be done at the same time, some more savings can be gained. PI (a) x x (b) x + (c) x x - + P2 (e) x x (d) x + (f) x x i + Thus more operations are done in parallel and the processors are idle fewer times. With larger systems of equations there are more terms that can be done in parallel and more comparisons to be made. With rearrangement, unmatched terms could also be done partly in parallel. With the larger number of terms in big systems, it is reasonable to expect that the relative savings will be greater than those in this example. Another problem would arise if one of the equations had sig- nificantly more terms than the other. Some method of sharing the work between processors should be used. As the system of equations gets larger in number of terms and number of equations, all of these problems become more and more complex. The translator-integrator program described in this thesis tries to do these types of optimizations as well as compiling the Assembly code to evaluate the input equations. The translator-integrator (which will be referred to as the integrator in the following pages - the section designated should be clear from the context) has associated with it some skeleton forms of ILLIAC IV Assembly Language programs to solve ordinary differential equations by different numerical methods. These, along with the compiled code and generated storage pseudo-ops allows the program to produce a completed Assembly code to solve the equations. The translator-integrator system itself is an Algol program which would be executed on the Burroughs 65OO in the ILLIAC IV system. A schematic of the integrator is shown in Figure 1. Section 2 describes the integrator from the user's point of view. Section 3 is a discussion of the implementation techniques and numerical methods used. Section h discusses the capabilities of this system and the further work which might be done. Differential Equations and other data 1 Recognition Optimization Assembly Code to solve Equations SKELETON programs for numerical methods Figure 1. Schematic Diagram of Integrator System 2. USE OF THE INTEGRATOR SYSTEM The primary object of any automatic integration system is to provide the user with a device to solve his equations with a minimum of effort and analysis. From the user's point of view, the integrator input is as defined in Backus Normal Form (BNF) in Figure 2. Most of this should be self-explanatory. Dependent variables are expressed as Y(N); their derivatives as Z(N), and the independent variable as T. Each differential equation is written as z(n) = f n (y(i), y(2), . . . , t) The functions can be almost any Algol expression (an obvious exception is an if clause) as defined by the BNF and is written in B65OO Algol notation. It should be noted that constants whose values can be assigned later are allowed in these expressions. The initial values of the dependent variables and the starting and final values of the inde- pendent variable are then given. These are real numbers expressed in Algol notation. All numbers except indexes are treated as 6U-bit floating point numbers by the integrator. It is not necessary to ter- minate whole numbers with a decimal point to insure that they are floating point. TO and TF are the starting and ending values of T, the independent variable. The user can also either specify a stepsize to be used or specify a desired maximum error which the integrator will try to stay : := differential equations> END : := Z( ) = ; Z( )= ; : := Z0( ) = ; Z0( ) = ; : := = ; = ; : := ; | ; : := TO = | TF = | STEP = ERROR = : := | | : := + < - : := | : := x | / : := | * : := | | Y( ) T ( ) () : := SIN I COS I EXP I LOG Figure 2. Input Language Description in BNF within. Also any constants which were used in the equations should he defined. The program input is terminated with an END. In general the input is free form with semicolons separating statements. In fact, a missing semicolon will not always cause an error, but it is suggested that the user try to make the input no more diffi- cult to recognize than necessary. Identifiers are limited to six characters; longer ones are truncated on the right. Numbers are limited to eighteen characters including sign and exponent. All blanks are ignored. From this minimal amount of information an Assembly Language Code will be produced and placed as a file on disk. If some information, such as an initial value or constant declaration is omitted, a code will still be produced with zeros in the appropriate locations. In Figure 3, a sample program for the example discussed in the introduction is shown as it would be inputted to the translator. A larger example, complete with output, is given in the Appendix. The integrator also has some control words which can be used if desired. These are entered at the beginning of the input. The string of control words separated from each other by at least one space must be preceded by a '#' and a space. The control words are as follows: TEST - Do not generate Assembly Language -Code. This control word is used when the user is only interested in knowing what efficiency can be achieved with his equations. # RUNGE ERRORCHECK ; Z(l)= AxY(l)xY(2) + BxY(l) + CxY(l)xY(l)/Y(2) ; Z(2)= DxY(2)+ExY(l)xY(2) + FxY(l)xY(2) ; ZO(l)= ; Z0(2)= 2.3 ; A = 3.^56 ; B = ; C = 1 ; D = 3-^5677U836^5@-09 ; E = I+.9 ; F = .33 ', TO = 0.0 ; TF = 1.0 ; ERROR = .0001 ; END Figure 3- Sample Input 10 RUNGE - Use a Fourth-Order Runge Kutta method to integrate the equations. This is the default option. CORPRE - Use a Corrector-Predictor method to integrate the equations. This is a more complicated method but could be used on more difficult equations. ERRORCHECK - Estimate truncation error and adjust stepsize accordingly. This control word should be used when an error bound is to be put on the results as previously described. CONSTMAP - This prints out a chart showing the location of constants in PE memory, so that they can be changed without reexecuting the integrator. FUNMODE - If the terms in the differential equations are predominantly made up of parenthesized expressions or function calls (i.e., SIN, COS, etc.) or both, then this option should be turned on to improve the efficiency of the code generated. PATTERN - The integrator makes some of its decisions based on the supposition that the terms in the first few equations are representative types for the whole system. If this is not so for a particular system, the efficiency may be improved by using this control word. When this option is used, the first equation should contain terms typical of the system and in about the same ratio to the other terms in the equation as they occur in the system. No code will be generated for this equation, but it will aid the integrator in allocating the work to be done. The Assembly Language Program which is produced also has some comments in it so that with the information provided by the CONSTMAP option of the integrator, the user should be able to make small changes, such as the values of constants or initial values, without rerunning the program if desired. An alternative to this would be to 11 read in data at execution time. When the input conventions for ILLIAC IV are defined, the integrator can easily be modified to load the data into the correct place. 12 3 • IMPLEMENTATION 3.1 An Overview of the System 3.1.1 The Recognizer The recognizing part of the integrator program was written using the ILLIAC IV Translator Writing System (TWS). The input to one of the TWS programs is a Backus Normal Form (BNF) description of the in- tegrator input language syntax. This program, which runs on the Burroughs 5500, generates an Algol scanner and parser which can be used to recog- nize programs written in the language defined by the input BNF. An action number can be associated with each production to execute a designated semantic routine. A TWS program is also provided to con- catenate these semantic routines to the parser. While this program also provides several built-in routines designed to help write compilers, few of these were actually used due to the specialized nature of the integrator language. All of the analysis, reduction and code generation for the integrator is done within the semantic routines. The actual input to the TWS programs is somewhat different than that given in Figure 2. Action numbers are required, and the BNF description is slightly different to aid the syntactic analysis. 3.1.2 The Semantic Routines Before the parsing of the input begins, a short, fast prepass is made in the initial semantic routine to count the number of equations 13 and number of terms (by counting arithmetic operators). Since the input to the integrator may be quite large, it is necessary to do optimization and the related compilation and storage allocation as the program proceeds, so as to limit the amount of B65OO storage needed. The information gleaned from this prepass is needed to do this dynamic analysis. As the input differential equations are parsed, an inter- mediate code for evaluating them is generated. The optimization is then performed on a term by term basis on this intermediate code. Storage assignment begins at the first processing element and then proceeds according to the types of terms and equations involved in the problem. Other types of statements such as initial values and constant declarations are recognized, and the information is saved. When the entire input has been recognized, all the necessary information for Assembly code generation has been formed. At this point the appropriate skeleton ILLIAC IV integration program is selected, and the storage declarations and the code for evaluating the equations are inserted in the proper places to form a completed Assembly Language Code. This is left as a disk file which can be assembled to machine code. 3.2 Methods for Improving Parallelism 3.2.1 Possible Instructions and Operands The instructions used to evaluate an equation are a small and standard set; for example, load register, add to register, multiply, 1U store, etc. Thus the operands that will be used also constitute a fairly limited set. In general, they can be divided into the following classifications : 1. CONSTANTS - These values do not change during the integration. They may have been declared either as literals or by constant declaration statements. 2. DEPENDENT VARIABLE - These have an index value associated with them declared in input language as Y (). 3- FUNCTION CALLS - These include any of the explicitly allowed functions such as SIN and COS. k. INDEPENDENT VARIABLE - This is designated by T in the input language . 5- REGISTER NAME - These occur in register to register instructions used for temporary storage of data. 6. NO OPERAND - This is for such instructions as change the sign of the A register (CHSA). 3.2.2 Doing Similar Instructions in Parallel First consider the same instruction operating on different operands which fall into the same operand class. Methods must be developed for doing these in parallel. When the operand is a constant, as many consecutive rows of PE memory as needed are blocked off. When the Assembly code is gen- erated, instructions which access constants are given operand fields such that they point to consecutive rows in this array. Each processor can have a different value in its memory position in the constant row. 15 It is a simple matter for the integrator to take the values defined as literals or by constant declarations and insert them in the correct row and column so they are accessed by the correct instruction. The values of the dependent variables are not stored in the same way as constants since they must be updated constantly throughout the execution of the integration to be performed. The values of the independent variables that are needed are kept in each PE in a column. The way to access different Y's in parallel is to associate a tag row with each access in a way similar to that used for constants. In this case the tag row is loaded into the X register when needed and a load modified by index is made from the independent variable storage area. While this costs an instruction to load the X register, it really only costs /m, where m is the number of equations, since the indexes for all of the variables are loaded at once. Also in a serial machine these variables would often be declared as an indexed array. At the present stage of design, it is not possible to execute different function calls in parallel, since a function call is in reality a reasonably large set of instructions to be inserted in this position, and the processing elements must execute the same instructions. Since most functions form some sort of polynomial approximation, it would be a reasonable extension of the system to have the integrator provide its own library functions which have been written so they are as parallel as possible and thus gain some speed. Since there is only one independent variable, instructions with it as an operand can always be performed in parallel. The value is 16 kept in all PE's in the same memory row and updated when appropriate. No operand instructions can clearly always be done in parallel. Instructions with different register names as operands clearly cannot be done in parallel. Since these only occur in tempo- rary storage manipulation according to a well-defined algorithm, similar terms will probably access the same registers anyway. Also, due to the method used for sharing terms described in a later section, little temporary storage will be used. 3.2.3 Term by Term Analysis In order to compare equations to see what can be done in parallel, it is necessary to break them down in to more convenient and hopefully more similar parts. The unit selected was that of a term since these can be shared to other processors and then recombined with just an addition operation. The expressions representing the differ- ential equations are only broken down into the first level of terms; that is, terms inside parentheses are not separated. Some examples are listed below. The expressions on the left are broken down into the terms (separated by commas) on the right. The expression can be re- formed by just adding the terms on the right. A + B + C ► A, B, C A + B - C — ► A, B, -C A + (B-C) ► A, (B-C) AxY(l) + BxY(2) x Y(3) + SIN (Y(U)) AxY(l), BxY(2) x Y(3), SIN (Y(k)) 17 Of course, one can certainly write separate terms according to this rule which differ tremendously, but it is reasonable to expect that in a large system of equations many terms will be similar. Further discussion of the integrator should make obvious the convenience of this division. Thus if all of the instructions in two terms are the same and the operands fall into the same classes as defined by previous remarks, then these terms can be done in separate PE's simultaneously. 3-2.4 Inserting Dummy Instructions In actuality it cannot be assumed that all terms will be per- fect matches with other terms. Consider the following terms: Ax Y (1) x Y (3) Bx Y (3) They differ only in one multiply; and if no other better pairing is available, this can be treated as Bx Y (3) x 1. In actuality it is treated as Bx Y (3) x Y (0) where Y (0) always contains a one. Thus these terms can be paired. Other dummy instructions can be in- serted as follows : Y (1) x B "\ Y (1) x B Y (2) /" Y (2) x 1 SIN (A +Y(1)^ SIN (A + Y(l)) SIN (B) / SIN (B + 0) (C + 0) Inserting dummy operands for instructions involving other than constants and dependent variables does not appear worthwhile since extra instructions would be needed. Since a function call usually in- volves several instructions, it is not advisable to have a dummy call unless no better match can be found. 3.2.5 Sharing Terms As previously mentioned, one of the important problems is that the equations may vary greatly in the number of terms in each. Some method is needed to share the work more or less equally between pro- cessors. This is the place where the information gained in the prepass becomes necessary. A limit is set on the number of terms to be evaluated in each PE according to the following equation: maximum number I total number of terms E of terms / PE total number of PE's The square brackets indicate the smallest integer greater than the enclosed value. This limit serves to distribute the terms equally across the PE's. When the optimization has assigned the maximum number of terms to a PE, it then skips to the next. This requires that a tag be associated with each term indicating which equation it belongs to. 19 After a term has been evaluated, it is added to a scratch memory loca- tion in the same PE. This location depends on the tag so that at the end of the evaluation, there may be several separate partial sums in a particular PE. After all partial sums are formed, routes and adds are performed which combine partial sums and return them to the correct PE's. Also, if the number of PE's is much greater than (i.e., more than twice) the number of equations, the integration procedures other than function evaluation are assigned to every other one or two PE's sequentially so that the amount of routing necessary will be reduced. Assigning the PE's sequentially assures that the routes will be small since the terms are also assigned to the PE's more or less sequentially within the range of the optimization routine. 3-3 The Optimization Routine The optimization routine is the procedure which is called when a term has been recognized. Its purpose is to determine what term this just recognized term can be done in parallel with. Flowcharts are given in Figures k-6. There are three basic stacks. COMPSTK is the stack containing the intermediate code for the term just recognized. CODESTK contains the intermediate code for the terms which are to be shared over the processors as previously discussed. EXCODESTK contains the code for the terms which cannot be matched with those in CODESTK. In the normal state of operation, terms which contain function calls and parenthesized expressions are put in this stack since they are expected 20 to be the exception and not the rule. This can be changed by the user if desired with the control word FUNMODE. Optimization does occur with- in EXCODESTK as well as CODESTK. Basically the optimization routine executes a series of calls to a comparison routine with different arguments. The arguments tell this comparison routine which stacks to compare and how many dummy in- structions it can insert in each in order to make terms match. The flowcharts show the sequence of these calls. The purpose of the partic- ular order is to find the match which requires the least overhead first. When the maximum number of terms allowed in CODESTK is reached, the data for storage generation is extracted for that PE, and the program proceeds to the next. CODESTK and EXCODESTK are overwritten for each PE since the basic instruction and operand will be the same if a comparison is achieved. Only the storage information contained in the intermediate code is different. The pattern for the intermediate code given below illustrates this. bit 12 18 ^3 A B C 6 bits 2k bits 6 bits A - code number indicating the instruction (LDA, ADRN, . . . ) B - storage information C - type of operand 21 If C is -> then B contains No operand Constant Dependent Variable Independent Variable Function Call Register to Register Arithmetic computation temporary storage - > -► symbol table pointer ■^ index value * "^ Name of function "^ Source register i name +■ index within storage Exponentiation temporary storage ► index within storage The code remaining when the final equation has been optimized is used to form the ILLIAC IV code itself. An example of the previously discussed case follows: EQUATIONS: Z(l) = AxY(l) x Y(2) +■ BxY(l) + CxY(l) x Y(2) / Y(2) Z(2) = DxY(2) + ExY(2) x Y(2) + FxY(l) x Y(2) DETERMINED IN PASS 1: MAXIMUM TERMS PER PE =3 NUMBER OF EQUATIONS = 2 22 NOTE: ASSUMING A 2 PE SYSTEM PASS 2: 1. FIRST THREE TERMS ARE READ IN TO FILL UP CODESTK. STORAGE ALLOCATION FOR THE FIRST PE IS DONE. TERM #3 *~x~* *** A A A A A A CODESTK TERM #2 TERM #1 2. READ NEXT TERM IN INPUT AND PUT IN COMPSTK. x Y(2) D **-*-*-** COMPSTK 3. COMPARE COMPSTK TO CODESTK WITH NO DUMMY INSERTIONS ALLOWED o k. FIND A MATCH WITH TERM 2. 23 5. COMPSTK IS OVERWRITTEN INTO CODESTK. FLAGS ARE SET INDICATING THAT THIS TERM IS TAKEN. 6. NEXT TERM IS READ INTO COMPSTK. x Y(2) x Y(l) E xxxxxx COMPSTK COMPARE COMPSTK TO CODESTK WITH NO DUMMY INSERTIONS ALLOWED. 8. FIND A MATCH WITH TERM #1. 9. OVERWRITE AS IN STEP 5. 10. READ IN NEXT TERM. x Y(2) x Y(l) F XXXX -tt-X- COMPSTK 11. COMPARE WITH REMAINING TERM IN CODESTK - NO INSERTIONS. 12. FAILURE - TRY TO COMPARE ALLOWING ONE INSERTION 13. SUCCESS 2k I Y(0) x Y(2) x Y(l) Note: Y(0) always contains a 1. V V V V V d JTWIT A K A COMPSTK 1^. OVERWRITE AS IN STEP 5. / Y(0) x Y(2) x Y(l) F x Y(2) D %* \* %» \» %+ \t ™ A A A A A x Y(2) x Y(l) E V/ \I \S \f \f 1/ A A A A TV A CODESTK 15. USE CODESTK TO ALLOCATE STORAGE FOR THIS PE. 25 ENTRY NO DOES COMPSTK CONTAIN A FUNCTION CALL OR PARENTHESIZED EXPRESSION COMPARE COMPSTK AND EXCODESTK SUCCESSFUL? COMPARE COMPSTK AND CODESTK SUCCESSFUL IN THIS PE? A NO SUCCESSFUL IN NEXT PE? YES NO COUNT AS ONE FAILURE YES YES SAVE COMPSTK UNTIL NEXT PE OVERWRITE COMPSTK INTO CODESTK SUCCESSES + FAILURES > MAXIMUM # TERMS PER PE? NO NO ADD COMPSTK TO END OF EXCODESTK EXIT OVERWRITE COMPSTK INTO EXCODESTK -►EXIT ■►EXIT YES SAVE STORAGE DATA FOR CODESTK AND EXCODESTK. INCREMENT PE COUNTER. INSERT ANY CODE IN LOOK AHEAD FROM LAST PE. Figure k. Optimization Routine ENTRY 26 NO L — ► NO I NO J ► IS THERE A MATCH WITH NO INSERTIONS JffiS_ NO MATCH WITH 1 DUMMY INSERTION INTO COMPSTK? 2, N-l? N JES_ _YES_ J£ES_ NO IS EQUATION COUNT LESS THAN 1/6 OF TOTAL NUMBER. ML YES MATCH WITH 1 DUMMY INSERTION INTO CODESTK? 2, . . ., m-1 M YES YES vps JSQ_ _^exit i -►EXIT 1 >EXIT 1 _^EXIT 1 EXIT 2 -► EXIT 1 +.EXIT 1 ^.EXIT 1 -► EXIT 2 Figure 5„ Comparing Compstk and Codestk BLOCK A in PREVIOUS FLOWCHART 27 ENTRY, DOES THE INSTRUCTION HAVE AN INDEPENDENT VARIABLE OR A CONSTANT AS AN OPERAND? NO YES COMPARE ONLY BITS 12-17 AND U2-l*7 WITH STACK. SAME? NO SET COUNTER 5 TO TEST SAME INSTRUCTION IN OTHER STACK WITH NEXT IN COMPSTK COMPARE ALL BITS WITH YES STACK. SAME? NO ► FAILURE TEST NEXT INSTRUCTION _IES_ M 0? YES [no FAILURE M > 0? YES NO i INSERT DUMMY INTO OTHER STACK, m = m +- 1 INSERT DUMMY INTO COMPSTK m = m - 1 Figure 6. Routine to Compare COMPSTK with Another Stack. M is the num- ber of insertions of dummy instructions' allowed. 26 3.U Methods of Integration At present there are two integration methods available for the integrator. These are Fourth Order Runge Kutta and a corrector- predictor method investigated by Nordsieck. The standard Fourth Order Runge Kutta method is used. This is as in the equations below. K ! = F (Y n , T) K 2 = F (Y n +.j HK 1 , T + H/ 2 ) K 3 = F (Y n + |riK 2 , T + H/ 2 ) K^ = F (Y n + HK 3 , T + H) Y n + l = Y n4 H ^ ta 2. + 2 S*S l ) If automatic stepsize adjustment is called for, spare PE's, which are not used for the integration, are used to do the integration at twice the stepsize. The truncation error can then be estimated from the following formula. E = Kh 5 = ^ry (h/ 2-y (h) 1 31 I y m * y m Stepsizes are always changed by a factor of two and certain limitations are put on the changing frequency and minimum- size of the step. 29 The corrector-predictor method is a multistep method and thus requires another method to initialize it. The midpoint rule is used in the integrator. When the starting values y _ and y _ are ''n-l n-2 found, they are premultiplied by a matrix as follows: i " y n 10 hf (y n ) a = -n V k -1 V k hf K- v 6 -v 3 v 6 hf ^n"^ ] The formulas for the integration method are then: -n + 1 = 1 1 1 2 3 1 3 1 a -n + -3/8 -1 -3/1+ -v 6 \ y n' " y n). There are several reasons for using this method. For reasons which are beyond the scope of this explanation, 30 the method minimizes roundoff error. Also, due to the Pascal triangle form of the matrix, the matrix multiplication can be reduced to a series of a few additions thus speeding up the method. The biggest advantage is that changing the stepsize does not require restarting procedures. The vector a simply is multiplied by 1 a 2 a where a old h new 9 "I Each of these methods is stored as a skeleton ILLIAC IV pro- gram which is used to form the complete integrator output. 31 k. EVALUATION AM) SUGGESTIONS FOR FURTHER DEVELOPMENT OF THE INTEGRATOR As might well be expected, the efficiency of the ILLIAC IV program generated by the integrator system varies greatly on the type of input. One would expect it would not do as well on equations with highly complicated and diverse terms, and this proved to be true. Equations such as those involved in chemical kinetics problems where the terms differ only in two or three multiplys were handled very well, approaching the maximum efficiency for this algorithm. The integrator is also not well suited to small systems of equations. There are many reasons for this. Conventional integration methods are rigidly sequential, so the work other than function evaluating can only use as many PE's as there are equations. Attempts to separate methods into usable parallel forms have been unsuccessful to date. This method thus leaves many of the PE's unoccupied when doing the method calculations for small systems. Also, with smaller systems, the ratio of these calculations to the function evaluation is high and so the overall integration becomes inefficient. One way of improving this might be to use some extrapolation method, such as Richardson's. A lower order, and hence faster method, could be used to do the inte- gration, thus decreasing the integration-to-function evaluation ratio. Since this involves doing the integration at different stepsizes, more of the PE's could be performing useful work in parallel. However, due to the sequential nature of the methods, there would still be a lot of PE's empty. For example, consider a three step extrapolation me- halving the stepsize for each integration. The PE versus time graph would look as follows: PE time h h/ 2 h \ ▼ This represents the time involved in all the integration and function evaluation except for the evaluation of the extrapolation formula. In this case 40 per cent of the machine time available is being used. In the best case with only two stepsizes, 25 per cent of the machine time is unused, and the savings gained by an extrapolation from two cases would probably be negligible anyway. One is tempted to move part of the process time from the smaller stepsize to the empty spaces in the larger, but this cannot be done due to the aforementioned sequential nature of the methods. The starting values to begin the integration in the middle are not available until the integration has been performed up to that point. For these reasons extrapolation methods seem of doubtful value. There are other problems which affect small sets of equations. Since there are few terms, it can be expected that, on the average, there will only be a few that can be done in parallel. Thus the function evaluation may be inefficient as well. 33 With larger systems the results are significantly better. There are fewer wasted PE's during the integration operations. There are more terms to he compared and to fill up the machine. It cannot, however, be guaranteed that 100 per cent efficiency will be obtained even if the maximum number of terms can be done in parallel. This is due to the fact that if the number of terms is not an even multiple of 6h, another term evaluation must be performed even if there is only one extra term. The following chart illustrates the maximum efficiency for different numbers of terms. Maximum terms per PE Worst #terms case efficiency Average #terms efficiency 1 1 15$ 32 50$ 2 65 51$ 96 75$ h 193 76$ 22^ 87$ 8 kk9 m i+8o 9^$ 20 1217 95$ 12i+8 98$ This clearly indicates some of the improvement with larger systems. Breaking up large terms into smaller ones might improve this although there will be some added overhead in keeping track of the extra terms and recombining to obtain the results. Another item which causes inefficiency is the routing which must be done to recombine the partial sums after their evaluation. If the equations are roughly all the same length, the routing will be minimal. Strange distributions of long and short equations might greatly increase the routing. An example might be a set where the first thirty- equations have two terms each and the last thirty have twenty terms each. If these same equations were alternated - one short, one long, one short - the routing would he reasonable. It would be desirable to have the in- tegrator system do this type of exchange before analyzing the equations. Of course, an intelligent user could do the same thing. Another factor is the depth of analysis used to find terms that can be done in parallel. Highly complex terms are difficult to handle. Consider the following: Bx (Ax SIN (Y(l)) / 2 + Y (3) * 3) (Dx Y (2) - SIN (Y (6)) x Cx Y (2)) * 6 It is clear that many of these terms could be done in parallel, but highly sophisticated techniques would be needed to do this auto- matically. To do this high a level analysis on every term might take enormous amounts of computer time and might only be worthwhile on smaller systems. As previously mentioned, another useful addition would be to have routines written so that a sine and a tangent, for example, could be done simultaneously. This would seem to be possible since the com- mon methods for obtaining these functions are by using some approximating polynomial. It might be possible to use the same form polynomial with different constants to get different functions. It should be noted that sine and cosine are trivial to do in parallel since cos x = sin (x + ir/2). 35 It goes without saying that an integration method which is less sequential would he valuable in improving this system. In summary, the integrator system works reasonably well on large systems of equations although there is still some improvement possible. For small systems it appears that a different algorithm would need to be used to get really good results. APPENDIX A SAMPLE INPUT PROBLEM FOR THE INTEGRATOR -TRANSLATOR SYSTEM (OUTPUT LISTING) 37 II L I AC IV THANSLATO* W»ITI fi *Y*UM i.nnrs co«pu.fk • vcastnn i n »*or,MM f, AK91L 30» 1969 *T 2013 t «i« /does »LTST HSvil) ♦ P»TTr«M t 7(05 ■ »»v(0)"Yf rt)»P»Vf 0)»Y(0)*C"V(rt)»r(0) I 7<1)«KlKY(A)*A1xY(l)»AA,iY(3)xYM01 + »2xY(1)*»3xYr?>*/U«Y(7)*»S>iYC , S)*«>00J 7(2>«B3xY(i)xY()*H4xY(9)»Y(9)+e>SxYC2)»Y(V)*K9*Y(5)J 7C3)»ClxY(3)>«Y(QWC?«Yr3)xY(9)+C3xYC3)+K3xY(«)xY(«n«K4xY(n*Kft*Yf8)xV(fn ♦ K3*Y(8)xYf3)'A1x*(29)*K5txY(1)xY(2)*K6xYC?w*?xYC3)*Y(«>); ?(«)>UlKVi3)«|)?«V(«|)«Y(9)*0)icY(i)HV(V)' «x7w(t<)i(YrA)«KRKV(3) I 7(5)»K3xYC2)*E1x , "i2)xYr9l4»-2xYfnxY(9)»r3xY(/|)>«vrc))*f-4xY(5urSxY(b) I 7(ft)»f l«Y(?)*Yci)Wf?xYr35KY{9)*FixV(a)xYCV1*r«)(Y(i«,) *f 3hY(A)+ Fl»Y(13)xY(8)*Al«11kYU0)*T9xYi9)xV(S)»T3*Y(9)xY(MI 7C10>»J1«Y(lO)*J , >«Yr10)*Y(7)».l3*YH>0*Y(31*A1 X Yf 0)+K4*Y(1 )*K4«Y(8)xY(9)J 7(131«C1xY(?)xY(0;*C?xY(1)xY(9j»rxY(l)>«Y(9),K6xY(2)*i<2xV(3 1 xY(S)»A2xY(1ft>xY(3 3)*H , >XY(8)xYn«> ♦CJxYC?'>)xY(l^)*Jl¥Y(?«)xY(S)*A?xY(??)*AflxY(P)*K1xY(H)xY(?)l 7(1«>»01xYf3)*n?«YCa)«Y(9)«13xY(0lxY(9) ♦K7xY(»1xY(«)*K8i«Yf3) I 7(l5>aK3»Y(2)*riXYi'3)X>(9)»F9xV(4iXY(V) + rixYf4)«Yf9>*e4xY(5)*F$KY( , >) » 7( 1 61»F t xy( ?)xYf 9) *F?x> ( ^)XY(9)*F 4xY( ft) J 7<17> = (il«r(6)*AlwYi , 7)xY(«)Kj?xY(r>*f;.4xY(7) | 7(1B)«H1xY( 1 ) I 7(19)»Jl*Y(10)*/» , »xY(M«Y(;>3)*J?*YMO>*Y(7)*jT*Yr , i>MY(10)* <».0*-0?xY(3> * ?oo 1 * 300 2 • 100 j * soo 4 » ftOO 5 * 700 6 ♦ 800 7 * 900 8 * 1000 9 * 1100 10 * 1200 1 1 * 1 V»0 12 * 1(100 13 * 1*00 14 * 1600 I 1 ) * 1700 16 * 1 HOO 17 * 19 18 * 9000 19 * 3100 30 * ??oo 21 * 3300 22 ♦ 9100 23 * 'SOO 24 * 9600 25 * 9700 26 ♦ «.. 3«4xY( ?»i»*( >n» 9tt*iA*v;is)i • ?*oo 7(?0)iKi«»(iO)»v(M»K?«»(giivoo 7(?l>"B3«Y<9)»V(1U)*k?«V(?7)*n«Y(10)«l?«'Y(9)»Y(«.)*13i«Y(9)»Tfm • 3O0O 7(??3«C?*V(3)>«V(9)»C?«y(l)*»(9)»C1»'V(i)KVC«1»«4«»f ^»*Tta)»*««»C«3«T(t)l » 31 00 7(?3l"C?xY(3)>«Y(0)*f?xy( 3 )« y(«) »c (*Y(j)»Y(»)«if««Y-f?1i«Y(?S)»K«»Y(«)*vfV) • *?00 ♦ Fl'«Y(ni>«Y(?)»Y(?)*lrY(S)*»1»Y('J)»F«»Y(9)l?xY( ^♦03»Y(i>)*0ft»Y(O)KV( 9i I • *»00 7n v ( « ) *X(jx Y f 3 ) | • 1S00 7(?51»K3xY(?>*F?xY(?1xY( "> ♦ f ?»t( 3 1 » Y ( V > *F3xY(» IxYf 9)*f «xY(S )♦( SxY(5) J • *AO0 7(?6>»F?WY(?)XY( 9)»F?xv( 1)»Vf <»)*f >xy( <. V«vi9)*f <>»Vf 6) I • 3M)0 7(?73«G?«Y(6)*r,?*Y(7)*r,3«.>( 7) i • 3«oo 7C?8)»H?xY(?) I « 3900 7(?9V«J?xY( ? ))♦ ,|?»Yf ?01xY( 7)« j3*y( S)x>( 70)4 *?»Y' 8)* |1«V( 1 )xy(9)* <| , I * 4000 7(l())»I?«yf?n)M'« , '(9!M(SnM>Y(J)iM')t * « 1 70(1>« 118,0 » ♦ «?00 70(2>« T».o I • «300 70(31= 3 /• . 1 * till 70(«V» 1*900.01 ♦ «*>00 70(5)= O7.0 » ♦ «600 70(6)« '9.0 ; * *TOQ 70(M« 4*0,0 I • •■00 70(8)» n,f\ J • 4900 70(9)= 1,0 J ♦ =-000 70(1")= •," i • Mao 70(1 1 )» 0.0 I ♦ ''POO 70(1?) = 3.S6«^ I * *-3G0 70(H) « 3«b67.M ; « S«00 70(1*) ■ 01 * *">00 70(1S) 1 II * SACO 70(1*) ■ .OOOOOOS : * *700 70(17) » 3.009H7*C>SSI • =.800 39 ?0(1«) ■ ,9M i 70(19) « «.675J 70(20) . 01 70(21 ) a J45.67«S(i-1 ; 70(2?) « 335634S.0 J 70(2-0 ■ 3.0> 70(20) = oj 70(2*) ■ 01 70(26) x o; 70(27) x us 70(28) = 01 70(2") s 01 70(30)= o; «U -l ,nt *2= -2.01 A3" .051 44« 0, 1 > A5» 0,5ft j A6« T. 141 5 J "1 » 1 .0! R2« -o.os; B3» -1 .SI "«» -o.otj R5» -o.o*/ CI" -o . re; C2x -O.O -5 i C3x -3,0 I nu 3.0 ; 02« -O.OOb j D3« -O.Oob | Fix 0.01 I * 5900 5rt * 6000 59 * MOO 60 * A?00 61 ♦ *300 62 * * '( 63 ♦ *500 ftk * 6600 65 * 6700 66 * 6800 67 * 1*900 6H * 7 00 6V * 7100 70 * 7200 71 * 7300 17 * 7100 73 * 7500 T* * 7*n0 75 * 7700 76 * 7800 77 * '900 76 * ROOO 79 * PI 00 80 * B ? SI * "loo 82 * MOO 83 * "boo 84 * "600 85 * 8700 86 * 8800 87 * «900 86 ko rim o.o' ; Tim 0.005 J r«» -0.5* t P5» -0.033 I Pta 0.03 l r2m 0.09 f rim 0.00b t P4« -2,0 J fit" 2.0 I r,2»-o.oi i r,3«-n.on i HI- 7.0 I H2« S,0«*V7H?/Sf T2«-O,01OU0 I T3»-O,00'>33 I J1--0.50 I J2»-o.00O5« j J3»-O,00'50 | Kl« 0.00058 | H2a 0, 00933 I K3--1.0 I K4«-3.5«697364l K5a -54S6,«85'i6» K6« -345.56Pre-0'l> KP« 4567'. 999 I K8« «5, 5*6*59 I :<9o /(566.4«0? I TO ■ I TP »1l TP »o.o»o,?5»o.s»u.rs» 1 .0 I STEP ■ 0,05 J • »0ftft •ft • 910ft *• • • •on ♦t • 9100 92 ♦ 9460 93 ♦ 0SI>9 • 4 • 4400 99 • 9700 ft* ♦ 9400 • P • «9ft0 9a ♦ 10000 ♦ft ♦ 10100 10ft • io?oo 101 ♦ 10300 102 * 10400 101 » 10500 104 ♦ 1O600 109 • 10 POO 10ft • loftOO 10P ♦ 10900 108 • 11000 109 • 11106 110 ♦ 11700 111 ♦ 1 1 300 lit • 11400 111 * moo 116 * 1M00 119 • 11 POO 11ft ♦ imoo IIP • 11900 HI ♦ 1?000 lift SPrEn-UP factor is b^.r CtlNGBATlJI ATIHNS, Nil SnuT.f PHHOWAM Fhwnws ,iFHF DFTFCIFD IN PASS 1 THF TnT,»|. IIMF i'«<; 1 MINUTES 1 "5 . * SFrriNOS, TIME F "■* 1M11MI 7AT I "in: TIME F 1 ^ SCANNlMi".! TIME Fin PAN«I«Gi TIME F n * SE»'*NTir ACTI'lNSt TIME fH FWRIIR RFCH»tKY; T IMF F IK OF H1IG P» I "IT T N r*. : MlNMI F <; 13.? SECONDS, MT>llT( * 11.") SECONDS. MlMlTfS ?7.6 SECONDS. MlMiTES 7C.9 SECONDS, D mtnhtfc o.o seconds. M I M I T E S 0.0 SECONDS. kl ENn 01?34")*7B9*l»?f>>*AflCnFFr,Ht. | » (<<->« JKLMNOH OR **-!!< /sruvwxv/ix 1111111 1111111 1 Mill! 11 1 1 111 111 11 111 J 1111111111111 1 111100000 1 11 111 111 11 11 1 1111 1 11 11 1 I 1 IH II II 1 ) 1111 '. 1 11 1 11 11 11 11111 10000 hi uii mi in ui 1 1 11 ii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 n ii ii ii i ii inn 1 1 mono PERCFNT *F AVAILABLE STn»AC,F USED IS 91. IS" 1 : l?tOO l?0 120 rA^,)S HTAD AT 00 CARliS 'MR 'U^nTr. k2 APPENDIX B ILLIAC IV ASSEMBLY LANGUAGE OUTPUT PRODUCED BY THE INTEGRATOR FOR THIS SAMPLE PROBLEM h3 r,,lr)f"S /MI'iASK I ISTFO -3Y SNAP IN APHtL 30. l9'-9 AT 20lJ9!44.tt * THF * « foiia t t ro I t HNFl HALF! Twm T All l .cnijMT . INC ! •TFMP? .TFMP3 .LFFTt ,N63« I ARIT^I F X P S « t KSi SUM1 ! 1UM2I SUM3I TNOEVl XKi XKTE»t ''RACt INTEl TSnMi TNOX1 t FNRl PONSTl RFGlN FDILDwrNT PK'lfjKAM WAS OPERATED HY TMF GFMXOAL Ul FFk RFNT I *L TIo\ SOLVFP I'Pnr.'jJM, n inIEOwaTFS THF TnpiiT FuIIAUONS USING URTM UROFR -ttlN'OF K'lTTA ■"FTMin. Pu„S F^u F*J F^O FOg F1J iFOj 1F«U Fog ir^j IFIJ FOj FOj FT L I Rl * FTLI Hi * FTLI Rl K FILL Rl K Rl * Rl * Rl < HI K 1 if FTLI nAiA mi a niiA niTA «l K PI < Rl * Rl K FTLL HI K Bl K FOu F*U DATA r'iOO'.) ■ 4000 = '400 '.1 = 'inon = 1 f) H«Sl j, . 4 •• : i >u? : bl 1 H I i » 4 V l »OS0 t S )S?i r i 'ir-uiw c u.ua. . , i u o o o o o o o o o o o (i | H | n 0l?59S2 r ;>?5;">2525?lM;i! >: 4 n c n o o i) .t o n o o u o i fl : k ?400oooooooo00uuo : * i 1 1600n0OO3')OOOO0()O|R>^ * i r f r r i vi! N 1 1 H H F R FLOATING PHINt HNF FLOATING ONE /SIXIH FLOATING iINF HALF Fl (1MINP, Ml] J I MF I1FI AYf I .5) RnnTTNG rnuNu 1 I f ' o » T I n N s >/AL'if« T n he STnwri) i ri i mirr l t *« r t M A S *. MF'-1"Hr li \ T A , 1 'H : 2 0: I *«> 1 II 1 > M ! * »i 5 : * 1 ">i * • >o; f I -1 1 :? 1 i* 1 '<% 1 -> fl ; * j» 1 •' 21»22 jfi. "> ■ 5 3 . S 1 >* 1 :* 1 :» 1 :» i < 4 ; 'i 1 o : * 1)1* s 1 f f • i = 1 777 H IJfM Cj =" KIJTTA PO'j^TiNTS SPRATPH STUtrAP.F f CR Fll'JMfNP, S'lMS 777 r/ rr ,'7/TTr n rrT tt\;r fnarlf hits i.l41S,n,i,-i.s.-O,O,l,-0,02»-1.0»-1.0»-34S,5677a-04» -). n OS. 41. S6nhS9» 0.02.-0. 033. 0.0 05 .0.03.0.00058, •'i.I)1,,"C.50»;).00?.j3»»1 . 0.-0.09 »-1.0» -0.0?* "1 «5» -s 4S6.u«5»4» -?.()» 3. o»o.ooos8»-n,no5»-i,o»o,oo5>o,03» •:?, 0.-0,01 .-0.50.-0. 00?*>ft» 3.14| ft»-1 . 0* #-0,0? » -f. 5469736 'I. -1. 03.-3. 5fl697364.-J.AS, 5677 »-04»-0. 56, -0,005.-0. 005.45. 566059, 0.02.-0. 033. 0.005, -0,01. -'i.noo^i»,-->.n. -o.O) ooo.o.o .O.n.o.o i Oo oni 00 00 on?00 00 on 300 no 00400 no 3PS00 no onAOO no ■)o/00 no onfloo 00 on 900 00 01 000 no OH 00 no 01 200 no 01 300 no 1<4 no 01 SCO no 1 is o o no 0' 700 no 01 100 no 01 900 no O'OOO 00 0?1 00 00 0??00 no 09300 00 0?400 oq O'SOO no 0?600 ii; 0?700 no 02800 no 02900 00 0*000 no 03100 ^0 Oi?00 00 0> iOO no O^'IOO 00 03500 00 O'AOO 00 01700 no oifloo 00 0^900 00 0*000 00 04100 no 0«?00 no 3 no O'lAOO no oasoo no 0*600 no 0«700 00 0««00 00 04900 oo o'-ooo no 0"i100 00 0">?00 00 0S30O 00 0^400 00 o^soo 00 05600 no OS700 1 2 s 4 5 A 7 H 9 10 1 1 12 1.1 14 16 \r 10 19 20 21 22 23 24 25 26 27 20 29 30 31 i2 33 34 35 36 37 38 39 40 41 4 2 43 44 45 46 47 4tt 49 50 51 52 53 54 55 56 57 kk ► ILL OAT A f ILL PAT A t a r, s i f 'LI UA T A F ILL llAT A i 1LI I I /. T A f ILL UA T A ^ ILL JATA 1 'h: j. ooosh. -p. i. n. v- •-'i.''i')»-').'M.-(i.o9,-j.^rtfto/36««-l.n# o.nn?l3»*n«0(»Si"l . i»n.ons»o.(.i.-?.n#-i.n#-3.bB6v736«# *3,n33>*o,oio.)fl«-o,oui>b«#*i.57*»j 7 i*m , - ^s , is » r*«-o* # »o • oil -•>, i>i i. ii. «■.» f7.<; >.'w • ". i >0 . ?. ')»"o. 3i» . os. a,nfc".i?«i . i o 1 1 S ** » - 1 ,"»f-rt,.)1ono,-o,0?»"3»*»M69?36*»"3»P» u. 03. i, io5 , 33f-o.o»i»"').nn'>,-l .".n.rob.o.u?,-?.n. - ). 1 t. , "II . OOnSn, -n , Sr, -0 . 01 00, 0. i>. 0» .0 i 1 >«* • 0. (J . . ii . o . I;« 0. 0» li . 0. u. :>»0. 0.0.0. 0.0. J II 0. i i< . i 1 . J . ' . • li . )> n»o»i i. ii, 0. > , » ( > ). 'I. . 1 ). II, 1 > i. , II r 0. i 'B i i . II . r. I.i » 0, i : 'I , 1 ■' . 1 1. I 0» ■> , \ , i , B 11 1 > i • 1. o# , • i • 1 1 \, i . 3. 0. '1 » 1 . -: ■> II . 0. 0, 5. i> \, A 9, 0» /J. 0. n . 0» il 11 0. 0, n. 0» 1 j* : il J . 1,'PI 7» H, ?3» 1 t , s, i : - 1 . -3. 1 . 9 . 'J . S. 10. <5. . -'I , - n ? , -> , M, •' 9, 1. 9. 5. Kit a . / 9. 9, 9. 9. ? , 'J. 6. 7, in. 10» 9, o . 1 1 u > P. 9. 3. •*. ?0» 0# 0. 0. i)» - 1 . - . - 1 . - 1 . o . 1 . "3. -?t -1 , 3 . ?t i . 1 . a. 3. , -3. -? . - 1. -?» -3. > 1 • ?, 1 , ?. t > . -1 . -?» -1 . ?. 1 , . 1 . 0. m \ t o# -1 . 1. 0» 0. Of 0. il' Ai- nu DATA 1. ># .1. 3. 0. •), B . i. o. 1?. 0» 0. 9. S. ■\, i# ?0, 0. Hi i). u ill 9. 0, 9. 3. 1 . ■> , -> , A , 1. " . 0. '•) . 0, 0, 0| ] ">«; 0» B. ?i 0» 0, 0. B, 0. u. 7 > 0.10, 0, 0. 0# fl> 0. 0> 0, 0. nu 0**oo no 0*700 ii/ 0*B')0 no 0*900 n o orooo no r,7 1 ijo no o7?nn Oo 7 300 "u n'aoo no (•■",00 no n'»on "0 7 700 no 71,-10 ' 7 900 no 0«noo "0 l)"1 00 "I r "POO rtjj )" 100 no B 4i)0 no n«soo nil ()«6O0 no 0«700 no ()"B00 no e 900 no 09000 no 09100 0(1 09?00 0|J 09300 no 09«00 no 09S00 no O°(SO0 no 09700 no 09«00 no oogoo no inoon no inioo no lo p()0 no 1O300 00 1 H 00 1 nsoo no 10600 no 10700 00 lnrtoo no K900 no 1 1 noo no 1 1 1 oo no ! 1 Poo no 1 1 300 no 1 1 400 no 1 1S00 no 1 1 600 no 11 700 00 1 1 BOO ba bv 60 61 •>> 65 6<. 6b 66 67 68 6y ' 71 7? 73 74 7b 76 77 78 7V BO 11 »2 *3 Hi* Bb B6 87 BB 89 90 91 9? 93 94 9b 96 97 90 99 100 101 102 103 104 10b 106 107 10H 109 110 1 11 112 1 13 1 14 115 1 16 117 118 ^5 FILL OATA FILL OATA FILL OATA FILL DATA ^ TLL TNCS» ro'i rnijN«i ton oelti HATA TOl OATA LEFTS! FOH BOIINSi F.OU HI K *Tl DATA t# J. 9, it* 9, •>. *>> s. 7, s, OJ 0, 1 9, 15, ?r . 9, ?, <■>, S, 7, ft. S, 1 . 9, 'I. 3. 7, 3(i 9, 20. 9 • 10. 9, PS. 0. ?. 1'). S. 7, 7, 1. 1 1. 9. a, o. a. 3. 8. 3. 0. ft, 1?PI •'I. 7, ft, •I. )• •), 1**1 a. 9, 1 . 1. >, 1, »?e: o. o# i» -), n, 0, 1 ?8J • 1, *1 • f n.ns f o. i «; 0) o, o. 1 n, ftf a, ft, 17, 13, H. 0, 9, 9, 9, r. ' » *. 7, 9 , OJ 0, 0, 0. n, ft, oj s, ?*» 1 ft. 10, 0. I). ft, 0, ft* 7 , ?o. 77, 5. ft , 0. 0, n, o. 2. 0, ft, 3. o» 9, 7, 3. 9, 7 . o, o. 0. o. ft. 3, ft, 9, 10, ■3, 8, 1, 7, '/», v. s, 'I , ■># 0, 0, o. 0, 0. 7, 9, «, 9, 0, 0, 0, 0, ft , 0. • 1 « o > 29, rt * S, «"1 . 0. ft, ft, 0. o, 3, 9, it 3, 5. 7, 9, 5» 0, 0. n, n, 0, n, ft , 3. ft, 8. 0, 9, '), ft, 9, 0, ()» 0. 0, 0. 0. 0.01 7 ft, ft \ 16" I 1MI 9, 9, 1 , 3. 0, 0, ft, ft, 9. 0» 7, 1 . 7, 9, ft. ft, ft, 0. 0, 0, 5. 3. 7. a. 0. >» - 3. "I, -3. -•>. -1 . 0, 1 . 0, 1. 0. 1. )» -1 , -', -3, •P. -3. -?. -3. -2» -1 » ft. 1 . ). 1 » ?. 1 . 9 , 3, 2. 1 , 0, -1. ft, -1 . T. -1 , ft, 1 , 2, 3, 7, 3. 0. 0. ft. 0. ft. 3. 1 , ft. ft, 9. 9. 7-1, 9. 0. 0. 0. 0. 0, 0. Ul K I STARTi Si I T t 1 ) »63J* ST|_(1 ) .■g63i« I'UO) bFNM * L^EFKO) 1 Met)) :1Hin, * S T L(05 .•3011*10, 1 1 TtCO) = 0^ 1* = l . 9,o; / Il«.n,n.7«.o,'i,3r.o»o.io9()0.(i.o.o7.o.o»?9.o.o» a it). "» 0*0. 0.0 * 1 . f>»0»«». ')•() » 0.0.0, 3.5ft<»S»0> 34567.89.0. , i. . 0»1,»0». 00,10005. 0.3. nu9rt7*>ftS5»0,,9» 3 >0»ft. 675.0. u.»O»3ft5,A7'i5»-1-O»335iS3«5,n,O.3.0»O»O,»O»0.«0»0.»O» ti..O,0,,0»0.»0.0..0»Ui 7-ftf * » * IJSt AS MASK .1 f| t FTxi/Tr,HT#n 00 1 1 900 no 1?000 00 t?100 no 1?200 no !?300 00 l?ftOO no 12500 no |2600 00 l?700 00 1 2800 00 I 2900 00 I 3000 00 1 3100 00 1 3200 00 1 3300 no 1 3«00 no 3500 fto *600 fto 3700 oc 1 3800 ft^i 3900 no 1 >i000 no I ft 100 no I /.'VOO 00 1 i300 00 'iftOO no asoo 00 aftOO no >i700 fto | 1«00 no 1900 00 *000 no 1100 00 [S200 "0 S30n 00 SftOO 00 *S00 00 moo 00 S700 00 1 1800 00 S900 oo *O0O no 1 MOO 00 1 *200 oo 1 MOO 00 1 A400 no 1 ftSOO Oy 1 *600 no i *700 00 1 AHOO 00 1 *900 no 1 7000 no i 7100 00 1 7700 00 1 7300 00 1 7 4 00 no 1 7S00 00 1 7*00 00 1 7700 00 1 7800 no 1 7900 119 120 121 122 123 12ft 125 126 127 128 129 1 30 131 132 1 33 13ft 135 136 137 138 1 39 140 141 1ft2 143 144 14b 146 147 148 1 ft9 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 kC l.OA STA sta STA * THTS t TWF NEWT t l t r r 3 ) SUM) Mr xT S£MT F L I fl rt I M fi i ntii STLf 3) SI IT(?) rvcHi.(?) SKIP I 0A Ml KM a^hn 4N Ml ^ AORN STA SI IT(2) TVCHLC21 S^IP L^A STA Ml HKJ A1HN STA I ^A Ml KM AH^N sta SI.ITC?) FVCHLC?1 S"IP L 'U J1KW l_TT<2) Ml i)M AHUM Annpg Ml WNJ I TT(1) Ml KN f coi « XSJ» XT IT x r - 1 f •/ XK| I XKTf M ; | I MO n r PAT A IT AH I M^. , =rnnNs; ■ c n 1 1 N T I UN IS liSrr, H \ >>F|.A> [S USF|I. (Y(T-TAU> ) PANT (If THF PBIfSWAM IS TrtF UuHr.FMIT T A C ALC Ul A T I MNS . TffriPVjl .......•.....*.*********.....*. s ; * s |MJ? b I M 2 ! ! J F I T ! ' i C 1 1 * x'< i I X < T f m | % z y v a 1. 1 * t rrP i r , nt b ■ i m : * S'l"3 l * jF| t : i x< ; < >;tf '« : * jF|. T » * f.M t Ml* Til* rFVAl. ! * » I C N I * » o; a b')M2 1 f. S |M3: « cTWfU Z *C2t t S IM| : » s i m ; « iCl T»» = i)SX 'hi t »ci a C" A I I SMhKHiiT I 'IF Tn FwALUATF F(X»T). s 1 1 h m n 1 1 t ] fjF uRAMt i{ it:'»i'U'»?(T, rri ri = so'mrnn i nf < *i i rn r!NO <2 T r M i : > - S jiUHITlKF T»\|.| Til i;ft k3 xktfm:= xiv ♦ | : F L T x K 1 SIHU'lUT F NL CAM. Tn r.FT "A 3A00 00 23000 00 2«000 MO 1H1 112 10J i rt* STA >*Tf M|» TVLTM(2) »NFWT;* STA H(3)l» L"L(?) .1NC»* LOLO) .TFMP3J* T*LTM(3) #NEWTI* jump uo>* ****♦*****♦*. S«IP »0|l OF HLINGE khTTA CALCULATIONS ROUTInF IF NOT TO BE SAVFD STEP LOOP uNTll FNn i FNn I * REMNNTnG nF F'iNCTI FVALI EVCHL(2l *02» FVCHi ( 31 S03J STL(O) *ni 1 STLfl) *D1? Si 1T(2) sROT EVCHLC2) SICP. S«IP #oit L^A «0)l STA SUMI EFTJI STt(?i JHM L n L(2l tHIH' L n A « I » INTTI CSUPC?) .LEf STA TSUM r»0P(7) .LEF T v L T M ( 2 ) i I M LOA Cnw Lnx TAG mi rn «xs LOX TAf. M| RN *XS Lnx TAf, AnRN «TS STA »TS L n A CON inx tag MIRN *VS L X TAG ML RN «XS L"X TAG AORN *TS STA *T<; LOA CPK' Lnx tat, MLRN »xs Lnx TAG MLRN *XS LPx TAG AORN »TS STA *TS L"L(3) .MOU FINSI LnL(2) 4C3J CSmR(7) .lff Lnk TSU" CANDf?) .\I63 RTL 0(71 A^RN SRI* on evaluation subroutine. * SAVF CAMS * ;» M Fl» CALL HOTF I* * r NO J Tl* (?) t;* TJ* ST* S + J *4 I S + MM; i|"; sr + s* s« » UM| ST* s + l|U| mm; NO»! t Tit (?) I t I * I J oi 1 ; 2i 1 : 3l 4 : 2i M 7) 7FRU SUM RANGF Of RnnTES I.HAnFO 7FRn Ol'T TSliM | OCATjriNS ; ) 7ESO A f-EGlSTEHS ROUTING 40UNn SUR I EFT MAX TO GTT NEGATIVE ROUTES RG«» PAKTI»L SUMS MOST 00 ROUTE OF 43 TNSIEA0 OF -1 ROUTE PARTIAL SUMS TO CORRECT PF ADO TO A REGISTERS 00 7*100 no ?«?00 00 7«300 00 ?4400 no 7A500 no ?«o ?moo oo ?*?oo no 7=300 "0 7 = 400 00 ?>:500 no prison no 7=700 no ?=800 no ?c900 00 2*000 00 p«100 no j*2oo no ?*3oo no ?aioo 00 7*500 00 7*600 00 7*700 00 7*800 00 7*90(1 00 27000 no 77100 00 77200 00 ?7300 ^0 77400 00 77S00 no 77600 00 77700 00 77800 00 77900 00 2*000 no 7«ion no ?«200 00 ?M00 00 ?8400 00 7«500 OO 78600 00 7*700 no ?«8oo no ?«900 00 70000 00 »M00 00 70200 00 ?9300 oo ?oaoo 00 70500 no ?9*oo 00 ?9700 00 7°800 00 ?9900 00 30000 oo 3nioo 241 742 7 '4 3 244 245 246 247 248 2'4V 250 251 252 253 254 255 256 257 258 25V 260 261 262 263 764 265 266 267 26B 269 270 271 772 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 799 300 301 % fno t ROTEi »OTXl VLTMC 3) TA OL(?5 VCH|_( 35 UMLC25 OL(O) "L( 1 ) *LHL(25 KIP f UNCT TP OH fTF "A TTf 1 ) Ox TH TL ^A HM Mp vltm( n MfO) VCHl (?5 kf IP .F I*S| l S'JMI » SH'I i 107 i * 10311 *0?l I 101 1 1 t 1 51 ?l * * I rn i * *..«.»♦..». » o 1 1 V F.VAI MAT 1 0\j S;l WONT I NF . L'lnj.^doTF wrr.Mi 1 plus last RF5HLT 1^ SUM X K T r M I * L.I1H.-F I* 1 Nil! X I » = hjnxi j * 4A t« • R[1TX I % i|rp| f .01* this m*fs tmf valufs sturfo acjmss tmt pf . s in xktf" ANO DI sTUTRnTF « THFM \ M T fl T*F Ml LIIPATIDnS rnilOilld", XK, * I TAIIRI Bfr.iNNfNr, of pojtinf 1 1 Rrruvf4 '.Mines is n^»v rs 1 1 s f: . TTC35 *f77?rT7tmt OL(05 . Ti'MP?,* AN0(05 iCHt vjn nul | AST ?4 'i T TS OF COUNTER T T( 3 5 =OflOOftOOO[5000f)0.5nO')(IOOri I H It OH(05 4C3I * L f 3 5 i T F M P % I 1 "X GOl v. -F. . ANn.F ; * -T ,nn.i i* THhC ;* i r n i * * AMI STORF Jv yKTEM I* RF-FNAII F ANjO HrTliHM *T( PI * » I T Oo }n?00 no jo *00 oo jn«0O 00 josoo 00 jn«,oo "0 jnroo 10 ?nnoo 00 31900 00 11 000 ny >i 100 00 H 700 00 11 300 ou j< aon "0 31500 nq 31*00 00 >' TOO no >1 AOO 00 31 900 oo 39000 00 > •> 1 U on 19900 no 39300 00 19000 oo 19500 00 19600 oo >9700 n ■ »9K00 no . 19R00 no )1000 no 3*100 no f>?00 00 n3oo 00 »->*oo *0 . n5oo 00 (1600 oo nTon 00 . n«on 00 ' n9nn 00 |1000 oo . |«100 no |A900 no ■ |«300 no • ia«on no ■ asoo no • |A6on oo ■ i«rnn no ■ 4800 00 1 49on oo ■ i^ooo no ; ^100 no ■ 5900 no : S300 00 J s«oo no ■ ss,oo no - S600 no \ sroo 00 1 IROO no 3 5900 00 i 6000 no s <100 30? 30j )0« 305 3 .6 307 3013 309 310 311 31? 31 i 31« 319 316 3U 318 319 3?0 321 322 323 32« 32b 326 3?7 3?h 329 3 30 331 332 333 33« 335 336 337 338 339 3«0 3«1 3«2 3«3 3<44 349 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 ^9 APPENDIX C BNF INPUT SPECIFICATIONS FOR THE INTEGRATOR (INPUT TO THE TWS SYSTEM ALONG WITH SEMANTICS TO PRODUCE THE INTEGRATOR -TRANSLATOR. THE SEMANTICS ARE TOO LONG TO BE LISTED HERE) OH Of S /SY'UAy 11STF0 n y snap IN APHIL 10, 19*9 AT ?0 t 191 4 3. 3 /r.nnrs PRINT ALI I FPARSERJ «Y.T.^ I «» I » a l I fSTEP = I I I I I* I 1 = ( t = I t a I ! t a I AHI 1ST, TATf i / s *t »n y ins try »i< *N> TFPM acth f>p II= <* < s I r. n > 1 1 = ! < TUN f T > l I I i < I kj I T > i in t « i ; : ITP < T F I M A L > » : 3 If / - + /- 3 < #TPs c<»r U r 3 <* U .70 > *H <> ) rgN> < rrfi IIH> 1 < C tin F3UG f-k r 3 p u r r ii »14 o> I < /- » f MAkY / n/t I KUM> / t M> ,U * \l> * ?/ «.' I "> #1? o fc, X > ) S K,N> < s I r. n AUMliM «4 > F N , I n r, .TT.TP,CUS,Fi IS J • in/ < I M 1 T > »H / / »?S/ TNT> «10/ OS / »40/ 11/ »Fno #1/ ffuiWlP = < P f A | N 1 1 M > I'll ) *»[<> it, 'j/ eciNTRUi > «ii ; mr// »TFM »'•>/ iniMpHf »!«/ #RUNr,F *37 / <*N> #t4/ / rrniH TNSFHT r <»N> lib/ »FnNMnnE fbf F f K *S« / irn»'M!UP »S?/ »P*TTF*N #60| X > 1 n s < r X k > J /•rA 1 / ?7 > $■■> ■>■? I t ? 1 *? » ; <> ; f < fiP> jjl / <1[0m> »)? I »?oi * / * ru« f ? u I /<.1> |1 9/»Y( ) -1K /*T «17 /fla <>'*>'>) »1fl I PArm»'> *S(|/ . <*•!> #S6 >PnN( vl> * S 4 s / <> I > .: * m > » s > / <> ; P > i » l 1 : p S ml / -i fit, « 9 / «fy" * s I 1 P, M X R F 6 1 N 1 1 M > 1 LVI|M> ! falnu'"*- *"*/ , <«lalnim> ,?/ ft, t RFAL KJM>: i no onion no 0n?00 no 00)00 no 0O400 00 n ft soo no On^uO no 3*700 00 OOAOO 00 30900 n 31000 no 01 100 no 01 ?00 no 01 300 no 01 «00 00 01S00 00 01 iSOO 00 01 ^00 no 01 *00 no 01 900 00 07000 n O'lOO 00 0??00 no 0?300 oo 0?A00 00 O'SOO no O'^OO 00 0?700 oo 0?800 00 0?900 00 o^ooo 00 01100 no 0'?00 00 01300 00 01100 00 OiSOO 00 0>AO0 00 oiroo 1 2 i « 6 7 6 V 10 t 1 \i 13 1* IS 16 \1 Id 19 20 ?1 2? 2i 24 ?S ?Q ?? 26 29 30 31 32 33 34 3b 36 if 51 LIST OF REFERENCES [l] Barnes, G. H., et al., The ILLIAC IV Computer, IEEE Transactions on Computers , C-17, 8 (August, 1968) pp. 746-757- [2] Gear, C. ¥., The Numerical Integration of Ordinary Differential Equations of Various Orders , Argonne National Labora- tory, ANL - 7126, January 1966. [3] McCarthy, T. E. , Solution of Ordinary Differential Equations Related to Metabolic Systems , Department of Computer Science, University of Illinois, July I968, ILLIAC IV Document No. 197. [4] McCracken, Daniel and Dorn, W., Numerical Methods and Fortron Programming , John Wiley and Sons, pp. 317-329, New York 1964. [5] Nordsieck, Arnold, On Numerical Integration of Ordinary Differential Equations , Mathematics of Computation, 16 (77-80), pp. 22-49 ( 1962 ) . [6] Rosen, Saul, Programming Systems and Languages , McGraw-Hill Book Company, New York 19©7 • UNCLASSIFIED Security Clessificstlon DOCUMENT CONTROL DATA -R&D [Security claaalflcallon of tlllm, body ot abatract and Indaalng annotation mtiat ba wlwJ tha ovaralt raport la elaaalllad) originating ACTIVITY (Corporate author) Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 6l801 2*. KPOUT IECURITY CLiisiFICATIOH UNCLASSIFIED 26. 9HOUP 3 REPORT TITLE AN AUTOMATIC INTEGRATOR FOR ORDINARY DIFFERENTIAL EQUATIONS FOR ILLIAC IV 4. DESCRIPTIVE NOTE! ( Typa at r a p ort and Inetualwa dataa) Research Report 5 au thosisi (Flrat noma, mtddla Initial, laat nam*) Thomas Edmund McCarthy «. REPORT DATE June 1, I969 7*. TOTAL NO. OF PACES 57 7b. NO. OF REFt ta. CONTRACT OR GRANT NO. k6 -26 -15 -305 I b. PROJECT NO. US AF 30(602)lklM •a. ORIGINATOR'S REPORT NUMBERIS) DCS Report No. 336 •6. OTHER REPORT NOI3I (Any othar ntmbara that may ba aaalgnad lb I a raporl) 10. DISTRIBUTION STATEMENT Qualified requesters may obtain copies of this report from DCS. II. SUPPLEMENTARY NOTES NONE • 2. SPONSORING MILITARY ACTIVITY Rome Air Development Center Griffiss Air Force Base Rome, New York I3M+O IS. ABSTRACT This report describes the design and implementation of an automatic integrator for systems of ordinary differential equations for use on the ILLIAC IV computer system. This integrator accepts a simple statement of the differential equations and their initial values and produces an ILLIAC IV Assembly Language code which can be used to solve them. The intent is to produce a code which will be efficient on a large parallel computer. )D , f n°o?..1473 UNCLASSIFIED Security Classification UNCLASSIFIED Security Classification KEY WO FID! ordinary differential equations automatic integration parallel computation ILLIAC IV UNCLASSIFIED Security Classification 1 i§3