Report No. 294

ILLIAC IV QUARTERLY PROGRESS REPORT
July, August and September, 1968

Contract No. US AF 30(602)4144
ILLIAC IV Doc. No. 206

Department of Computer Science
University of Illinois
Urbana, Illinois 61801

November 1, 1968

This work was supported in part by the Department of Computer Science, University of Illinois, Urbana, Illinois, and in part by the Advanced Research Projects Agency as administered by the Rome Air Development Center, under Contract No. US AF 30(602)4144.

TABLE OF CONTENTS

1. Report Summary
2. Hardware
   2.1 Diagnostics
       2.1.1 PE Logic Simulator
             2.1.1.1 Generation of PE Logic Simulator
             2.1.1.2 Level Assignment and Loop Detection
             2.1.1.3 Application to Logic Debugging
       2.1.2 Generation of PE Diagnostic Programs
             2.1.2.1 Path Tests
             2.1.2.2 Combinational Tests
   2.2 Design Automation
3. Software
   3.1 Translator Writing System and Language Development
       3.1.1 Introduction
       3.1.2 Syntax Preprocessor
       3.1.3 Parser
       3.1.4 Twinkle
       3.1.5 TWST/TBNF
       3.1.6 TWS Semantics - ISL Translator
   3.2 Tranquil
   3.3 Glypnir
   3.4 SQUASH
   3.5 System K
       3.5.1 Introduction
       3.5.2 Assembler
       3.5.3 Simulator
       3.5.4 Loader
       3.5.5 OSK (ILLIAC IV Operating System) Development
       3.5.6 Interim OSK Features on the B5500
   3.6 CAT
       3.6.1 General Compendium
       3.6.2 General Optimization
4. Applications
   4.1 Mathematical Applications
       4.1.1 Partial Differential Equations
       4.1.2 Ordinary Differential Equations
       4.1.3 Alternating Direction Iteration Scheme
       4.1.4 Hydrodynamic Codes
       4.1.5 Boltzmann's Equation
       4.1.6 Matrices
       4.1.7 Eigenvalues
       4.1.8 Root Finding
       4.1.9 Special Functions Subroutine Library
       4.1.10 Long Codes
   4.2 Linear Programming
   4.3 Radar Processing Applications
   4.4 ILLIAC IV Education
REFERENCES

1. REPORT SUMMARY

The ILLIAC IV Advisory Committee Meeting was held at Burroughs Corporation, Paoli, Pennsylvania, on July 17 and 18. Personnel from Rome Air Development Center (RADC), Advanced Research Projects Agency (ARPA), Burroughs, Texas Instruments (TI), and the University of Illinois attended the meeting. Also present were the members of the ILLIAC IV Advisory Board and of the Advisory Committee to the National Academy of Sciences on the NIKE-X System. Technical presentations were made on the system design, the programming activity, the diagnostic system, and the application of ILLIAC IV with phased array radar. Laboratory demonstrations of the breadboard PE and of the thin film memory system were made. The consensus was that the meetings were excellent and presented the attendees with a detailed report of the ILLIAC IV Project.

The B5500 installation continues to operate well. The use of a dedicated machine has had tremendous impact on the University's effort to develop software.
The software effort would have been hindered if the machine were not available on a dedicated basis. Additional magnetic tape drives and disk modules have been installed. Moreover, the predicted load from the design automation and diagnostic effort has resulted in the University ordering an additional processor and I/O equipment.

The data link between Burroughs and the University has been installed, and the Mohawk data terminals provide tape-to-tape transmission. Design Automation data which is to be run on the B5500 has been transmitted to the University by the data link.

The major diagnostic effort is the checkout of the breadboard PE. Test programs to run on the PE exerciser have been provided by the University. The checkout has been going slower than anticipated, and efforts have been initiated to speed up the checkout of the functional logic.

In software, several areas showed progress. In the ISL translator, its speed was improved and several program sophistications were made. Changes were made in the syntax preprocessor to increase its efficiency, and the complete documentation of the syntax preprocessor was begun. The CAT language project was divided into four areas to increase its progress and development. The implementation of Glypnir Version I was, with the exception of procedures, completed. A Version I user's manual is in preparation. Progress was also shown in the syntax and semantic description of SQUASH, the debugging aid to ALGOL.

The ILLIAC IV education consisted of organizing and presenting a course of instruction for ILLIAC IV personnel. The subjects covered in this course were: ALGOL, the B5500, the ILLIAC IV assembler, Tranquil, and ILLIAC IV applications.

Mr. Richard Stokes of Burroughs has been replaced as Deputy Project Manager by Mr. Walter Fresch. The University has approved the replacement.
The major problems are debugging of the PE breadboard and obtaining artwork for the PE and CU printed circuit boards. As mentioned previously, the University will cooperate with Burroughs in the PE debugging by use of the PE logic simulator. The artwork production is being followed very closely, and weekly reports are being received from Burroughs. The University's B5500 facility is available for the Design Automation activity from 12:00 midnight to 8:00 A.M. nightly, and additional time is provided on weekends. The availability of the computer time and quick turn-around will help the DA effort. The University has requested Burroughs to use the University's computer to reduce the cost of running the DA programs.

The funds from ARPA will carry the project to mid-March.

2. HARDWARE

2.1 Diagnostics

2.1.1 PE Logic Simulator

2.1.1.1 Generation of PE Logic Simulator

The PE Logic Simulator (or the Test Simulator Program) was improved and completed this quarter. The simulator takes two to ten seconds of B5500 processor time for one clock of the PE.

The simulator body is, basically, a set of ALGOL procedure statements whose identifiers correspond to package types, while the actual parameters correspond to the signals incident to the package. There is one procedure statement for each package, and the order of the statements is defined by the level assignment of the packages. Because there are several looped packages, it is impossible to follow the logic in one sequence; therefore, there is a program loop for each set of looped packages which evaluates the logic equations until all outputs of the packages are stabilized. If the outputs of the looped packages have not stabilized after thirty repetitions, a message is printed to indicate a possible race condition.

A few problems were caused by the use of a compiler language as a simulator medium.
Since the number of words in a program segment cannot exceed 1023, some redundant words such as BEGIN, FORMAT, and END were inserted to divide the segment. About four minutes of B5500 processor time was used to generate the simulator body, which is in a file containing approximately 4000 card images.

An editing program combines the generated simulator body with the procedure declarations and the input/output processing program. The procedure declarations describe the logic of all the package types in ALGOL. The formal parameters of the procedure correspond to the sixteen or eighty pins of the package.

The input to the simulator may contain specifications of a micro sequence and/or data to be put into the PE, and special symbols to control the content of the output from the simulator. The content of every register and the signal values of some combinational circuits, as well as an array showing the state of all the inter-package signals for the convenience of logic debugging, can be printed out by the use of control symbols.

2.1.1.2 Level Assignment and Loop Detection

Before the generation of the simulator body, a level had to be assigned to each package. This was done by referring to the arc list, which is a reduced form of the wire list. Since the level is assigned to a package not from the logical equations but from the arc list, some packages constitute a loop; therefore, to assign a meaningful level to all packages, the steps listed below should be followed.

I. Assign the level to the packages in the ascending order and in the descending order. Extract the packages whose levels are not defined; these packages may constitute loops.

II. Distinguish each loop from the set of extracted packages. Looped packages are reduced to one pseudo-package.

III. Assign the level to the set of pseudo-packages, and after assigning the level, expand each pseudo-package to the original packages.
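The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's ALGOL programs: the function name `assign_levels`, the `(source, destination)` arc format, and the sample packages are hypothetical, and loops are found with a standard strongly-connected-components pass (Kosaraju's method) rather than the ascending/descending extraction of step I.

```python
from collections import defaultdict

def assign_levels(packages, arcs):
    """Return ({package: level}, [looped package groups]) for a directed arc list."""
    succ, pred = defaultdict(list), defaultdict(list)
    for src, dst in arcs:
        succ[src].append(dst)
        pred[dst].append(src)

    # Pass 1: depth-first search on the forward arcs, recording finish order.
    order, seen = [], set()
    for root in packages:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((n for n in it if n not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))

    # Pass 2: sweep the reversed arcs in decreasing finish order. Each sweep
    # collects one pseudo-package (a loop, or a single unlooped package), and
    # the pseudo-packages come out in topological order.
    comp, roots = {}, []
    for root in reversed(order):
        if root in comp:
            continue
        comp[root] = root
        roots.append(root)
        stack = [root]
        while stack:
            node = stack.pop()
            for p in pred[node]:
                if p not in comp:
                    comp[p] = root
                    stack.append(p)

    members = defaultdict(list)
    for p in packages:
        members[comp[p]].append(p)

    # Level of a pseudo-package = 1 + highest level among its predecessors.
    level = {}
    for root in roots:
        feeders = {comp[p] for m in members[root] for p in pred[m]} - {root}
        level[root] = 1 + max((level[c] for c in feeders), default=-1)

    # Expand each pseudo-package back to its original packages.
    levels = {p: level[comp[p]] for p in packages}
    loops = [sorted(members[r]) for r in roots if len(members[r]) > 1]
    return levels, loops

# Hypothetical five-package example in which B and C form a loop.
levels, loops = assign_levels(
    ["A", "B", "C", "D", "E"],
    [("A", "B"), ("B", "C"), ("C", "B"), ("C", "D"), ("A", "D"), ("D", "E")],
)
```

In this sketch every package of a loop receives the same level, which matches the idea of evaluating each set of looped packages together until its outputs stabilize.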
All packages are converted to compact symbolic names within these programs to increase the speed of level assignment. There are several programs for this conversion.

The simulator can treat up to 2000 arcs and 500 packages, with up to 1000 arcs involved in loops. There are about 1800 arcs and about 450 packages (including some dummy packages) in the PE. The number of arcs within the looped packages is 156 and the number of packages is 482. The processor time for these programs is four minutes for the first step, thirty minutes for loop detection, and four minutes for the last step.

2.1.1.3 Application to Logic Debugging

The simulator has been used to debug the logic design and the wiring list of the PE. The basic transmit and arithmetic instructions were tested on the simulator, and Burroughs was informed of the detected errors.

2.1.2 Generation of PE Diagnostic Programs

2.1.2.1 Path Tests

The last two subprograms of the Test Ordering Program (TOP) were written and debugged. The algorithm used in the ordering of test cases is based on the evaluation of a weight for each test. It tends to prolong the chains of "success" branches compared with those of "failure" branches in the test tree. This algorithm is expected to be more powerful than other methods in locating multiple errors.

The following are the major files generated by the TOP.

I. TOP/NBRTWIG: It contains an integer which indicates the size of the following three files. At present this integer is 831.

II. TOP/PTHNAME: This file specifies the path to be tested or the failure location (at the termination). This file is a one-dimensional array.

III. TOP/SUCBRNH: It is a one-dimensional array. The value tells where to branch if it is non-zero. Zero indicates a termination of the tests.

IV. TOP/NODELBL: This file is also a one-dimensional array, and it indicates the branched-to location. Some tests are labeled.

The path test sequence can be specified by the previous four files.
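How the three arrays might drive a test sequence can be sketched as follows. The report does not give the exact file semantics, so this is one possible reading, and every detail is an assumption: the function `run_path_tests`, the rule that a failed test falls through to the next entry, the use of label 0 for "unlabeled", and the sample table are all hypothetical.

```python
def run_path_tests(pthname, sucbrnh, nodelbl, passes):
    """Walk the test tree; return (terminal entry, list of paths tested).

    Assumed reading: entry i tests path pthname[i]; on success control goes to
    the entry whose label equals sucbrnh[i]; on failure it falls through to
    entry i + 1; an entry with sucbrnh[i] == 0 terminates the sequence and its
    pthname names the failure location.
    """
    label_to_index = {lbl: i for i, lbl in enumerate(nodelbl) if lbl != 0}
    i, tested = 0, []
    while sucbrnh[i] != 0:
        tested.append(pthname[i])
        if passes(pthname[i]):
            i = label_to_index[sucbrnh[i]]   # "success" branch
        else:
            i += 1                           # "failure" branch (assumed)
    return pthname[i], tested

# Hypothetical five-entry table (the report states the real size is 831).
PTHNAME = ["P1", "FAIL@P1", "P2", "FAIL@P2", "NO FAULT"]
SUCBRNH = [10, 0, 20, 0, 0]
NODELBL = [0, 0, 10, 0, 20]

# Pretend the hardware passes every path test except P2.
verdict, tested = run_path_tests(PTHNAME, SUCBRNH, NODELBL, lambda p: p != "P2")
```

Under these assumptions a long chain of successes stays on the labeled branch, which is consistent with the stated tendency of the ordering algorithm to prolong "success" chains.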
Some additional information about failure locations (e.g., equivalent failure location and set of equivalent multiple failure locations) should be referred to when the path test sequence is used. An additional program is being written to combine these files into one file having the format of the PE Exerciser Test Assembly language.

The output of the Path Test Generator has been reviewed. A few errors in the input and in the program were removed.

2.1.2.2 Combinational Tests

Some final additions and modifications were made to the Combinational Test Generator (CTG) during this quarter. The expected response procedure for BSW was added, and the procedure for CSA was modified slightly to perform the expected response calculations for MSG and MDG tests. All expected response procedures are now operating, and few additional modifications are anticipated. The documentation for CTG was also completed during this quarter.

Several programs were written which translate the output of CTG (in PEX assembly language) into a form suitable for input to the PE simulator. An old version of the PEX Input Generator (PIG) was rewritten to facilitate its use with the combinational test record. This program translates control signal names into a fixed format for the PEX Test Assembly Program (PEXTAP). Another program written during this quarter, the Assembled Code Translator (ACT), is used to convert the machine code output of PEXTAP into the form required by the simulator. With these three programs (PIG, PEXTAP, and ACT), any program written in the PEX assembly language can be run on the PE simulator.

Running sample combinational test programs provides a means for checking the programs themselves, for checking the expected response calculations in CTG, and for possibly detecting design errors in hardware. A large sample of the CPA and ADA tests was run on the simulator.
These runs indicated that the CPA tests, the CPA expected response calculations in CTG, the simulator, and the PE were all functioning properly. In the case of ADA, the expected response calculation in CTG and the polarity of one signal in the PE were found to be in error. With these errors corrected, the address adder tests, CTG, simulator, and PE were also in agreement for ADA. Presently, a sample of the Barrel Switch tests is being run. It is hoped that future runs of these programs will aid Burroughs in its debugging efforts and will serve to eliminate any errors in the combinational test programs and the expected response calculations.

2.2 Design Automation

Final specifications for the Delay Check Program were completed in the first part of this quarter. The program was written and tested on small test boards here at the University. Final debugging and testing are held up until the Post-Processor Program is operational. The work to make this program operational will continue into the next quarter.

Arrangements which will enable Burroughs DAS production runs to be done on the B5500 here at the University are nearly complete. A small software package has been written to convert program data files into tapes for the Mohawk Data Set and to convert them back. This operation should increase the use of the B5500 here at the University and give the Burroughs group added machine capability.

Work has begun on a preprocessor program for an ILLIAC IV Design Automation System. This program will input the data to the program, convert it into a suitable data file, and check for user syntax errors. The program will also prepare useful reports for aiding a designer in verifying his program input, logic diagrams, and equations.

3. SOFTWARE

3.1 Translator Writing System and Language Development

3.1.1 Introduction

Progress was shown in many areas during this quarter.
Some of the progress involved changes which increased the efficiency of programs and the development of a new language. Improvements were made in the parser, and there was also progress in converting the parser instruction table to an ALGOL program. The ISL translator was improved during this quarter, and Pass II of the Tranquil compiler was begun. The following paragraphs discuss these and other areas of progress.

3.1.2 Syntax Preprocessor

The implementation of minor changes in the syntax preprocessor continued. The purpose of these changes is to increase the preprocessor's efficiency. A line-by-line rewriting of the code, which is also taking place, will further increase efficiency.

Complete documentation of the syntax preprocessor has begun. The initial emphasis of this documentation is a detailed description of the algorithm used in developing the syntax preprocessor.

3.1.3 Parser

A much more efficient version of the parser was implemented and debugged. All the indirect addressing was replaced by direct transfer addresses. The parser instructions were expanded to six bits, allowing new, more efficient parser instructions to be generated and allowing the use of stream procedures for fetching or testing parser instructions and operands. The new parser has run from 100 to 750 cards per minute on test programs of various complexity.

The new parser, which implements more meaningful error and monitoring messages, has an expanded error recovery scheme which actually corrects errors in certain situations. By having many tests and sorts performed in the syntax preprocessor, the testing done at parse time is kept to a minimum.

Work is nearly finished on a program that will convert the parser instruction table to an ALGOL program. This should add another significant increase to the speed of the parser, since most procedure calls and array accesses will be eliminated by this approach.

3.1.4 Twinkle

Another area of effort was the description and implementation of Twinkle, a syntax description language. It will not only combine all the features of present syntax languages used by the ILLIAC IV Translator Writing Systems but will also contain additional features. A TWS-generated recognizer for this language will become the syntax scanner for the syntax preprocessor.

3.1.5 TWST/TBNF

A new version of TWST which translates from TBNF (translatable BNF, i.e., extended BNF) to Burroughs code has been completed and seems to be reliable. It offers a two- to three-fold speed increase over the table-driven version. A precedence notation compatible with the scheme has been planned but not yet implemented. Tapes and programming manuals for TWST/TBNF are available.

[Figure: schematic diagram of TBNF]

3.1.6 TWS Semantics - ISL Translator

During this quarter, a high priority was given to improvements in the brute-force ISL translator in order to make the task of the high-level language groups easier. The speed improvement effort was successfully completed. Several program sophistications were introduced in the translator which resulted in a speed improvement by a factor of six. The present brute-force translator can now operate, on the average, at 1200 cards per minute of processor time. Also completed was the addition of control cards to the translator.

Work is now continuing in two areas:

1) The TWS-ISL translator is being completed, and pass-n facilities are being added to both translators.

2) A documentation effort is being started whose goal is to produce, as soon as possible, an ISL user's manual containing examples of semantic descriptions of programming languages.

3.2 Tranquil

The Tranquil work for this quarter involved three areas. One of the work areas concerned storage allocation in Tranquil. In the instances that all data fit in PE memory, the compiler accepts them and allocates the spaces for them.
The array type data is cut into blocks of size 256 x 256. Operations and I/O between the disk and the memory are done in terms of these blocks.

The compile-time allocation of space in the CU was another effort area. Routines were written for the compile-time allocation of space in the CU, and further work was done on the use of storage schemes for sets. The CU allocation routines involve a priority system and a compile run-time stack and include CAR and LDB usage. The set schemes are linked with their declarations, and the amount of information given or found determines whether dynamic allocation needs to be incorporated.

The third area of Tranquil effort involved the beginning of the major programming effort for Pass II. The overall structure of Pass II was laid out and programmed. Partially completed were programs to analyze assignment statements for their meaning and for compiling. The algorithm for determining non-dynamic variables was implemented.

3.3 Glypnir

Implementation of Glypnir Version I was, with the exception of procedures, completed during the third quarter of 1968. Procedures will be implemented during the fourth quarter. The code generated by the compiler has been debugged syntactically on the current version of the assembler, and complete debugging will begin as soon as the new ILLIAC IV simulator becomes available. A Version I user's manual is in preparation, and copies of the rough draft can be obtained.

3.4 SQUASH

The syntax and semantic description of SQUASH, the debugging aid to ALGOL, has progressed. The syntax is virtually complete and has been successfully accepted by SYNPROF, the TWS syntax processor. However, minor modifications in syntax may be necessary to facilitate the writing of semantics. With the semantics considered, coding has begun.

3.5 System K

3.5.1 Introduction

The main activities of the SYSK group this quarter concerned many areas.
These areas were the ILLIAC IV assembler, simulator and loader, design of the B6500 operating system, and development of OSK prototype features.

3.5.2 Assembler

A new version of the assembler was written to serve as the foundation of the forthcoming macro assembler. Much time was spent in making this assembler as fast as possible. The assembler is substantially complete. It emits pseudo-orders for the loader; it accepts the latest ILLIAC IV order code; it creates a cross-reference table; and it has extensive file manipulation facilities to aid the programmer in maintaining large programs.

Extensions are now being added for TMU (the Test Maintenance Unit on the ILLIAC IV) commands. This assembler with the TMU command mode will be sufficient for the B5500 ILLIAC IV operating system at Paoli, Pennsylvania.

3.5.3 Simulator

The development of an efficient, single-quadrant ILLIAC IV simulator was slowed because all the operation code assignments and the instruction semantics were changed. But the simulator coding is now substantially complete. A timing simulator was written and will be integrated into the simulation package.

Simulator maintenance will be handled by a new individual so that the person presently involved is free to work full time on the loader. The transition will occur on the first of October.

3.5.4 Loader

The basic functions of the ILLIAC IV loader have been defined. In particular, the loader-related code information (address relocatability, storage assignment, external references, etc.) is now fixed, and its fields and values in the object code file assigned. The new assembler emits this loader information, and the new simulator includes a loader to interpret this information.

3.5.5 OSK (ILLIAC IV Operating System) Development

There was considerable discussion of the alternative means of handling IOC interrupts on the B6500 and of the structure of the B6500 control programs.
A write-up summarizing the decisions is in preparation. During the next quarter, a prototype version will be coded to work with the new simulator.

3.5.6 Interim OSK Features on the B5500

Progress is being made on integrating ILLIAC IV languages with the B5500 system. The deck needed for a simple assemble and execute run is listed below.

    ?USER = LP
    ?COMPILE LP/TEST WITH ASK LIBRARY
    ?DATA CARD
    ASK assembly program
    ?EXECUTE LP/TEST
    ?END

Work is continuing on a simulator scheduler for the B5500. The goal of this project is to call the simulator into execution whenever the B5500 is not busy on foreground tasks. No operator will be required because the simulator initiation will be automatic. Also, no user will be able to destroy the queue of simulations, since the scheduler will survive system hang-ups, halt loads, and any disaster that does not destroy the disk (so far no user has ever destroyed the disk). Simulations will be scheduled using a prototype user services subsystem. The scheduler will work with any other long-running job that is suitably halt-load proofed.

3.6 CAT

3.6.1 General Compendium

The CAT language project has been divided into four distinct efforts whose results will fit into the ILLIAC IV system in various places. This is a result of meetings that were held here this summer with groups of users.
The four effort areas are as follows: 1) general I/O routines which will add to Tranquil some kind of disk read and write statements using symbolic file names; 2) a descriptive geometry language which will allow the user to specify his problem space in a general notation rather than forcing him to tediously define every index set; 3) a study of storage allocation schemes for both fast memory and disk with a view toward implementing general techniques within a compiler (a possible result may be automatic I/O which makes the ILLIAC IV disk appear to be an extension of memory); and 4) a study of the optimization of disk I/O by linear programming techniques which reorder the arithmetic statements within a code to allow efficient use of the data set.

In addition to the usefulness of each of these studies as an individual part of the ILLIAC IV system, it is hoped that they can be integrated into a software package for the general user. In this line, a study is also being made of the non-mathematical parts of several large programs; since such "data processing" seems to account for as much as eighty percent of the total run time in many computing centers, an efficient and general system on ILLIAC IV would evidently be highly useful.

3.6.2 General Optimization

Work in the fourth area mentioned above began this quarter. This work involves an attempt to minimize disk latency for non-core-contained ILLIAC IV problems. In so doing, permutation of source code, as well as dynamic relocation of data on the disk, is being considered.

At present, the method of approach involves quantizing the time axis over which any computation will take place. Binary variables are then introduced which have values corresponding to each of the actions taken (e.g., compute, input, overlay, etc.) during the particular time interval. It is hoped that, with only a few reasonable simplifying assumptions, constraints may be placed upon these binary variables.
The difficulty is keeping these constraints linear.

4. APPLICATIONS

4.1 Mathematical Applications

4.1.1 Partial Differential Equations

During this quarter, a research assistant spent six weeks at Los Alamos Scientific Laboratory, Los Alamos, New Mexico. A Tranquil code was written for the Particle in Cell (PIC) method used in hydromagnetic theory. While at Los Alamos, much machine time was used for collecting statistics on various storage schemes for particle in cell methods as applied to a parallel computer. The plan is to fully document this Tranquil code and the collected statistics during the next quarter.

4.1.2 Ordinary Differential Equations

Work was completed on a system of equations for metabolic systems, and this system was summarized in ILLIAC IV Document 197 [1]. The assembly code for this problem is being kept up to date as newer versions of the assembler become available.

Further study is being done on a system of equations related to the physics of a muonic atom. These equations are:

$$\frac{dF_j^K(r)}{dr} = \frac{K_j F_j^K(r)}{r} + \bigl(B + U_0(r) + W_j\bigr)\, G_j^K(r) + U_2(r) \sum_i A_{ji}\, G_i^K(r)$$

$$\frac{dG_j^K(r)}{dr} = -\frac{K_j G_j^K(r)}{r} + \bigl(2 - B - U_0(r) - W_j\bigr)\, F_j^K(r) + U_2(r) \sum_i A_{ji}\, F_i^K(r)$$

The object is to find an exact value for the eigenvalue B. Given an approximation to B and suitable boundary conditions, integration takes place from the inner and outer boundaries to a central "fitting radius". The match of the results at this radius determines the new value of B for the next iteration of this process.

For the large problems in this area, the number of equations is larger than the number of PE's available, and a method must be devised for sharing the computation of the equations over these PE's. Assembly codes are being written to test different but similar strategies and to do the actual computation.
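The fitting-radius iteration described above can be illustrated on a much simpler eigenvalue problem. The sketch below is a toy under stated assumptions, not the muonic-atom code: it uses the single equation u'' + B u = 0 on [0, pi] with u = 0 at both ends (whose lowest eigenvalue is B = 1), classical Runge-Kutta integration, and bisection on the mismatch of logarithmic derivatives as the rule for the "new value of B". The names `rk4`, `mismatch`, and `solve_eigenvalue`, the fitting radius 1.2, and the starting bracket are all hypothetical choices.

```python
import math

def rk4(f, r0, y0, r1, steps):
    """Integrate y' = f(r, y) from r0 to r1 (r1 may be less than r0) with RK4."""
    h = (r1 - r0) / steps
    r, y = r0, list(y0)
    for _ in range(steps):
        k1 = f(r, y)
        k2 = f(r + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = f(r + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = f(r + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        r += h
    return y

def mismatch(B, r_in=0.0, r_out=math.pi, r_fit=1.2, steps=400):
    """Difference of u'/u at the fitting radius between the two integrations."""
    rhs = lambda r, y: [y[1], -B * y[0]]          # u' = v, v' = -B u
    u_a, du_a = rk4(rhs, r_in, [0.0, 1.0], r_fit, steps)    # outward from inner boundary
    u_b, du_b = rk4(rhs, r_out, [0.0, -1.0], r_fit, steps)  # inward from outer boundary
    return du_a / u_a - du_b / u_b

def solve_eigenvalue(lo, hi, tol=1e-10):
    """Bisect on B until the two solutions join smoothly at the fitting radius."""
    f_lo = mismatch(lo)
    if f_lo * mismatch(hi) > 0:
        raise ValueError("bracket does not straddle a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = mismatch(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

B = solve_eigenvalue(0.5, 1.5)   # converges toward the exact eigenvalue B = 1
```

For a coupled system such as the one in the text, the scalar mismatch would presumably be replaced by a matching condition on all components, and the two integrations (or the individual equations) are natural units of work to spread across PE's.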
4.1.3 Alternating Direction Iteration Scheme

In these three months, an alternating direction iteration scheme in ASK was written, compiled, and time simulated. The time simulation consists of running the program for one and two iteration cycles on the Sankin Time Simulator and separating the I/O time (approximately 1.10 x 10^-3 sec) from the actual iteration time (approximately 2.3 x 10^[illegible] sec).

The code is now in the final stages of simulation, of comparison with the same code in ALGOL, and of documentation. The solution matrix has a small oscillation in it; otherwise the simulation is complete.

Two important results will come from the comparison of this code with an identical ALGOL code. One, it will check our results in computation, and two, it will give a comparison of the time simulation to conventional machines. The written report will follow the completion of these two remaining problems.

Also, a successive over-relaxation iteration scheme in Tranquil has been compiled. This scheme is presently being rechecked for computational errors.

4.1.4 Hydrodynamic Codes

The numerical solution of the Eulerian hydrodynamic equations, in two-dimensional Cartesian coordinates, was coded in Tranquil using checkerboard storage. An additional code which traces the path of "mass-less" particles through an Eulerian grid was also completed during this quarter. Both codes have been syntactically debugged for the Tranquil compiler. An ILLIAC IV Document which will describe the methods used and the codes will appear soon.

4.1.5 Boltzmann's Equation

The ILLIAC IV Document 200 [2] describing the implementation of a Monte Carlo method for evaluating the Boltzmann collision integral on ILLIAC IV was completed during this quarter. The method that is being used for the evaluation of the Boltzmann collision integral was developed by Arnold Nordsieck and Bruce L. Hicks of the University of Illinois [3].
The implementation of this method for ILLIAC IV requires the generation of random numbers in each PE which are also random with respect to the random numbers being generated by the other PE's. Different random number generators for ILLIAC IV which will satisfy this requirement are being considered. A Tranquil code will be written for this method of evaluating the Boltzmann equation.

4.1.6 Matrices

An algorithm to find the solution matrix X for a symmetric matrix A by using the square-root method (Cholesky [4]) was coded in Tranquil. The order of the matrix A was n = 64, but it can without any difficulty be extended to any magnitude n < 256.

From theory it is known that a symmetric matrix can be written in the form $A = SS^T$. Using the upper triangle of A, the transpose $S^T$ is then given by

$$s^T_{11} = \sqrt{a_{11}}, \qquad s^T_{1j} = \frac{a_{1j}}{s^T_{11}}, \quad j > 1.$$

Further,

$$s^T_{ii} = \sqrt{a_{ii} - \sum_{l=1}^{i-1} \bigl(s^T_{li}\bigr)^2}, \quad i > 1,$$

$$s^T_{ij} = \frac{a_{ij} - \sum_{l=1}^{i-1} s^T_{li}\, s^T_{lj}}{s^T_{ii}}, \quad i > 1 \text{ and } i < j,$$

$$s^T_{ij} = 0, \quad i > j.$$

With $A = SS^T$ and $Ax = b$, the result of substitution is $SS^T x = b$. From this, $S^T x = y$ and $Sy = b$. To find y from the lower triangle S, the following formulas are used:

$$y_1 = \frac{b_1}{s_{11}}, \qquad y_i = \frac{b_i - \sum_{l=1}^{i-1} s_{il}\, y_l}{s_{ii}}, \quad i > 1.$$

Having y, x is found by using the upper triangular matrix $S^T$ and back substituting:

$$x_n = \frac{y_n}{s^T_{nn}}, \qquad x_k = \frac{y_k - \sum_{l=k+1}^{n} s^T_{kl}\, x_l}{s^T_{kk}}, \quad k < n.$$

The problem of having a negative argument in determining the roots in $S^T$ has been taken care of by using the marker "-" (minus), with the rule in mind that "-" x "-" = "-". Then $s^T_{ii}$ becomes:

$$s^T_{ii} = -\sqrt{\Bigl|\, a_{ii} - \sum_{l=1}^{i-1} \bigl(s^T_{li}\bigr)^2 \Bigr|}.$$

Considering the formulas above, a "straight" storage scheme for all matrices involved is suggested. To illustrate this point, let us look at an 8 x 8 matrix $S^T$. Assuming the elements $s^T_{ik}$ with k = i, i+1, ..., 8 for i ≤ 3 have been calculated, then the following is the case:

$$\begin{array}{cccccccc}
s_{11} & s_{12} & s_{13} & s_{14} & s_{15} & s_{16} & s_{17} & s_{18} \\
       & s_{22} & s_{23} & s_{24} & s_{25} & s_{26} & s_{27} & s_{28} \\
       &        & s_{33} & s_{34} & s_{35} & s_{36} & s_{37} & s_{38}
\end{array}$$

With $s_{44}$ already known, the calculation of $s_{45}$ is wanted:

$$s_{45} = \frac{a_{45} - \bigl(s_{14}\, s_{15} + s_{24}\, s_{25} + s_{34}\, s_{35}\bigr)}{s_{44}}.$$

The elements above $s_{44}$ and $s_{45}$ are involved in this calculation, or stated more generally: to calculate $s_{ik}$, k > i, only the elements of columns i and k which are above $s_{ii}$ and $s_{ik}$ enter the calculation. Furthermore, since the program has been written in such a way that the elements in row i, which are to be determined, are under SIM control, the efficiency of a "straight" storage scheme becomes even more apparent.

4.1.7 Eigenvalues

During this quarter, work was done in the investigation of eigenvalue problems. A code in assembly language has been written for Jacobi's Method for finding eigenvalues. The algorithm used in the code is a modification of the classical Jacobi Method. The matrix is divided into 2 x 2 sub-matrices along the diagonal, and successive orthogonal transformations are used to eliminate the off-diagonal elements of each sub-matrix. With each iteration, n elements of an n x n matrix are eliminated. The code is presently being debugged, and various storage schemes are being considered. The code is being timed both on the SANKIN Simulating System and on the SIM/TIME Simulator. A timing estimate will also be derived from the IBM 360 for purposes of comparison.

4.1.8 Root Finding

During this quarter, work continued on adapting Lehmer's algorithm to a parallel machine. Lehmer's algorithm determines whether or not a polynomial has a root in a given circle. Using Gerschgorin circles, the center and the radius of a circle in which all of the roots of a polynomial are found can be obtained. The problem is how to efficiently cover this area with 256 circles so that the percent of duplication is minimal.
The first approach was to divide the area obtained by Gerschgorin circles into 256 identical squares. This area was then covered by circumscribing circles around the squares. The next time, this area was covered by inscribing circles, and the missed area was covered with other circles. In both of these cases, the overlap was fifty-seven percent.

To reduce this overlap figure, the second approach was to divide the area into 256 identical hexagons. In this case, only circumscribing circles were tried, and the duplication was twenty-one percent. Future work will be directed toward reducing the overlap figure further.

4.1.9 Special Functions Subroutine Library

This quarter, work has continued on ILLIAC IV's special functions subroutine library. All previously coded subroutines have been recoded in the latest version of the ASK assembly language. The question of subroutine linkage was also considered in the rewriting of these codes. It was decided that the function argument should be in the A register upon entering the subroutine, and that the function value would be left in the A register. It was also decided that the subroutine itself would save all of the other registers, except the B register, and restore these at the end of the routine. The functions will be evaluated only in the enabled PE's, and all other PE's will be left alone.

New 64-bit codes have been written for natural logarithm and arctangent. Also, a new 64-bit code has been developed for square root which completely eliminates division.

I. Natural logarithm: Let y be the number of which it is desired to find the natural logarithm. Then

$$y = 2^i m, \qquad \tfrac{1}{2} \le m < 1,$$

where i is the floating point number exponent and m is the floating point number mantissa. The approximation of $\log_2 m$ is made by a polynomial for $\tfrac{1}{2} \le m < 1$. Then the natural logarithm is evaluated in accordance with the relation:

$$\ln y = (i + \log_2 m) \cdot \ln 2$$

II. Arctangent: The approximation of arctan(x) is made by a polynomial for $x \in [0, \tan \pi/8]$.
For $x \in (\tan \pi/8, \tan 3\pi/8]$:

$$\arctan(x) = \arctan\left(\frac{x-1}{x+1}\right) + \frac{\pi}{4}$$

For $x \in (\tan 3\pi/8, \infty)$:

$$\arctan(x) = \frac{\pi}{2} - \arctan\left(\frac{1}{x}\right)$$

The above three cases all use the same approximating polynomial.

III. Square Root: Let A be the number of which it is desired to find the square root. The iterative scheme used is:

$$x_{n+1} = \frac{x_n}{2}\left(3 - A x_n^2\right)$$

to approximate $1/\sqrt{A}$. Thus

$$\sqrt{A} = A \lim_{n \to \infty} x_n$$

Starting points for this iteration are approximated by second degree polynomials.

4.1.10 Long Codes

As a first step in this task, preliminary studies were done of the Theory of Stability of Motion and of Canonical Transformations. The behavior of the solutions for equations of motion having the form $\dot{x} = Ax$ was investigated. (It was assumed that A was a constant coefficient matrix.) Also, the criteria of stability of the above-mentioned autonomous systems were reviewed. Finally, a detailed illustrative example was worked out for showing the effect of errors in the observation of the initial vector x on the solution of the equations of motion for a given autonomous mechanical system.

4.2 Linear Programming

During this quarter, specification of the mathematical procedures for the first linear programming system, LPS, has largely been completed. Procedures for handling vector bounds, ranges of the right hand side, and basic parametric programming have been drafted. Also during the quarter, the group has examined applications of linear programming to problems of hardware design, along lines suggested by Dr. Masao Kato during his visit to the University.

The solution of a linear programming problem in standard form is handled by a modification of the revised simplex method, product form. The original matrix is partitioned to produce a degree of parallelism suitable for ILLIAC IV. Rows are assigned to specific PE's in a manner designed to distribute calculations evenly among the PE's.
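The report does not state the rule by which rows are assigned to PE's. As a purely illustrative sketch, a cyclic (round-robin) assignment is one simple way to keep the number of rows per PE within one of each other. The function name `assign_rows_cyclic` and the cyclic rule itself are assumptions made for this example, not the LPS design; only the PE count of 256 is taken from the report.

```python
def assign_rows_cyclic(n_rows: int, n_pes: int = 256) -> list[list[int]]:
    """Hypothetical row-to-PE map: row r goes to PE number r mod n_pes."""
    owners: list[list[int]] = [[] for _ in range(n_pes)]
    for row in range(n_rows):
        owners[row % n_pes].append(row)
    return owners


# Example: 1000 constraint rows spread over 256 PE's.
owners = assign_rows_cyclic(1000)
loads = [len(rows) for rows in owners]
print(min(loads), max(loads))
```

With any such cyclic rule the heaviest and lightest PE differ by at most one row, so row-wise operations keep nearly all enabled PE's busy.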
The matrix is skewed within and across quadrants to facilitate vector updating and reduced cost calculations. Multiple pricing has been adopted to minimize the number of iterations and disk accesses, thereby reducing overall calculation time and increasing accuracy. Tests have been initiated to evaluate the efficiency of the algorithm with regard to PE utilization and storage allocation.

It is necessary at certain points in the solution procedure to recalculate the updated inverse in product form, minimizing the number of non-zero elements obtained while maintaining a high degree of accuracy. Work continues in the development of such a reinversion procedure from the several techniques which have been examined.

4.3 Radar Processing Applications

Some of the efforts during these three months have been involved in the conversion of the Kalman Filter tracking programs to 32-bit floating point mode. The development of a Tranquil version of the Kalman Filter and the analysis of the NISIM (NIKE Simulation) programs from Bell Telephone Labs were other areas involved in this quarter's efforts.

The assembly language version of the Kalman Filter tracking algorithm was completely recoded so that it would run in 32-bit floating point mode instead of 64-bit fixed point. These programs will handle the correlation of newly received data to locate the track table, start a new table for new targets, and move tables if required because a target changes drastically in azimuth and elevation. Also, they will perform the Bayesian estimation, will integrate the dynamic equations of motion over time for prediction, and will perform the coordinate conversions. These programs total approximately ^500 instructions and have been run on the timing simulator; however, they have not been run on the ILLIAC IV execution simulator because, at present, the simulator is not complete.
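The two main steps named above, prediction by integrating the equations of motion and Bayesian estimation from a new measurement, can be illustrated with a very small sketch. The report does not give the state vector, dynamics, or noise model of the tracking programs, so everything specific here is an assumption: a one-dimensional constant-velocity target, a position-only measurement, and the names `predict`, `update`, and `track` are all hypothetical.

```python
def predict(state, cov, dt, q):
    """Prediction step: integrate constant-velocity motion over dt.
    state = (position, velocity); cov = (P00, P01, P11), symmetric."""
    p, v = state
    p00, p01, p11 = cov
    new_state = (p + v * dt, v)
    new_cov = (p00 + 2.0 * dt * p01 + dt * dt * p11,
               p01 + dt * p11,
               p11 + q)  # q: assumed process noise on velocity
    return new_state, new_cov


def update(state, cov, z, r):
    """Bayesian estimation step for a position measurement z
    with measurement noise variance r."""
    p, v = state
    p00, p01, p11 = cov
    s = p00 + r                  # innovation variance
    k0, k1 = p00 / s, p01 / s    # gains for position and velocity
    resid = z - p
    new_state = (p + k0 * resid, v + k1 * resid)
    new_cov = ((1.0 - k0) * p00, (1.0 - k0) * p01, p11 - k1 * p01)
    return new_state, new_cov


def track(measurements, dt=1.0, q=0.0, r=1.0):
    """Run predict/update over a sequence of position measurements."""
    state, cov = (0.0, 0.0), (100.0, 0.0, 100.0)  # vague initial guess
    for z in measurements:
        state, cov = predict(state, cov, dt, q)
        state, cov = update(state, cov, z, r)
    return state


# Simulated target moving at 2 units per step, measured without error.
final_position, final_velocity = track([2.0 * t for t in range(1, 21)])
print(final_position, final_velocity)
```

In this toy case the estimated velocity settles very close to the true value of 2 after twenty measurements. A real tracking program would carry a larger state in radar coordinates together with the track-table handling described above.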
The main sections of the Kalman Filter programs, namely the Bayesian estimation and the integration of the dynamic equation of motion, have been programmed in ALGOL for the B5500 computer. These B5500 programs will be used to generate simulated tracking numbers for debugging the ILLIAC IV Kalman programs. The debugging of these programs will begin as soon as the execution simulator is operating. Also, the Tranquil version of the Kalman Filter tracking program will be debugged and run when the compiler and execution simulator are finished.

Comparing the assembly language version and the Tranquil version of the program will give an evaluation of how efficiently these BMD problems can be programmed in the higher level language (Tranquil) for ILLIAC IV. This can test both the flexibility of debugging Tranquil programs over assembly language programs and the operating time penalty which is paid for using the higher level language. A report will be generated in the near future which will fully describe the Kalman Filter for ILLIAC IV.

Over this time period, there has been a slowdown in progress due to changes in personnel on this effort, and this slowdown will continue for the next time period. The two graduate students working on this effort have left the University. Presently, the effort has one full-time professional, one part-time hourly employee, and one or two new graduate students who are just starting and require time to become familiar with ILLIAC IV.

A better understanding of the NISIM programs is being obtained in order to evaluate how this type of BMD program would fit on ILLIAC IV. More analysis and a better understanding of them are still required before a complete sample BMD type problem can be set up for the ILLIAC IV computer.

4.4 ILLIAC IV Education

The ILLIAC IV Education for July and August consisted of organizing and presenting a course of instruction for ILLIAC IV personnel.
Many newcomers were initially confused and swamped by the enormity of the ILLIAC IV project, but toward the end of the semester most of them were well oriented and producing useful work. The course was basically divided into three sections, and a brief discussion of each follows.

I. ALGOL and the B5500 -- The basic concepts of ALGOL and the special B5500 ALGOL features were presented. A detailed description of how to get programs running on the various B5500 hardware devices proved extremely valuable.

II. ILLIAC IV Assembler -- The Assembler was explained, examples were given of the various types of orders, and case studies were examined. Several competent assembly language programmers emerged by the end of the summer.

III. Tranquil and ILLIAC IV Applications -- The Tranquil lectures had to be mostly theoretical because no facility for testing Tranquil code existed at the time of the course. General discussion of applications gave an insight into the scope and use of ILLIAC IV.

Work continued on compiling a useful programming manual for the Assembler. Parts of this have been completed and distributed to those requiring the information.

REFERENCES

[1] McCarthy, Thomas. Solution of Ordinary Differential Equations Related to Metabolic Systems. ILLIAC IV Document Number 197, (July 11, 1968).

[2] Winje, G. L. Implementation of the "Monte Carlo Evaluation of the Boltzmann Collision Integral" on ILLIAC IV. ILLIAC IV Document Number 200, (August 1, 1968).

[3] Nordsieck, Arnold, and Hicks, Bruce L. Monte Carlo Evaluation of the Boltzmann Collision Integral. Coordinated Science Laboratory Report R-307, (July, 1966).

[4] Faddeeva, V. N. Computational Methods of Linear Algebra. New York: Dover Publications, Inc., 1959. P. 81 ff.
UNCLASSIFIED
Security Classification

DOCUMENT CONTROL DATA - R&D
(Security classification of title, body of abstract and indexing annotation must be entered when the overall report is classified)

ORIGINATING ACTIVITY (Corporate author): Department of Computer Science, University of Illinois, Urbana, Illinois 61801
REPORT SECURITY CLASSIFICATION: UNCLASSIFIED
GROUP:
REPORT TITLE: ILLIAC IV QUARTERLY PROGRESS REPORT, July, August and September 1968
DESCRIPTIVE NOTES (Type of report and inclusive dates): Progress Report
AUTHOR(S) (First name, middle initial, last name):
REPORT DATE: November 1, 1968
TOTAL NO. OF PAGES: 30
NO. OF REFS:
CONTRACT OR GRANT NO.: ^6-26-15-305
PROJECT NO.: USAF 30(602)lmM
ORIGINATOR'S REPORT NUMBER(S):
OTHER REPORT NO(S) (Any other numbers that may be assigned this report):
DISTRIBUTION STATEMENT: Qualified requesters may obtain copies of this report from DCS.
SUPPLEMENTARY NOTES: NONE
SPONSORING MILITARY ACTIVITY: Rome Air Development Center, Griffiss Air Force Base, Rome, New York 13440
ABSTRACT: See the report summary within the Report itself.

FORM 1473 (1 NOV 65)

KEY WORDS: Twinkle, twst/tbnf