LIBRARY OF THE 
 
 UNIVERSITY OF ILLINOIS 
 
 AT URBANA-OHAMPAICN 
 
 aop.fc 
 
The person charging this material is re- 
 sponsible for its return to the library from 
 which it was withdrawn on or before the 
 Latest Date stamped below. 
 
 Theft, mutilation, and underlining of books are reasons 
 for disciplinary action and may result in dismissal from 
 the University. 
 To renew call Telephone Center, 333-8400 
 
 UNIVERSITY OF ILLINOIS LIBRARY AT URBANA-CHAMPAIGN 
 
 L161— O-1006 
 
Digitized by the Internet Archive 
 in 2013 
 
 http://archive.org/details/sparsematrixinve443ratl 
 
liN Report No. M*3 
 
 klUZ 
 
 COO-ll»69-Ol80 
 
 SPARSE MATRIX INVERSION 
 
 by 
 
 K. Ratliff 
 
 June 1971 
 
Report No. kk3 
 
 SPAESE MATRIX INVERSION* 
 
 by 
 
 K. Ratliff 
 
 June 1971 
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 6l801 
 
 Supported in part by contract US AEC AT(ll-l)lU69 
 
TABLE OF CONTENTS 
 
 Page 
 
 1. Background 1 
 
 2. Compile Time and Execution Time Processes 3 
 
 3. Pivot Selection Algorithm 5 
 
 4 . The Inversion Process 8 
 
 5. The Symbolic Inversion 10 
 
 6 . Register Optimization 15 
 
 7. Tests 18 
 
 APPENDICES 
 
 I. DIAGRAM OF SUBROUTINE INTERACTION 20 
 
 II. DESCRIPTION OF ARRAYS 22 
 
 III. FLOVJCHARTS OF SPARSE AND COMPIL .27 
 
 IV. CALLS TO COMPIL 34 
 
 V. CODE GENERATED FOR VARIOUS CASES 38 
 
1. Background 
 
 In numerical problems one is often presented with 
 an equation of the type 
 
 Ax = b 
 waere x and b are vectors of length n and A is an n x n matrix. 
 To obtain the value of x - A -1 b is an expensive project in terms 
 of computer time and memory if n is large. Typically this 
 computation is incidental to the main problem ana x may have 
 to be re-evaluated many times due to changes in b and/or A. 
 Therefore tne cost of the computation is not proportional to 
 its importance. 
 
 Additionally the A matrix may be sparse (i.e. having 
 few nonzero entries). Standard matrix inversion routines using 
 Gauss elimination require that the matrix be stored in its 
 entirety. Further the ordinary pivot selection algorithm will 
 in most cases introduce a large number of new nonzero entries 
 in A, which in turn increases the number of arithmetic or 
 needed to solve for x. 
 
 The purpose of this report is to describe a pre ,/. m 
 which saves a significant amount of computer time and space 
 in the inversion of sparse matrices. 
 
There are three factors which make the program more 
 efficient for sparse matrices: 
 
 1. The pivot selection algorithm is designed 
 
 to introduce few new nonzero matrix elements. 
 
 2. The matrix A~ is never actually computed. A 
 record (in subroutine form) is kept of the 
 operations required to produce the resultant 
 vector x. 
 
 3. The A matrix is not stored as a whole. Instead 
 various arrays are constructed which give 
 necessary information about the nonzero elements 
 only. 
 
2 . Compile Tine and Execution Time Processes 
 
 Prior to the inversion process the A matrix is analyzed 
 and its nonzero elements are stored along with information about 
 the elements. (See Appendix II) . In particular the elements 
 of A are classified according to their type; an integer is type 
 one, a constant type two, and a variable type three. 
 
 The procedure used in obtaining the x vector is Gauss 
 elimination-back substitution performed on A and b. This 
 procedure is a series of operations of the form: 
 
 a/d ■»■ a 
 
 a - b*c ■* a 
 The first type is used in the division of row elements by the 
 pivot, while the second is used in row operations. The program, 
 SPARSE, which formulates the sequence of such operations needed 
 to produce x, outputs this information in the form of two 
 subroutines, MATINV and MATMUL. The operations involving only 
 elements of A are placed in MATINV; those involving b are put 
 in MATMUL. Thus when x must be re-evaluated due to changes 
 in the b vector, only MATMUL need be called. When A charjes, 
 both subroutines must be executed. 
 
 Since the elements of A are classified by type, i t 
 is possible for SPARSE to recognize operations exclusively 
 
involving elements which never vary (types one and two) , and 
 to perform these operations at compile time (during the 
 compilation of MAT IN V and MATMUL) . In such a case no code is 
 compiled so that the compiled subroutines contain fewer 
 instructions. This is quite a saving since SPARSE is executed 
 only once, while MAT IN V and MATMUL are executed frequently. 
 
3. Pivot Selection Algorithm 
 
 The pivot selection strategy is an important part 
 of the program, since the choice of different pivots or the 
 same pivots in a different sequence greatly effects the number 
 of arithmetic operations necessary for the inversion. The 
 selection is based on the number of nonzero elements in a row 
 or column in order to avoid introducing new elements during 
 the Gauss elimination. 
 
 The following are some of the pivot selection 
 algorithms which were tested. 
 
 1. Minimum column, minimum row: This strategy 
 compares all columns of A with respect to 
 the number of nonzero elements in the column. 
 The column (s) which is minimum is then 
 searched for the element with the minimum 
 number of nonzero elements in its row. (The 
 nonzero comparisons refer only to that part 
 of A not already involved in pivoting.) 
 
 2. Minimum row, minimum column: This algorithm 
 is the same as the first applied to the 
 transpose of A. 
 
3. Minimum product: Each matrix element is 
 compared according to the product of the 
 nonzero element count of its row and that 
 of its column, 
 rhe basis for judging the efficiency of a method was the number 
 of computer operations required to obtain x. (See Chapter 7.) 
 rhe first two algorithms were found to be about equivalent, 
 rhe minimum product strategy was much more efficient in some 
 rases, but less efficient in the majority of examples. 
 
 The present program uses the first method. In the 
 search for an optimal pivot candidate, two or more elements 
 nay have identical nonzero row and column counts. In such cases 
 the new element, b, is tested to see if it is the integer one. 
 [f so, b is the best element so far. If b is not equal to one, 
 Its type (see Chapter 2) is compared with that of the former 
 sivot candidate, a. If b is of a simpler type than a, it is 
 sreferred as a pivot. If a and b are both type one (integers), 
 i is retained as the best choice. If a and b are both type 
 :wo or tiiree (constants or variables), they are compared by 
 ;ize and the larger (in absolute value) is chosen. In summary 
 lonvarying elements, types one and two, are preferred to those 
 
which change when the A matrix changes, and large elements are 
 preferred to small ones. 
 
 The motivation for choosing elements of types one 
 and two rather than those of type three is that row operations 
 or pivot divisions involving only nonvarying elements can be 
 performeu at compile time as explained in Chapter 2. 
 
 Integers are preferred over constants, since integer 
 arithmetic not involving division is exact. 
 
4. The Inversion Process 
 
 The program consists of a group of routines which 
 are divided into two types; those which are used initially to 
 set up and analyze the A matrix, and those used repeatedly to 
 obtain the x vector. Appendix I gives a diagram of the 
 interaction of these programs. 
 
 The routine SETUP codes information about the A matrix 
 for use by SPARSE. It does this by differentiating a series 
 of expressions E Q to form 
 
 3E. 3E. 
 
 SPARSE performs the bulk of the decision making for the 
 inversion. SPARSE calls COMPIL (an assembly language program) 
 to generate the routines (MATINV and MATMLL) whicli manipulate 
 A and b to obtain x. 
 
 As was mentioned previously, the inverse of A is never 
 calculated explicitly, rather SPARSE records only those 
 operations which directly effect the final value of x. These 
 essential operations are of two types which correspond to f orri ng 
 A~ and finding the product A - £>. The former arithmetic is 
 contained in the routine MATINV and the latter in MATMLL. Hence 
 
if b changes more often than A, the x vector can be obtained 
 frequently by calling MATMUL only. When the A matrix changes, 
 both MATIWV and MATMUL must be called in that order. 
 
 Since all decisions pertaining to pivot selection, 
 etc., are made by SPARSE, the generated routines are very simple 
 and execute rapidly. They are straight-line code using only 
 basic machine instructions (LOAD, STORE, ADD, SUBTRACT, MULTIPLY, 
 DIVIDE, and LOAD COMPLEMENT). 
 
 The COMPIL routine is also simple in that its function 
 is to translate three parameters into one instruction and store 
 that instruction in the appropriate location in memory. The 
 parameters to COMPIL indicate the arithmetic operation, the 
 array location containing the operand, and the subroutine (MATINV 
 or MATMUL) which is effected. (See Appendix IV) 
 
10 
 
 5 . The Synbolic Inversion 
 
 This section will outline in iaore detail the process 
 by which MAT IN V and MATMliL are generated. The SLTLP routine 
 uses eight arrays to store data about the A matrix. These 
 arrays are described in Appendix II. 
 
 Since only nonzero elements are considered, information 
 about the position of these elements must be available. This 
 data, along with pointers to the next elements in the sane row 
 and column, information about the simplicity of the element, 
 and the location of its numerical value are passed tc the SP/iRLSE 
 routine. 
 
 Basically SPARSL analyzes the attributes of the A 
 matrix and tries to determine the best code for MATIdV and 
 MATMLL. It can be divided into three sections: selection of 
 a pivot, Gauss elimination and back substitution. 
 
 The pivot selection process uses the algorithm 
 described in Chapter 3. The matrices used to store information 
 about A contain information about the inversion process itself 
 at different stages of the analysis of the matrix. The set 
 cf columns which has a minimum nonzero count is recorded by 
 means of a chain. The pointers to columns in the chain arc 
 stored in the location formerly used for nonzero column counts. 
 
11 
 
 (This space is no longer needed since all columns in the chain 
 have the sane nonzero count, which is stored in a single 
 location. ) 
 
 When all columns have been compared , those in the 
 minimum chain are searched for the element which is the best 
 pivot according to nonzero row count, type, ana size. After 
 a column has been searched in this fashion the information about 
 its nonzero count is restored if the column is not used for 
 pivoting. The space for the nonzero count on the column which 
 does contain the pivot is set to zero during pivoting ana to 
 minus one after pivoting. 
 
 When the pivot element is chosen, the instructions 
 to divide the elements in the pivot row and the b vector element 
 corresponding to the pivot row are compiled (if the pivot equals 
 one no instructions are compiled) . 7\s each pivot row element 
 is divided, its row number is set negative to indicate that 
 it is in a pivot row. Then the nonzero count in the column 
 of this element is reduced by one. During the divisions the 
 pivot element is neglected (as are elements in pivoted columns) 
 and at the completion of the pivot division it is removed from 
 the row chain. Finally the b vector element is divided by the 
 pivot. 
 
12 
 
 The elements in the pivot column are then checked 
 to determine which ones are to be eliminated (those not in 
 pivoted rows). When an element to be eliminated is founu, its 
 row is searched simultaneously with the pivot row to decide 
 what row operations are necessary. The indicated instructions 
 are compiled to change the elements in the eliminated-element 
 row and the corresponding b entry. (See Appendix V) 
 
 Luring the elimination process the compiled arithmetic 
 is also performed on the variable elements (type three) by 
 SPARSE, since it is necessary to avoid choosing an element as 
 pivot which has value zero or is very small. 
 
 As the pivot column is being searched in the 
 elimination process, the pivot and eliminated elements are 
 removed from the column chain (i.e. the preceding column element 
 is made to point to the element following the one removed) . 
 The place in the I IX array (see Appendix II) formerly occupied 
 by an eliminated element is placed on a chain of free elements. 
 These spaces are then reused when a new matrix element is 
 generated by the elimination process. The pivot element is 
 placed on a backward pivot chain (i.e. the beginning of the 
 chain is the most recently chosen pivot). Pointers to elements 
 in the free element chain are in the fix location normally used 
 
13 
 
 to give the type of the element, while the back pivot chain 
 pointers occupy the location for the column chain pointers. 
 
 When all elements in the pivot column have been checker 
 (and eliminated where appropriate) another pivot is selected 
 and the procedure is repeated. (The last pivot requires little 
 processing, since it is the only nonzero element in its row. 
 Instructions are compiled to simply divide the b entry by the 
 pivot value and store the result in x. ) 
 
 The back pivot chain is used in the back substitution 
 process in the following way. The column of the first pivot 
 on the chain is searched for remaining elements ana as they 
 are found, instructions are compiled to alter the b vector in 
 those entries where the corresponding rows of A have elements 
 in the pivot column. This continues until the pivot is the 
 only element in the column. Whenever an element is the last 
 nonpivot in its row, the b entry for that row is stored in the 
 x vector in the position corresponding to the column number 
 of the pivot in the row. This indicates that no further changes 
 are necessary in the row. 
 
 When a pivot column contains only the pivot element, 
 the pivot is removed from the back pivot chain anu the process 
 is repeated for the next pivot. After all pivots have been 
 
14 
 
 used in the back substitution process, the x vector is in its 
 final form (x = A _1 b) . Flow charts of SPARSE and COMPIL are 
 given in Appendix III for more specific information on the 
 mechanics of the program. 
 
15 
 
 6 . Register Optir.ization 
 
 The COMPIL subroutine generates the machine code for 
 riNV and MATML'L in such a way that all computation is performed 
 in floating point registers and 2, with general registers 
 1 through 15 used as base registers in addressing the arrays 
 PW , XIN, and XOUT. (General register holds the address of 
 the beginning of MATINV or IIATMbL at execution time.) MATINV 
 and IIATMLL are called with the addresses of the arrays as 
 parameters so the subroutine can use relative addresses in their 
 register-to-storage instructions. 
 
 The XIN and XOLT arrays have fixed length, n, and 
 
 are referenced using registers 14 and 15, respectively, as base 
 
 q 
 registers. These are adequate for all n < 2 . The PW array 
 
 on the other hand has variable length depending on the number 
 
 of nonintegers in the A matrix. The limitation on its length 
 
 is imposed by n (i.e., #PW entries < 2 ). Since there are 
 
 only 13 index registers available for PW , at any one time only 
 
 13*2~ locations can be accessed (if PW is double precision, 
 
 this figure is further reduced to 13*2"). 
 
 This restriction on the length of PW is circumvented 
 
 in the following way. Initially the address of PW(1) is in 
 
 register 1 and those of PW(n*2 10 ), n = 1, 12 in registers 2 
 
1G 
 
 through 13. (This is done by taking the PW parameter address 
 as the value of register 1 and adding U09G to this for each 
 successive register.) Thereafter when a PW location not 
 currently accessible is required, the base register useu lease 
 recently is reset to contain the needed base address. 
 
 The recomputation of the base register value is done 
 within the subroutines by instructions also generated by COMPIL, 
 In order to accomplish this, COMPIL keeps a count of the 
 instructions generated and stores the current count in a table 
 in a position corresponding to the base register used in the 
 instruction. The decision of what base address to discard is 
 made by searching this table. 
 
 A second table is needed to indicate which base 
 registers contain various base addresses. For example, if 
 PW(14,000) is needed for an instruction, the entry in this 
 second table corresponding to base address 14 (indicating 
 PW (13312)) is checked. If there is a positive number in that 
 position in the table, it gives the base register to be used 
 for PW(14,000); if there is a negative number, no base register 
 is currently being used for that part of the PW array. In the 
 latter case the last used register is set to the desired base 
 
17 
 
 address and the table entry for its former base address is made 
 negative. 
 
 In the matrix inversion computations performed by 
 the generated suoroutines , floating point register 2 is uscct 
 when possible for the values of elements which are used several 
 times in succession (i.e., the value of the pivot during the 
 division of the pivot row, or the value of the eliminated element 
 during the row operations due to the elimination step) . Floating 
 point register is used for those values which change with 
 each operation (i.e., the pivot row elements to be divided or 
 the elements in the pivot row used in operations of the type 
 a - b*c ■* a) . In this way more economical register-to-register 
 instructions can be used, with register being reloaded for 
 eacn new element used and register 2 remaining fixed for a 
 period of several operations. These decisions are made within 
 SPARSE. 
 
 Elements with special properties also provide the 
 opportunity to cut down on the number of register-to-storage 
 instructions. A comprehensive description of these properties 
 is given in Appendix V. 
 
18 
 
 7. Tests 
 
 Two types of tests have been employed in the 
 development of the program. The first type checks the efficiency 
 of the pivot selection algorithms and the second type determines 
 the accuracy of the inversion. 
 
 The pivot selection algorithm test consists of varying 
 the initial part of the SPARSE program so that different methods 
 of pivot selection are used on the same examples with all 
 procedures following the selection of a pivot remaining fixed. 
 The number of times COHPIL is called (equivalent to the number 
 of instructions generated for the subroutines) is compared for 
 all methods tested. A summary of these tests was given in 
 Chapter 3. 
 
 The accuracy tests refer to the validity of the matrix 
 inversion procedure as well as to the numerical accuracy of 
 the results. A check on the inversion procedure was accomplished 
 by printing all operations performed during the execution of 
 SPARSE and duplicating these by hand on the same matrices. 
 The examples useu in this way had eight as a maximum value of 
 n. 
 
 The test for numerical accuracy (which of course 
 implies a check on the inversion logic) is a program which 
 
19 
 
 calculates an arbitrary b vector and calls SETUP, MATSET , and 
 SPARSE to produce MATINV and MATMUL for some A matrix. The 
 two generated routines are then executed to produce an x vector, 
 Finally the following is calculated 
 
 | |Ax - b] | 
 to measure the accuracy of the output vector, x. 
 
20 
 
 APPENDIX I 
 DIAGRAM OF SUbROUTINE INTERACTION 
 
21 
 
 SPARSE 
 
 SYMBOLICALLY INVERTS 
 A AND PREMULTIPLIES 
 b BY A" 1 (PIVOT 
 SELECTION, GAUSS 
 ELIMINATION , AND 
 BACK SUBSTITUTION) 
 
 SETUP 
 
 GENERATES THE 
 ARRAYS 
 CONTAINING 
 INFO. ABOUT A 
 
 * MATSET 
 
 EVALUATES 
 THOSE ELEMENTS 
 IN A WHICH ARE] 
 VARIABLES 
 
 COMPIL 
 
 SET3 EACH MACHINE 
 INSTRUCTION IN MEMORY 
 IN THE APPROPRIATE 
 SUBROUTINE 
 
 T 
 
 MATINV 
 
 £ 
 
 MATMUL ♦ 
 
 PERPORMS 
 ARITHMETIC 
 RELATED TO 
 FORMING A" 1 
 
 PERFORMS 
 ARITHMETIC 
 RELATED TO 
 FORMING A _1 b 
 
22 
 
 APPENDIX II 
 DESCRIPTION OF ARRAYS 
 
23 
 
 MX is an integer array which is the primary source 
 of information about the elements of A. It is dimensioned 6 
 x K, where K is the upper limit on the number of elements 
 possible in A. (Usually n*n) There are six entries for each 
 element which contain the following data. 
 
 fix (1,1) contains the type of the element, 
 initially 1, 2, or 3, meaning 
 integer, constant, or variable, 
 respectively. If the type is 0, the 
 element has been reduced to zero value 
 and will be removed from A. This space 
 is also used for pointers in a chain of 
 free spaces in MX, where the previous 
 element has been removed. 
 
 fix (2, I) contains the row number of the element. 
 This is set to minus the row number 
 when the row is pivoted. 
 
 MX (3, 1) contains the column number of the element, 
 
24 
 
 MX (4, 1) points to the location of the next row 
 element in MX. 
 
 MX (5,1) points to the location of the next 
 
 column element in MX. If the element 
 is a pivot this points to the next 
 element in the backward pivot chain. 
 
 MX (6, I) contains the numerical value of type 1 
 elements and the location of this 
 value in PW (see below) for type 2 and 
 3 elements. 
 IR1, IC1, IR2, and IC2 are integer arrays dimensioned 1 by n, 
 IR1(I) gives the location in MX of the first 
 element of row I. 
 
 IC1(I) similarly gives the first element of 
 column I. 
 
 IR2(I) contains the number of nonzero elements 
 in row I . 
 
25 
 
 column I. During the pivot selection this 
 slot is a pointer to the next column in a 
 chain of columns with minimum nonzero 
 counts. IC2(I) is set to while the 
 I-th column is being pivoted and to -1 
 after pivoting. 
 PW is a floating point array dimensioned 1 by K where 
 K is the upper limit on the number of constants or variables 
 possible (usually n*n) . It is used to store the numerical 
 values of noninteger elements in A. The values of variables 
 are set into PW by the routine MATSET which computes new values 
 in A. The values of constants are set in PW by the routine, 
 SLTuP. The first twenty locations are filled with nonzero 
 integers between -10 and +10. These locations are referenced 
 wnen computation must be compiled involving an integer. 
 
 IQ and IP are integer arrays, dimensioned 1 by n and 
 1 by n**2, respectively, containing information about the 
 position of elements in PW which are variables. These arrays 
 are used by MATSET to compute the PW entries initially and each 
 time the A matrix cnanges. 
 
 IQ(I) gives the number of elements to be computed 
 in the I-th column of A. 
 
26 
 
 IP (J) gives the row numbers of successive 
 entries in PW which are variables. 
 For example, if IQ(1) = 4 and the first four entries in IP 
 were 6, 8, 11, 15, the nonzero variable elements in column 1 
 of A would appear in rows 6, 8, 11, 15. 
 
 XIN and XOUT are double precision arrays dimensioned 
 1 dv n which correspond to the vectors b and x, repsectively. 
 They are used by MATMUL. 
 
27 
 
 APPENDIX III 
 FLOWCHARTS OF SPARSE AND COMPIL 
 
28 
 
 
 
 
 
 
 
 K 
 
 
 
 
 
 
 
 
 £ t 
 
 
 SE" 
 
 
 
 8*3 J 
 
 it 
 
 
 pa. a 
 
 
 
 = £5 
 
 
29 
 
30 
 
3l£ 
 
 — jg Ul Ul 
 
 5 5 £B 
 1 I 
 
 SSal 
 
 31 
 
 W £? !-• H Ul 
 
 2 5 -i o « 
 
 S- ft. fa" > « 
 
 q2 £ £ SB 
 
 O Z > >• p « 
 
 fal Q < M E-> ft. 
 
 > as to ww 
 
 W ^ K ° « H 
 
 |9! 
 3! 
 
32 
 
 
 o 
 
 
 
 
 - 
 
 
 
 
 
 i 
 
33 
 
 
 3 
 2 
 
 
 B * B S •• 
 
 s 
 
 
 
 
 
 
 
 
 e 
 
 
 
 
 — •< 
 
 
 
 
 ■ 
 
 — » 
 
 
 
 
 
 
 1 
 
 
 
 
 
 
 
 ' 5 ! 
 
 
 ? 
 
 SEg 
 
 
 
 
 
 
 
 
 
 1 
 
 
 
 i\. 
 
 
 
 
 
 - \ 
 
 
 - 
 
 3 3 X 
 
 ■ - 
 
 * 
 
 
 
 
 ■ 
 
 
 
 
 
 
 
34 
 
 APPLiMDIX IV 
 CALLS TO COMPIL 
 
35 
 
 The form of the call to COMPIL is 
 
 Call COMPIL (I, J, K) . 
 The following tables give the values of the three parameters 
 ana the corresponding interpretation. 
 
 Vhe first parameter may assume 25 different values, 
 most inuicating a machine instruction involving floating point 
 registers. 
 
 I If K=0 when 1=0 If K=l when 1=0 
 
 -1 
 
 Compile end 
 
 of subrts 
 
 same 
 
 
 
 Compile beginning 
 
 of 
 
 
 
 subrts 
 
 
 
 same 
 
 1 
 
 LE 2,- 
 
 
 
 LD 2,- 
 
 2 
 
 LE 0,- 
 
 
 
 LD 0,- 
 
 3 
 
 LD 0,- 
 
 
 
 same 
 
 4 
 
 STE 0,- 
 
 
 
 STD 0,- 
 
 5 
 
 STD 0,- 
 
 
 
 same 
 
 6 
 
 LCEP 2,2 
 
 
 
 LCDR 2,2 
 
 7 
 
 LCDR 0,0 
 
 
 
 same 
 
 8 
 
 DER 0,2 
 
 
 
 DDR 0,2 
 
 9 
 
 I LLP. ,2 
 
 
 
 MDR 0,2 
 
 10 
 
 bE 0,- 
 
 
 
 Db 0,- 
 
 11 
 
 ILL ,- 
 
 
 
 MD 0,- 
 
 12 
 
 DD 0,- 
 
 
 
 same 
 
 13 
 
 \j 0,- 
 
 
 
 same 
 
 14 
 
 AE 0,- 
 
 
 
 AD 0,- 
 
 15 
 
 AD 0,- 
 
 
 
 same 
 
 16 
 
 LCER 0,2 
 
 
 
 LCDR , 2 
 
 17 
 
 LER 0,2 
 
 
 
 LDR 0,2 
 
 lb 
 
 LCER 0,0 
 
 
 
 LCDR 0,0 
 
 19 
 
 SD 0,- 
 
 
 
 same 
 
 20 
 
 Lb 2 ,- 
 
 
 
 same 
 
 21 
 
 LCbR 0,2 
 
 
 
 s ame 
 
 22 
 
 SDR 0,2 
 
 
 
 same 
 
 23 
 
 ADR , 2 
 
 
 
 same 
 
36 
 
 The second parameter gives the array index of the 
 element to be used in register-to-indexed-storage instructions. 
 In register-to-register instructions, it is ignored. 
 
 The third parameter may take on four different 
 values which in general indicate the array referenced in r-x 
 instructions and the subroutine (MATMUL or MATINV) which is 
 being added to. The following table gives the values of k and 
 their meaning. 
 
 k When 1^0 dhen 1 = 
 
 array index refers to PW array is 
 
 PW, instruction is for single precision 
 MATINV 
 
 1 array index refers to PW array is 
 
 PW, instruction is for double precision 
 MATMUL 
 
 2 array index refers to 
 
 XIN, instruction is for not to be used 
 r'iATMUL 
 
 3 array index refers to 
 
 XOLT, instruction is for not to oe used 
 MATMUL 
 
37 
 
 To illustrate the manner in which instructions are 
 generated for MATINV and MATMUL, take the following example. 
 
 The pivot element throughout the example is A(6,4); 
 its value is in PW(13) / which indicates it has value +3. There 
 is a non-zero element in A (6 ,7) with value in PW(25). The 
 element being eliminated in the Gauss elimination process is 
 A (8, 4) , stored in PW(30). There is an element 7.(8,7) stored in 
 Pis (32). During the back substitution process there is an 
 element A(2,4), located in PW(35); it is the last non-pivot in 
 row 2, and the pivot for row 2 is in column 5. 
 
 CALLS TO COMPIL 
 
 CODL FOR MATINV 
 
 CODE POD MATMUL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 CALL COM.PIL 
 
 CALL CuMPIL 
 
 CALL COMPIL 
 
 CALL COMPIL 
 
 1,13,0) 
 2,25,0) 
 8,0,0) 
 4,25,0) 
 
 LE 2, PW(13) 
 LE 0, PW(25) 
 DR 0,2 
 STE 0, PW(25) 
 
 More pivot row elements 
 Doing divided will follow 
 this repeatinq the last 
 three instructions 
 
 3,6,2) The b element LD 0, XIN (6) 
 10,13,1) is being DE 0, PW(13) 
 5,6,2) divided STD 0, XIN (6 ) 
 
 1,30,0) LE 2, PW(30) 
 
 6,0,0) LCER 2,2 
 
 2,25,0) LE 0, PW(25) 
 
 9,0,0) MER 0,2 
 
 14,32,0) AE 0, PW(32) 
 
 4,32,0) STE 0, PW(32) 
 
 More elements in the 
 eliminated rcw will be 
 cr. a n a e d foil ow in g this, 
 repeating the last 
 four instructions 
 
 2.30.1) The b element LE 0, PW(30) 
 7,0,1) in the row of LCDR 0,0 
 
 13.6.2) the elim eleto ML 0, XIN (6) 
 15,3,2) is being chang- AD 0, XIN (8) 
 5,8,2) ed due to elim. STD 0, XIN (8) 
 
 2 0,6,2) This is the 
 
 21,0,1) back subst 
 
 11,35,1) proc which 
 
 15,2,2) puts the 
 
 5,4,3) result in x 
 
 LD 2, XIN (6) 
 LCDR 0,2 
 ME 0, PW(35) 
 AD 0, XIN (2) 
 STD 0, XOu"T(5) 
 
38 
 
 APPENDIX V 
 CODE GENERATED FOR VARIOUS CASES 
 
39 
 
 This Appendix shows various cases of 
 a/d ■* a and a - b*c ■* a 
 which occur in SPARSE, and the resulting computation. The 
 following abbreviations are used: 
 PV: Pivot element 
 P: Pivot row element (^ PV) 
 XINi: Entry in b vector corresponding to row 
 of i element 
 XOUTj_: Entry in x corresponding to column of pivot 
 in i element row 
 T#: Type of element, Tl = Type one, etc. 
 E: Eliminated element 
 X: Eliminated row element (5* E) 
 
 These are snown in the figure below. 
 
 ARRAY A: 
 
 — F-6 
 
 Pivot 
 
 ROW 
 
 Pivot Column 
 
40 
 
 Operations of the Type a/d ■♦ d (if PV = 1, no operations required) 
 
 p?" = Tl, PV = Tl or T2 
 
 NO CODE COMPILED 
 form P/PV -*- P (if new P = T2 get new space in PW] 
 
 | P = T2, PV = Tl or T2 j 
 
 form P/PV 
 
 P = T3 or PV = T3 
 
 NO CODE COMPILED 
 
 MATINV CODE 
 
 Compile *LE 2, PV (compiled only once for each pivot) 
 LE 0, P 
 DER 0,2 
 STE 0,P 
 
 When a is in the b. vector 
 
 MATMUL CODE 
 
 compile LD 0, XIN p 
 DE 0, PV 
 **STD 0, XIN pv 
 
 * If only one non-pivot in pivot row, compile LE 0, P 
 
 DE 0, PV 
 STE 0, P 
 
41 
 
 II. Operations of the Type a - b*c -> a 
 P = Tl, E = Tl, X = Tl 
 
 NO CODE COMPILED 
 form X - E*P ■* x (if new X is too large make X = T2) 
 
 HO CODE COMPILED 
 
 P = Tl or T2, E = Tl or T2, X = Tl or T2 
 
 form X - E*P ■* X (change X to T2 if it was Tl, get 
 
 space for it in PW array) 
 
 P = Tl or T2, E = Tl or T2, X = T3 
 
 I1ATIMV CODE 
 
 form y = -P*E 
 
 compile LE 0, y 
 
 AE 0, X <if X ± 0) 
 STE 0, X 
 
 P = T3, E = ±1, X = Tl, T2 or T3 | 
 
 MATINV CODE 
 
 compile LE , P 
 
 LCER 0, (if E = 1) 
 AE 0, X (if X i- 0) 
 STE 0, X 
 
 NOTE: In the following two cases the instructions 
 
 LE 2, i! 
 
 LCER 2,2 (omitted if E = Tl, since complement already formed 
 
 are compiled on the first row operation and are not repeated for 
 
 subsequent operations for this E. Alternate code is given in 
 
 footnotes if there is only one operation for this E. If X = Tl, 
 
 it is changed to T2 (floating point) for computation and space in 
 
 PW is reserved for the new value. If X = T2 space for the new X 
 
 is also reserved. If E = Tl (not +1) it is put in PW complemented. 
 
P = ■fi / e = T3, X = El, T2 or T3 
 
 42 
 
 MATINV CODE 
 
 ♦compile: LER 0, 2 (P = 1) or LCER 0, 2 (P = -1) 
 AE f X (if X/ 0) 
 STE 0, X 
 
 P = T3, E jj ±1, X = Tl, T2 or T3 
 
 MATINV CODE 
 
 ** compile: LE 0, P 
 MER 0, 2 
 
 i\E 0, X (if X ^ 0) 
 STE 0, X 
 
 * If only one row operation compile LE , E 
 
 LCER 0, (if P = +1) 
 AE 0, X (if X jt o) 
 STE 0, X 
 
 ** If only one row operation compile LE , P 
 
 LCER 0, (if E ? Tl) 
 
 ME 0, E 
 
 AE 0, X (if X/ 0) 
 
 STE 0, X 
 
43 
 
 When a is in b_ vector 
 
 MATMUL CODE 
 
 if E / +1 compile LE 0, E 
 
 LCDR 0, (if E ^ Tl) 
 MD 0, XINp 
 AD 0, XIN E 
 STD 0, XIN E 
 
 if E = +1 compile LD 0, XlUg. 
 
 SD (if E = 1) or AD (if E ■ -1) 0, XIN p 
 
 STD 0, XIN E 
 
 The back substitution process operations are of the form 
 
 a _ b*C -*■ a ALL MATMUL CODE 
 
 Wnether a value is in XIN or XOUT depends on if the x 
 vector entry for the pivot row has reached its final value. 
 For each pivot initially compile 
 LD 2, XINpv or XODTpy- 
 
 E = +1 
 
 E = -1 
 
 E ? ±1 
 
 LD 0, XIN E LD 0, XIN E 
 
 SDR 0, 2 ADR 0, 2 
 
 STD 0, XIN E or STD 0, XIN £ or 
 XOUTp XOUTV 
 
 LCDR 0, 2 
 
 MD 0, E 
 
 AD 0, XIN E 
 
 STD 0, XIN or XOUT E 
 
 E <ww* E 
 
 NOTE: If a pivot is the only element in its row and the x entry 
 
 for its row has not been placed in XOUT by the above process 
 
 compile LD 0, XIi\ T pv 
 
 STD 0, XOUT pv 
 
FocmAEC-427 
 
 16 68) 
 
 AECM 3201 
 
 U.S. ATOMIC ENERGY COMMISSION 
 
 UNIVERSITY-TYPE CONTRACTOR'S RECOMMENDATION FOR 
 
 DISPOSITION OF SCIENTIF C AND TECHNICAL DOCUMENT 
 
 ( See Instructions on Reverie Side ) 
 
 1 AEC REPORT NO. 
 
 000-lU69-0l80 
 
 2. TITLE 
 
 SPARSE MATRIX INVERSION 
 
 3. TYPE OF DOCUMENT (Check one): 
 
 E a. Scientific and technical report 
 _J b. Conference paper not to be published in a journal: 
 
 Title of conference 
 
 Date of conference 
 
 Exact location of conference 
 
 Sponsoring organization 
 
 □ c. Other (Specify) 
 
 4 RECOMMENDED ANNOUNCEMENT AND DISTRIBUTION (Check one): 
 
 Iyi a. AEC's normal announcement and distribution procedures may be followed. 
 
 ~2 b. Make available only within AEC and to AEC contractors and other U.S. Government agencies and their contractors. 
 I I c. Make no announcement or distribution 
 
 5. REASON FOR RECOMMENDED RESTRICTIONS: 
 
 6. SUBMITTED BY: NAME AND POSITION (Please print or type) 
 
 C. W. Gear, Professor 
 
 and Principal Investigator 
 
 Organization 
 
 Department of Computer Science 
 University of Illinois 
 Urbana, Illinois 6l801 
 
 Signature 
 
 ^jKhJ&iktf**- 
 
 Date 
 
 June 1971 
 
 FOR AEC USE ONLY 
 
 . AEC CONTRACT ADMINISTRATOR'S COMMENTS, IF ANY, ON ABOVE ANNOUNCEMENT AND DISTRIBUTION 
 RECOMMENDATION: 
 
 . PATENT CLEARANCE: 
 
 a. AEC patent clearance has been granted by responsible AEC patent group. 
 
 1 I b. Report has been sent to responsible AEC patent group for clearance. 
 I 1 c. Patent clearance not required. 
 
< 
 
 %