UIUCDCS-R-75-765
COO-2383-0026

Block Envelope Solution of Finite Difference Systems*

by

Andrew H. Sherman

November 1975

DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
URBANA, ILLINOIS 61801

* This research was performed while the author was a student at Yale University. Supported in part by the Office of Naval Research under contract N00014-67-A-0097-0016 and the Atomic Energy Commission under contract US AEC AT(11-1) 2383.

ABSTRACT

In this paper we describe an envelope-oriented block Gaussian elimination method for solving linear systems arising from the use of five-point finite difference approximations on n x n square grids. We show that the method requires O(n^{5/2}) storage locations and O(n^{7/2}) arithmetic operations. Asymptotically this is a significant improvement on the costs of ordinary band and envelope methods, but we find that for moderate values of n (say, n ≤ 50), the band and envelope methods are superior.

1. Introduction

Given a positive integer n, let D be a regular n x n square grid with an unknown x_i associated with each of the n^2 grid points. Letting N = n^2, we consider the solution of an N x N system of linear equations

    Ax = b,                                                            (1.1)

where a_{ij} ≠ 0 if and only if the nodes corresponding to x_i and x_j are connected by a line in the grid. Systems of this form arise in the use of five-point finite difference approximations in the numerical solution of second-order elliptic partial differential equations over square regions. In order to simplify our discussion somewhat, we will assume that A is symmetric and positive definite, although similar results hold in certain other cases (e.g., when A is nonsymmetric and diagonally dominant).

A possible method for solving (1.1) is the Choleski decomposition, i.e., first factoring A into the product LL^T, where L is lower triangular, and then solving Ly = b followed by L^T x = y. We consider here a variant of this method in which we do not explicitly obtain L. Either algorithm may be stably applied to the permuted system

    (PAP^T)(Px) = (Pb)                                                 (1.2)

instead of (1.1), for any N x N permutation matrix P [17], and this fact allows us to vary the ordering of the grid points (i.e., the correspondence between the unknowns x_i and the n^2 grid points of D) in any way that seems useful. In particular, we will attempt to minimize the amount of storage or work required to solve the resulting linear system (1.2). These two goals are related, but it is usually not possible to attain both simultaneously.
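The following minimal sketch (illustrative only, not part of the original report; the function names five_point_matrix and solve_permuted are introduced here purely for illustration) assembles the five-point matrix for an n x n grid and solves a permuted system (1.2) by a dense Choleski decomposition using NumPy and SciPy. It is meant only to make (1.1) and (1.2) concrete for small n.

```python
# Illustrative sketch only (not from the report): assemble the five-point
# matrix A for an n x n grid and solve the permuted system (1.2) with a
# dense Choleski factorization.  Suitable only for small n.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def five_point_matrix(n):
    """Standard five-point operator on an n x n grid (N = n^2 unknowns)."""
    N = n * n
    A = np.zeros((N, N))
    for i in range(n):
        for j in range(n):
            k = i * n + j
            A[k, k] = 4.0
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    A[k, ii * n + jj] = -1.0
    return A

def solve_permuted(A, b, p):
    """Solve (P A P^T)(P x) = P b for x, where p is a permutation of 0..N-1."""
    PAPt = A[np.ix_(p, p)]                 # P A P^T
    c, low = cho_factor(PAPt)              # P A P^T = L L^T
    Px = cho_solve((c, low), b[p])         # forward and back substitution
    x = np.empty_like(Px)
    x[p] = Px                              # undo the permutation
    return x

n = 8
A = five_point_matrix(n)
b = np.ones(n * n)
p = np.random.permutation(n * n)           # any ordering of the grid points
x = solve_permuted(A, b, p)
print(np.linalg.norm(A @ x - b))           # residual is near machine precision
```

Any ordering p leads to the same solution x; what the ordering changes is the amount of fill in L, and hence the storage and work, which is the subject of the rest of the paper.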
If L is to be computed explicitly, then there are several different algorithms which may be used to factor A. Primarily, these differ in the degree to which they take advantage of the known zero structures of A and L. Asymptotically, the most efficient in terms of both storage and work are the general sparse matrix methods, which store and operate only on the nonzero entries of A and L. With an appropriate ordering of the grid points (e.g., the nested dissection ordering [1]), these require only O(n^2 log n) storage locations and O(n^3) arithmetic operations [1,8]. However, the overhead, i.e., the number of extra storage locations and nonnumeric operations required to avoid storing and operating on zeroes in A and L, may be quite high for general sparse methods. Furthermore, they may not perform very well in paged storage environments, since they tend to access matrix entries which are physically stored far apart. General sparse matrix methods have been used primarily for large problems, although it has been found that they may offer savings even when n is small [9].

In contrast, we can use standard band [11] or envelope [7,14] algorithms. These are easy to implement and use and have low overhead requirements, but they require O(n^3) storage locations and O(n^4) arithmetic operations asymptotically [3]. Hence these methods are mainly useful when n is so small that the overhead, rather than the amount of storage or the number of operations, is critical.

Between these two extremes is a whole class of hybrid block methods [5,6,13] which, though fairly complex to implement, offer a means of achieving the relatively low overhead of band and envelope methods without their large storage and work requirements. Generally, these methods divide A into a number of independent blocks and obtain the factorization of A by operating on the blocks and/or their factorizations. Storage references tend to be localized to the entries of the individual blocks, so block methods seem far better suited to paged storage systems than are the other direct methods. Moreover, they may prove useful for solving systems so large that auxiliary storage is required to store A or L.

In [5], George describes a block method for the solution of (1.1) over a nine-point finite element grid. He divides A into blocks corresponding to independent vertical regions of the grid, uses a band solver to factor the blocks, and combines the factored blocks so as to obtain an implicit form of L. This method might be termed a block envelope method with band Choleski decomposition used in each independent block, and George shows that it requires O(n^{5/2}) storage locations and O(n^{7/2}) multiplicative operations (i.e., multiplications and divisions). The use of envelope Choleski, instead of band Choleski, would yield modest reductions in the leading coefficients, but not the orders, of the storage and work requirements of the method.

George's method could be applied directly to our model problem, giving the same storage and work requirements as in [5], but there is an analog of his method which is substantially better. Instead of vertical separators in the grid, diagonal separators are used to divide the grid into independent regions, and this change leads to approximate asymptotic savings of 40% in storage and 80% in work.

In this paper we present the details of the diagonal block method and evaluate its usefulness through a theoretical and experimental analysis. Despite the apparently substantial asymptotic savings over simpler methods, we have not found that the use of either our method or George's is particularly worthwhile. Apart from some practical difficulties in programming the methods, it appears that they suffer from a common malady among hybrid methods in that there is no range of problems for which they are the best methods available. For n ≤ 50, both methods require more work than the more standard band or envelope methods. And for larger values of n, the actual amount of work is critical, so that the higher overhead of general sparse methods does not matter very much. It should be noted that these conclusions are based on the assumption that the work, not the storage, is of primary importance. If this is not the case, then the use of either our method or George's may be quite fruitful, although the recently developed minimal storage Gaussian elimination algorithms may be even better, since they require only O(n^2) storage locations [3,14].

2. Description of the Method

For some positive integer a << n, choose a - 1 lines of grid points, or separators, parallel to the line y = x, in such a way that the semi-perimeter of the grid is divided into roughly equal segments (see Figure 2.1).

Figure 2.1: n x n grid divided by separators for a = 4.

These separators divide the grid into a distinct regions which are independent in the sense that if x_i and x_j lie in different regions, then a_{ij} = 0 in (1.1). Two considerations lead to the choice of diagonal, rather than vertical, separators. First, the sizes of the diagonal separators decrease near the corners of the grid, so that the average separator size is approximately n/2 instead of n. Since a large part of the work is associated with the separators, this reduction in size has a substantial effect on the total work. Second, diagonal separators in a rectangular five-point grid lead to separator blocks which are diagonal matrices.

Once the grid has been divided into independent regions by the diagonal separators, the regions and separators are numbered in the order indicated by Figure 2.1. The nodes within each region are sequentially ordered along diagonals parallel to the line y = -x. We call this method of numbering the nodes the anti-diagonal ordering scheme; it is an analog of the so-called diagonal ordering scheme which is known to be quite efficient for rectangular five-point grids [2,7]. The nodes in each separator are ordered sequentially from the lower left to the upper right in the grid (see Figure 2.2).

Figure 2.2: Grid ordering for n = 9, a = 4.
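To illustrate the partition and the block numbering used below (the particular separator positions in this sketch are an assumption made only for illustration and need not match Figure 2.1), the following labels each node of an n x n grid with its block number, the regions being blocks 1, 2, 4, 6, ... and the separators blocks 3, 5, 7, ...:

```python
# Illustrative sketch only: one plausible way to mark which diagonal band
# each grid node falls in.  The cut positions below are an assumption for
# illustration; the report only requires that the separators be diagonals
# (parallel to y = x) dividing the semi-perimeter into roughly equal parts.
import numpy as np

def block_of_node(i, j, cuts):
    """Block number of node (i, j), given separator diagonals i - j = c."""
    d = i - j
    if d in cuts:                       # on a separator: blocks 3, 5, 7, ...
        return 2 * (cuts.index(d) + 1) + 1
    k = sum(d > c for c in cuts) + 1    # region number k = 1, ..., a
    return 1 if k == 1 else 2 * (k - 1) # regions are blocks 1, 2, 4, 6, ...

n, a = 9, 4
# One plausible choice of a - 1 separator diagonals, roughly equally spaced.
cuts = [round(-(n - 1) + 2 * (n - 1) * k / a) for k in range(1, a)]
labels = np.array([[block_of_node(i, j, cuts) for j in range(n)] for i in range(n)])
print(labels)
```

No two nodes on the same separator diagonal are neighbors in the five-point stencil, which is why the separator blocks of A are diagonal matrices, and any path of stencil moves between two regions must pass through a separator node, which is why the regions are independent.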
For the example in Figures 2.1 and 2.2, the matrix A has the following block form (zero blocks are left blank; only the lower triangle is shown, the upper triangle being determined by symmetry):

        | A_1                                          |
        |        A_2                                   |
        | C_1    B_2    A_3                            |
    A = |                      A_4                     |            (2.1)
        |        C_2           B_4    A_5              |
        |                                    A_6       |
        |                      C_4           B_6   A_7 |

Here blocks 1, 2, 4, and 6 correspond to the four independent regions, and blocks 3, 5, and 7 correspond to the three separators. Factoring A into the product LL^T, we obtain

        | L_1                                                                      |
        |              L_2                                                         |
        | C_1 L_1^{-T} B_2 L_2^{-T} L_3                                            |
    L = |                           L_4                                            |   (2.2)
        |              C_2 L_2^{-T} F_3  B_4 L_4^{-T}  L_5                         |
        |                                              L_6                         |
        |                                C_4 L_4^{-T}  F_5  B_6 L_6^{-T}  L_7      |

where (following George [5])

    A_i = L_i L_i^T,                                          i = 1, 2, 4, 6,     (2.3a)

    Ã_3 = A_3 - B_2 A_2^{-1} B_2^T - C_1 A_1^{-1} C_1^T = L_3 L_3^T,              (2.3b)

    Ã_i = A_i - B_{i-1} A_{i-1}^{-1} B_{i-1}^T - F_{i-2} F_{i-2}^T
              - C_{i-3} A_{i-3}^{-1} C_{i-3}^T = L_i L_i^T,       i = 5, 7,       (2.3c)

    F_i^T = -L_i^{-1} B_{i-1} A_{i-1}^{-1} C_{i-1}^T,             i = 3, 5.       (2.3d)

For a > 4, the pattern of the last two block rows and columns of A and L simply repeats. Again following George [5], L is not obtained explicitly; only the B's, C's, L's, and F's are stored.
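As an illustration of the factorization (2.3) (a sketch only, with dense NumPy blocks and a hypothetical helper name factor_blocks; the method itself stores the region blocks in envelope form and the B's and C's as sparse matrices), the a = 4 case can be written as follows:

```python
# Illustrative sketch only (not the report's implementation): the block
# factorization (2.3) for the a = 4 example, with dense NumPy blocks.
# A, B, C are dictionaries holding the submatrices named in (2.1).
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def factor_blocks(A, B, C):
    """Return the Choleski factors L_i of the blocks and the fill blocks F_i."""
    L, F = {}, {}
    # (2.3a): the independent region blocks are factored directly.
    for i in (1, 2, 4, 6):
        L[i] = cholesky(A[i], lower=True)
    # Helper: A_i^{-1} M via two triangular solves with L_i (cf. (4.2)).
    def ainv(i, M):
        return solve_triangular(L[i].T, solve_triangular(L[i], M, lower=True))
    # (2.3b): the first separator block.
    A3t = A[3] - B[2] @ ainv(2, B[2].T) - C[1] @ ainv(1, C[1].T)
    L[3] = cholesky(A3t, lower=True)
    # (2.3d) with i = 3, 5 and (2.3c) with i = 5, 7.
    for i in (5, 7):
        Ft = -solve_triangular(L[i - 2], B[i - 3] @ ainv(i - 3, C[i - 3].T),
                               lower=True)
        F[i - 2] = Ft.T
        Ait = (A[i]
               - B[i - 1] @ ainv(i - 1, B[i - 1].T)
               - F[i - 2] @ F[i - 2].T
               - C[i - 3] @ ainv(i - 3, C[i - 3].T))
        L[i] = cholesky(Ait, lower=True)
    return L, F
```

The dictionaries A, B, and C are assumed to hold the submatrices of (2.1), extracted from the reordered five-point matrix; for a > 4 the loop over i = 5, 7 simply continues over the remaining odd-numbered separator blocks.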
Given these matrices, the solution of (1.1), with A given by (2.1), may be obtained as follows (it is assumed that b is partitioned compatibly with the blocks of A):

(i) Solve L_i y_i = b_i, for i = 1, 2, 4, 6.

(ii) Set b̃_3 = b_3 - B_2 L_2^{-T} y_2 - C_1 L_1^{-T} y_1, and solve L_3 y_3 = b̃_3.

(iii) Set b̃_i = b_i - B_{i-1} L_{i-1}^{-T} y_{i-1} - F_{i-2} y_{i-2} - C_{i-3} L_{i-3}^{-T} y_{i-3}, and solve L_i y_i = b̃_i, for i = 5, 7.

(iv) Solve L_7^T x_7 = y_7.

(v) (a) Set ỹ_6 = y_6 - L_6^{-1} B_6^T x_7, and solve L_6^T x_6 = ỹ_6.
    (b) Set ỹ_5 = y_5 - F_5^T x_7, and solve L_5^T x_5 = ỹ_5.

(vi) (a) Set ỹ_4 = y_4 - L_4^{-1} C_4^T x_7 - L_4^{-1} B_4^T x_5, and solve L_4^T x_4 = ỹ_4.
     (b) Set ỹ_3 = y_3 - F_3^T x_5, and solve L_3^T x_3 = ỹ_3.

(vii) (a) Set ỹ_2 = y_2 - L_2^{-1} C_2^T x_5 - L_2^{-1} B_2^T x_3, and solve L_2^T x_2 = ỹ_2.
      (b) Set ỹ_1 = y_1 - L_1^{-1} C_1^T x_3, and solve L_1^T x_1 = ỹ_1.

Within each numbered group of computations, the lettered computations may be performed in any order or simultaneously. The solution scheme extends to larger values of a in the obvious way.
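Continuing the earlier factorization illustration (again with dense blocks and hypothetical names; not the report's implementation), steps (i)-(vii) for the a = 4 example might be written as follows, with L and F the dictionaries returned by factor_blocks and b the partitioned right hand side:

```python
# Illustrative sketch only: forward and back substitution through the
# implicit factorization, following steps (i)-(vii) for the a = 4 example.
import numpy as np
from scipy.linalg import solve_triangular

def solve_blocks(L, F, B, C, b):
    """Return x = {i: x_i} solving (1.1) with A partitioned as in (2.1)."""
    def LTinv(i, v):                       # L_i^{-T} v
        return solve_triangular(L[i].T, v)
    def Linv(i, v):                        # L_i^{-1} v
        return solve_triangular(L[i], v, lower=True)
    y = {}
    # (i)  Independent regions.
    for i in (1, 2, 4, 6):
        y[i] = Linv(i, b[i])
    # (ii) First separator.
    y[3] = Linv(3, b[3] - B[2] @ LTinv(2, y[2]) - C[1] @ LTinv(1, y[1]))
    # (iii) Remaining separators.
    for i in (5, 7):
        y[i] = Linv(i, b[i] - B[i-1] @ LTinv(i-1, y[i-1])
                           - F[i-2] @ y[i-2]
                           - C[i-3] @ LTinv(i-3, y[i-3]))
    x = {}
    # (iv)-(vii): back substitution, from the last separator downward.
    x[7] = LTinv(7, y[7])
    x[6] = LTinv(6, y[6] - Linv(6, B[6].T @ x[7]))
    x[5] = LTinv(5, y[5] - F[5].T @ x[7])
    x[4] = LTinv(4, y[4] - Linv(4, C[4].T @ x[7]) - Linv(4, B[4].T @ x[5]))
    x[3] = LTinv(3, y[3] - F[3].T @ x[5])
    x[2] = LTinv(2, y[2] - Linv(2, C[2].T @ x[5]) - Linv(2, B[2].T @ x[3]))
    x[1] = LTinv(1, y[1] - Linv(1, C[1].T @ x[3]))
    return x
```

Note that only triangular solves with the L_i and multiplications by the stored B's, C's, and F's are needed, which is what makes repeated backsolution with new right hand sides inexpensive.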
3. Storage Requirements

In this section we analyze the storage requirements of the method described in Section 2. Throughout the calculations we assume that a << n.

The independent A_i's are quite similar to matrices derived from the use of a five-point operator on long, thin, rectangular domains. For such matrices it is well known [7,10] that envelope methods are more efficient than band methods. Hence the independent A_i's and the corresponding L_i's are stored and operated on as envelope matrices, so that the storage required is given by

    S_B(a) = n^3/a + O(n^3/a^2).                                       (3.1)

The blocks A_i corresponding to the diagonal separators are diagonal matrices, but the corresponding Ã_i's and L_i's tend to be quite dense. Hence their entire lower triangles are stored, and the storage required is given by

    S_S(a) = (a/6) n^2 + O(na).                                        (3.2)

The B_i's and C_i's are quite sparse, so a standard sparse matrix storage scheme is used for them. The storage required is given by

    S_BC(a) = 2an + O(a).                                              (3.3)

Finally, the F_i's are generally full. Therefore, the storage required for them is given by

    S_F(a) = (a/3) n^2 + O(na).                                        (3.4)

Summing (3.1)-(3.4), the total storage required by the method is

    S_T(a) = n^3/a + (a/2) n^2 + O(n^3/a^2 + na).                      (3.5)

This estimate is approximately minimized for a = a_0 = √(2n), and for n sufficiently large,

    S_T(a_0) = √2 n^{5/2} + O(n^2).                                    (3.6)

By way of comparison, George [5] obtains a minimal storage estimate of √6 n^{5/2} + O(n^2), so that there is a significant saving in using diagonal separators for the five-point operator. For the standard band and envelope methods, the storage requirements are at least (1/2) n^3 + O(n^2) [3], so again there is a significant saving. Although (3.6) is an asymptotic result, Table 3.1 shows that there are large savings in storage even for small values of n. In fact, as we shall see, the storage requirement for the method may be its primary virtue.

     n    a_0   S_T(a_0)    n^3/2
     8     2        267       256
    16     4       1530      2048
    24     6       4177      6912
    32     8       8592     16384
    40     8      14824     32000
    48     8      23392     55296

    Table 3.1: Experimentally determined optimal storage requirements.
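As a quick numerical check of the leading terms of (3.5) (an illustration only; the constants below are the leading-order ones given above, not exact storage counts), one can tabulate S_T(a) over integer a and compare the minimizer with a_0 = √(2n) and with the n^3/2 figure quoted for band and envelope storage:

```python
# Illustrative sketch only: evaluate the leading terms of the storage
# estimate (3.5), S_T(a) ~ n^3/a + (a/2) n^2, over integer a and compare
# the numerical minimizer with the asymptotic choice a_0 = sqrt(2n).
import math

def storage_estimate(n, a):
    return n**3 / a + 0.5 * a * n**2

for n in (8, 16, 24, 32, 40, 48):
    best_a = min(range(1, n), key=lambda a: storage_estimate(n, a))
    print(f"n={n:3d}  argmin a={best_a:3d}  sqrt(2n)={math.sqrt(2 * n):5.1f}  "
          f"S_T~{storage_estimate(n, best_a):9.0f}  n^3/2={n**3 // 2:8d}")
```

For n = 48, for example, the minimizer falls near √(2n) ≈ 9.8 and the estimate is roughly 22600, in reasonable agreement with the experimentally determined value 23392 in Table 3.1 and well below the band/envelope figure of 55296.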
4. Work Requirements

We now estimate the number of multiplicative operations required to obtain the implicit block form of L. Again we assume that a << n. The work can be broken into four parts which are treated separately: (i) decomposition of the independent A_i's; (ii) computation of the Ã_i's; (iii) decomposition of the Ã_i's; and (iv) computation of the F_i's.

The decompositions of the A_i's corresponding to the independent grid regions are performed as envelope Choleski decompositions [7,10]. Results of George [7] can be used to compute the total amount of work for the independent matrices. This is given by

    W_B(a) = n^4/(2a^2) + O(n^4/a^3).                                  (4.1)

The computation of the Ã_i's is very sensitive to the order in which we perform the matrix operations. Following George [5], we obtain terms of the form B_{i-1} A_{i-1}^{-1} B_{i-1}^T by first computing

    B̂_{i-1} = L_{i-1}^{-T} L_{i-1}^{-1} B_{i-1}^T                      (4.2)

and then computing the product

    B_{i-1} B̂_{i-1}.                                                   (4.3)

The computation of B̂_{i-1} can be done as two envelope backsolves, and we can take advantage of the sparseness of B_{i-1} to form (4.3) from B̂_{i-1} in a small number of additional multiplications. In this way all of the Ã_i's can be computed in a total of

    W_{S1}(a) = O(n^4/a)                                               (4.4)

multiplications. Once the Ã_i's have been obtained, they can be decomposed in a total of

    W_{S2}(a) = O(an^3)                                                (4.5)

additional multiplications, assuming that the Ã_i's are generally full, so that sparse decomposition methods do not help much.

Using ideas similar to those of (4.2) and (4.3), the F_i's can be obtained efficiently by first computing

    Ĉ_{i-1} = L_{i-1}^{-T} L_{i-1}^{-1} C_{i-1}^T                      (4.6)

and then computing

    F_i^T = -L_i^{-1} B_{i-1} Ĉ_{i-1}.                                 (4.7)

Using this technique, and noting that the terms Ĉ_{i-1} also appear in the computation of the Ã_i's, we find that the number of additional multiplications needed to compute the F_i's is

    W_F(a) = O(an^3).                                                  (4.8)

Summing and simplifying, the total number of multiplicative operations required for the implicit decomposition of A is

    W_T(a) = O(n^4/a + an^3).                                          (4.9)

W_T(a) is therefore approximately minimized for a choice a = a_1 proportional to n^{1/2}, for which W_T(a_1) = O(n^{7/2}); for this a_1, (3.5) yields S_T(a_1) = O(n^{5/2}), and conversely, for the a_0 which minimizes the storage, the work W_T(a_0) is also O(n^{7/2}). So there is little to choose between minimizing the work and minimizing the storage. Both estimates compare favorably with the corresponding work estimate for George's vertical-separator method [5]; asymptotically, the diagonal separators save roughly 80% of the work. However, all these results are valid only in the asymptotic case of large n. For values of n ≤ 50, we have found that the minimal amount of work is always at least n^4/4 (see Table 4.1).

     n    a_0   S_T(a_0)   a_1    W_T(a_1)      n^4/4
     8     2        267     4        3474       1024
    16     4       1530     8       38810      16384
    24     6       4177    10      159496      82944
    32     8       8592    12      429605     262144
    40     8      14824    12      928186     640000
    48     8      23392    14     1739642    1327104

    Table 4.1: Experimentally determined optimal storage and work requirements.

5. Conclusions

In conclusion we make the following remarks:

(i) In terms of asymptotic storage requirements, our method (like George's [5]) has a great advantage over the standard band or envelope storage schemes. In practice, this advantage holds even for very small values of n (see Table 3.1). However, the minimal storage methods [3,14] require only O(n^2) storage locations, and recent research [9,14,16] indicates that general sparse methods may be implemented so as to require less storage than our method, even for small values of n.

(ii) Asymptotically, our method is an improvement over standard band or envelope methods in terms of the number of multiplicative operations, and it might be hoped that it would be competitive with general sparse matrix methods for some range of n. However, for moderate values of n (say n ≤ 50), for which it might not be worthwhile to use a general sparse matrix method, our method appears to be no better than an envelope method, except, perhaps, in terms of the work required for backsolution. In fact, Schultz [15] has proposed a very simple block envelope scheme which requires only about n^4/4 multiplicative operations for a symmetric n x n five-point finite difference grid. Hence it appears that where the number of arithmetic operations is critical, there is little use for our method. (A similar qualitative statement may be made concerning George's block method for nine-point finite element grids [5,9].)

(iii) From (i) and (ii) it may be concluded that our method is not generally useful for the direct solution of (1.1). However, if (1.1) is to be solved for a number of different right-hand sides, the method may prove quite efficient, because it substantially reduces the storage requirements and the number of arithmetic operations needed for backsolution. Minimal storage methods are not efficient in this case, since they do not save a factorization of A. Typical examples of such effective use of the method might arise in the solution of semi-linear or nonlinear partial differential equations using certain iterative techniques for which backsolving is the critical operation (see [4]).

REFERENCES

[1] G. Birkhoff and J. A. George. "Elimination by Nested Dissection." In J. F. Traub, Ed., Complexity of Sequential and Parallel Numerical Algorithms. New York: Academic Press, 1973, pp. 221-260.

[2] E. Cuthill and J. McKee. "Reducing the Bandwidth of Sparse Symmetric Matrices." Proc. ACM National Conference, 1969, pp. 157-172.

[3] S. C. Eisenstat, M. H. Schultz, and A. H. Sherman. "Applications of an Element Model for Gaussian Elimination." In Proceedings of the Symposium on Sparse Matrix Computations, Argonne, IL, 1975.

[4] S. C. Eisenstat, M. H. Schultz, and A. H. Sherman. "The Application of Sparse Matrix Methods to the Numerical Solution of Nonlinear Elliptic Partial Differential Equations." In A. Dold and B. Eckmann, Eds., Proceedings of the Symposium on Constructive and Computational Methods for Differential Equations. Berlin: Springer-Verlag, 1974, pp. 131-153.

[5] J. A. George. "An Efficient Band-Oriented Scheme for Solving n x n Grid Problems." Proc. FJCC, 1972, pp. 1317-1320.

[6] J. A. George. "Block Elimination on Finite Element Systems of Equations." In D. J. Rose and R. A. Willoughby, Eds., Sparse Matrices and Their Applications. New York: Plenum Press, 1972, pp. 101-114.

[7] J. A. George. "Computer Implementation of the Finite Element Method." Stanford University Computer Science Department, Technical Report STAN-CS-71-208, 1971.

[8] J. A. George. "Nested Dissection of a Regular Finite Element Mesh." SIAM J. Num. Anal. 10 (1973), pp. 345-363.

[9] J. A. George. Private communication.

[10] A. Jennings. "A Compact Storage Scheme for the Solution of Symmetric Linear Simultaneous Equations." Computer J. 9 (1966), pp. 281-285.

[11] R. S. Martin and J. H. Wilkinson. "Symmetric Decomposition of Positive Definite Band Matrices." Handbook Series in Linear Algebra, Num. Math. 7 (1965), pp. 355-361.

[12] D. J. Rose. "A Graph-theoretic Study of the Numerical Solution of Sparse Positive Definite Systems of Linear Equations." In R. C. Read, Ed., Graph Theory and Computing. New York: Academic Press, 1972, pp. 183-217.

[13] D. J. Rose and J. R. Bunch. "The Role of Partitioning in the Numerical Solution of Sparse Systems." In D. J. Rose and R. A. Willoughby, Eds., Sparse Matrices and Their Applications. New York: Plenum Press, 1972, pp. 177-187.

[14] A. H. Sherman. "On the Efficient Solution of Sparse Systems of Linear and Nonlinear Equations." Ph.D. dissertation, Yale University Department of Computer Science, 1975.

[15] M. H. Schultz. Private communication.

[16] G. F. Whitten. Private communication.

[17] J. H. Wilkinson. The Algebraic Eigenvalue Problem. London: McGraw-Hill, 1971.