UIUCDCS-R-78-918                                                UILU-ENG 78 1712

USE OF THE SINGULAR VALUE DECOMPOSITION TO INCREASE EXECUTION AND STORAGE EFFICIENCY OF THE MANTEUFFEL ALGORITHM FOR THE SOLUTION OF NONSYMMETRIC LINEAR SYSTEMS

by

Paul E. Saylor

April 1978

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was supported by NSF Grant 76-81100.

Abstract. Optimum Chebyshev parameters may be computed dynamically by the Manteuffel algorithm for use with a generalization of Richardson's iterative method and the Jacobi semi-iterative method to solve nonsymmetric linear systems. Parameters are determined by the algorithm from the convex hull of the eigenvalues of the error propagator, which the algorithm also computes by the power method. The singular value decomposition may be used to test the reliability of a simple technique to reduce execution and storage costs of the power method. The same technique also yields an inexpensive singular value decomposition, making it feasible to determine precisely when to call the Manteuffel algorithm.

Keywords: linear algebraic equations, iteration, eigenvalues, eigenvectors, numerical analysis.

1. Introduction. The main concern here is the efficient use of the Manteuffel algorithm [9] for computing Chebyshev parameters to accelerate the iterative solution of real nonsymmetric linear systems. There are two objectives. One is to determine when to call the Manteuffel algorithm; the other is to reduce the costs of execution and storage. A consequence of knowing when to call the algorithm is that the iteration takes fewer steps. Apart from these, it is also shown that the Manteuffel algorithm may be used with the Jacobi semi-iterative (JSI) method as well as Richardson's method, for which it was originally developed.

Chebyshev acceleration parameters are computed from the convex hull of the eigenvalues of the matrix associated with the linear system. Eigenvalues are obtained from the power method, which, for an arbitrary square matrix B with a single dominant eigenvalue, is based on the observation that the Krylov sequence y, By, ..., B^k y tends to the eigenvector corresponding to the dominant eigenvalue. A nonreal eigenvalue of a real matrix cannot be the single dominant eigenvalue, because nonreal eigenvalues and the corresponding eigenvectors appear in conjugate pairs. A dominant nonreal eigenvalue therefore causes B^k y to tend to a linear combination of two eigenvectors, one the conjugate of the other. If so, B^k y, B^{k+1} y, and B^{k+2} y are linearly dependent for large enough k. The coefficients expressing the linear dependence,

(1)    c_0 B^k y + c_1 B^{k+1} y = -B^{k+2} y,

are the coefficients of a polynomial whose roots are the dominant eigenvalue and its conjugate. A least squares solution of the overdetermined system (1) yields c_0 and c_1 [13, p. 580]. The details are in section 5.

The computation is less expensive and takes less storage if fewer, randomly selected components of B^k y, B^{k+1} y, B^{k+2} y are used, which is equivalent to replacing B^k y, B^{k+1} y, B^{k+2} y in the overdetermined system by their projections, P(B^k y), P(B^{k+1} y), P(B^{k+2} y), onto a subspace. The projections must also be linearly dependent. The linear dependence of a set of vectors, {v_1, ..., v_m}, is equivalent to the existence of a zero singular value of the matrix (v_1 ... v_m). The question of whether P(B^k y), P(B^{k+1} y), and P(B^{k+2} y) yield satisfactory eigenvalues thus reduces to a test of the smallest (in magnitude) singular value of the matrix, D, with these column vectors. Other tests are possible. As an alternative one could determine the rank of D^T D (suggested by Manteuffel [10, p. 19] for a slightly different application), but an efficient, stable algorithm due to Businger and Golub [1] is available to compute singular values and is generally superior [10, p. 382].

If the smallest singular value is too large, that is, if the column vectors of

(2)    D = (P(B^k y), P(B^{k+1} y), P(B^{k+2} y))

are independent, there is no information in the singular values to suggest whether it is because k should have been larger or because the projection was onto a badly chosen subspace. The smallest singular value is like clinical temperature, which may show a patient is sick but gives no diagnosis. In practice, rather than test the smallest (in magnitude) singular value to determine linear dependence, the ratio of the smallest (in magnitude) to the largest (in magnitude) should be tested. If there is linear dependence, the next step is to solve the least squares problem, which may be done at no additional cost by using some of the results from the computation of the singular values (details in section 8).

Current implementations of the Manteuffel algorithm compute new eigenvalues after a fixed number of steps of the power method, a costly strategy if the eigenvector estimates are mature after fewer steps. Linear dependence should be monitored in order to compute new eigenvalues precisely when they would be accurate. If the number of rows of D is small, it would be economical to compute singular values after each iterative step, so that new eigenvalues could be computed precisely when they become available, resulting in an efficient iteration. Computing singular values to monitor linear dependence thus yields two benefits: lower overhead in executing the Manteuffel algorithm, and fewer iterative steps.

The basic facts about Richardson's method are presented in section 2, and the techniques to estimate iteration parameters dynamically (the Manteuffel algorithm) are explained briefly in section 3. In section 4, it is shown how to apply the Manteuffel algorithm to the Jacobi semi-iterative method and to a generalization of Richardson's method which includes the JSI method. Section 5 is a discussion of the power method for nonreal eigenvalues. A simple way to reduce the labor by working with fewer components, recommended by Wrigley [14], is explained in section 6. Whether fewer components will be effective reduces to a study of linear dependence by using the singular values of the matrix D of (2) (section 7). The computational work of the singular value decomposition and how it applies to computing eigenvalues are given in section 8.
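The dependence test is easy to state concretely. The following sketch (Python with NumPy, anachronistic here and used purely as a modern illustration; the matrix B, the vector y, the index set idx, and the tolerance tol are invented for the example) forms the matrix D of (2) from selected components of three consecutive Krylov vectors and tests the ratio of the extreme singular values, as recommended above.

```python
import numpy as np

def krylov_dependence_test(B, y, k, idx, tol=1e-6):
    """Form D = (P(B^k y), P(B^(k+1) y), P(B^(k+2) y)) as in (2), where the
    projection P keeps only the components listed in idx, and test the
    ratio of the smallest to the largest singular value."""
    v = np.linalg.matrix_power(B, k) @ y
    cols = [v]
    for _ in range(2):
        v = B @ v
        cols.append(v)
    D = np.column_stack(cols)[idx, :]         # projection onto a subspace
    s = np.linalg.svd(D, compute_uv=False)    # singular values, descending
    return s[-1] / s[0] < tol                 # True: columns nearly dependent

# Illustration: B has the dominant conjugate pair +/- i and a decaying
# mode 0.1, so the Krylov vectors become dependent once k is large.
B = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 0.1]])
y = np.array([1.0, 2.0, 3.0])
print(krylov_dependence_test(B, y, k=2, idx=[0, 1, 2]))   # False: k too small
print(krylov_dependence_test(B, y, k=25, idx=[0, 1, 2]))  # True
```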
2. Richardson's Method. The basic iteration is a gradient procedure

(3)    x^{(n+1)} = x^{(n)} - t_n (A x^{(n)} - b).

The true error, defined to be e^{(n)} = x - x^{(n)}, satisfies

(4)    e^{(n)} = (I - t_{n-1} A) e^{(n-1)} = R_n(A) e^{(0)},

where

R_n(λ) = (1 - t_{n-1} λ) ⋯ (1 - t_0 λ),

called the error propagator polynomial, is such that

(5)    R_n(0) = 1.

For A nonsymmetric with eigenvalues in, say, the right half plane, the gradient procedure converges if the error propagator is

P_n(λ) = T_n[(d - λ)/c] / T_n(d/c),

where T_n is the Chebyshev polynomial of degree n. The resulting iteration is called Richardson's method. Parameters d and c are the center and focal length respectively (d ± c are the foci) of a family of ellipses such that one member contains the eigenvalues of A and lies in the right half plane.

[Figure 1. The eigenvalues of A must lie in an ellipse with center d and foci d ± c contained in the right (or left) half plane.]

The minimax polynomial is that polynomial, subject to (5), whose maximum over any ellipse of the family is less than or equal to the maximum of any such polynomial over that ellipse. The Chebyshev error propagator either is the minimax polynomial or else approximates it with negligible error [9, p. 315]. The Chebyshev propagator depends on the center and focal length of the enclosing ellipses. Optimal values of the center and focal length are determined by the convex hull of the eigenvalues, which may be computed dynamically by the power method, as explained next. (Richardson's method in the form above, (3), is awkward and unstable but convenient for discussion. Another version of the iteration [9, p. 316] is recommended in practice.)
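A minimal sketch of the iteration (3) may clarify how d and c enter. Here the parameters t_j are taken to be the reciprocals of the roots of T_n((d - λ)/c), which makes R_n of (4) and (5) coincide with the Chebyshev propagator P_n; the 2 × 2 matrix and the ellipse parameters are invented for the example, and the stable form of the iteration [9, p. 316] is what should be used in practice.

```python
import numpy as np

def richardson(A, b, d, c, n, x0=None):
    """Richardson's method in the simple form (3).  The t_j are the
    reciprocals of the roots of T_n((d - lambda)/c), so the error
    propagator R_n of (4)-(5) is the Chebyshev propagator P_n.
    (This form is noted in the text to be unstable in general.)"""
    x = np.zeros_like(b) if x0 is None else x0.astype(float)
    j = np.arange(n)
    roots = d - c * np.cos((2 * j + 1) * np.pi / (2 * n))  # roots of T_n((d-l)/c)
    for t in 1.0 / roots:
        x = x - t * (A @ x - b)                            # one step of (3)
    return x

# Illustration: eigenvalues 2.5 +/- 0.5i lie in an ellipse with d = 2.5, c = 1.
A = np.array([[2.0, 1.0],
              [-0.5, 3.0]])
b = np.array([1.0, 1.0])
x = richardson(A, b, d=2.5, c=1.0, n=30)
print(np.linalg.norm(A @ x - b))   # residual is tiny after 30 steps
```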
3. Dynamic Estimation of the Eigenvalues. The residual error at step n,

r^{(n)} = b - A x^{(n)} = A e^{(n)},

satisfies

r^{(n)} = R_n(A) r^{(0)},

analogous to (4) for the true error. For large n, P_n(λ) ≈ [M(λ)]^n, where

M(λ) = [d - λ + ((d - λ)² - c²)^{1/2}] / [d + (d² - c²)^{1/2}],

an approximation which follows from T_n(w) ≈ e^{n cosh^{-1} w}/2. Therefore,

r^{(n)} ≈ [M(A)]^n r^{(0)}.

Vectors r^{(n)} thus are the iterates of the power method applied to the matrix M(A). Suppose μ_1 and μ̄_1 are the dominant eigenvalues of M(A), corresponding to eigenvectors v_1 and v̄_1, which are also eigenvectors of A. Then r^{(n)} is approximately a linear combination of v_1 and v̄_1, from which μ_1 may be computed. The details of the power method are given in section 5. An eigenvalue, λ_1, of A is the solution of μ_1 = M(λ_1). An algorithm to compute eigenvalues dynamically and then to compute optimum d and c parameters from the eigenvalues has been developed by Manteuffel [8, 9, 10] (cf. [7, 14]).
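The relation μ = M(λ) can be inverted in closed form, which is how an eigenvalue estimate for A is recovered from an eigenvalue estimate for M(A). A sketch, with the algebra spelled out in the comments (the sample values of λ, d, and c are invented):

```python
import numpy as np

def M(lam, d, c):
    """The map M of section 3 (principal branch of the square root)."""
    return (d - lam + np.sqrt(complex((d - lam) ** 2 - c * c))) \
           / (d + np.sqrt(complex(d * d - c * c)))

def lambda_from_mu(mu, d, c):
    """Invert mu = M(lambda).  With K = d + sqrt(d^2 - c^2) and
    u = d - lambda, squaring mu*K - u = sqrt(u^2 - c^2) gives
    u = ((mu*K)^2 + c^2) / (2*mu*K); the result is the same for
    either branch of the square root."""
    K = d + np.sqrt(complex(d * d - c * c))
    muK = mu * K
    return d - (muK * muK + c * c) / (2.0 * muK)

lam = 2.5 + 0.5j
mu = M(lam, d=2.5, c=1.0)                 # an eigenvalue of M(A) if lam is one of A
print(lambda_from_mu(mu, d=2.5, c=1.0))   # recovers (2.5+0.5j)
```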
4. The Jacobi Semi-Iterative Method. The Manteuffel algorithm estimates optimum d and c parameters from the convex hull of eigenvalues, which it computes dynamically by the power method. The algorithm may be applied to any iteration whose error propagator polynomial transforms into the error propagator polynomial for Richardson's method, provided also that the iteration yields vectors to which the power method may be applied. An example is the Jacobi semi-iterative method, described below, also called the Chebyshev polynomial method [6, p. 22; 15, p. 355].

Let A = D - L - U, where D is a diagonal matrix, L is strictly lower triangular, and U is strictly upper triangular. The Jacobi method is

D x^{(k+1)} = (L + U) x^{(k)} + b,

or, with J = D^{-1}(L + U) and s = D^{-1} b,

x^{(k+1)} = J x^{(k)} + s.

The Jacobi method converges slowly for essentially all practical problems. To accelerate it, the k-th approximation may be taken to be a linear combination of the first k iterates,

y^{(k)} = ∑_{j=0}^{k} a_{kj} x^{(j)}.

The coefficients must satisfy the consistency condition

(6)    ∑_{j=0}^{k} a_{kj} = 1,

from which it follows that

x - y^{(k)} = ∑_{j=0}^{k} a_{kj} (x - x^{(j)}) = P_k(J)(x - x^{(0)}).

The power method cannot be applied using this expression because x is not known, but the same relation holds for a computable quantity, the residual, defined by

r^{(k)} = J y^{(k)} + s - y^{(k)}.

Assume the eigenvalues of J are enclosed in one member of the family of ellipses with center d' and foci d' ± c'. Among all polynomials with coefficients satisfying (6), the minimax polynomial over any such ellipse either is

P_k^{(J)}(z') = T_k[(z' - d')/c'] / T_k[(1 - d')/c'],

or is closely approximated by P_k^{(J)}. If P_k^{(J)} is the error propagator polynomial, the iteration is called the Jacobi semi-iterative method (JSI) or the Chebyshev polynomial method. The consistency conditions for the JSI method and Richardson's method respectively are

P_k^{(J)}(1) = 1,    P_n(0) = 1.

The error propagators transform into one another as follows. To assume that the eigenvalues of A lie in the right half plane is equivalent to assuming that the eigenvalues of J lie to the left of the vertical line through z' = 1 (Figure 2a). Let E' be an ellipse in the plane left of z' = 1.

[Figure 2a. Jacobi semi-iterative method (z'-plane): an ellipse with center d' and foci d' ± c' lies to the left of z' = 1. Figure 2b. Richardson's method (λ-plane): an ellipse with center d and foci d ± c lies in the right half plane.]

Let λ = 1 - z', d = 1 - d', and c = c'. Then d - c > 0 if d' + c' < 1, so E' is carried onto an ellipse of the kind required for Richardson's method, and the JSI propagator becomes a Richardson propagator satisfying (5). More generally, let A = M - N be a splitting and consider the generalized Richardson iteration

x^{(k+1)} = x^{(k)} - t_k M^{-1}(A x^{(k)} - b).

The error at step k obeys the same relation as in equation (4),

e^{(k)} = R_k(M^{-1} A) e^{(0)},

where

R_k(λ) = ∏_{j=0}^{k-1} (1 - t_j λ).

In order to apply the Manteuffel algorithm, the eigenvalues of M^{-1} A must lie in a half plane. An example of a splitting is one obtained by the approximate factorization of the strongly implicit procedure (SIP) [12]. For the JSI method, M = I, N = J, and M^{-1} A = I - J.

5. The Power Method for a Nonsymmetric Matrix. Let B be an N × N nonsymmetric real matrix with a dominant complex eigenvalue λ_1 corresponding to eigenvector v_1. Complex eigenvalues and eigenvectors occur in conjugate pairs. Assume for convenience that B is nondefective, that is, that any vector y may be expressed as a sum of eigenvectors,

y = β_1 v_1 + β̄_1 v̄_1 + ⋯.

If β_1 ≠ 0, multiplication by B^k yields a vector,

(7)    B^k y = β_1 λ_1^k v_1 + β̄_1 λ̄_1^k v̄_1 + ⋯,

in which the dominant term is a sum of two eigenvectors. An estimate of λ_1 may be computed from B^k y and two additional terms, B^{k+1} y and B^{k+2} y, by a method [14, p. 174] which follows from the observation that if c_0 and c_1 are the coefficients of the polynomial

p_2(λ) = (λ - λ_1)(λ - λ̄_1) = c_0 + c_1 λ + λ²,

then

(8)    c_0 B^k y + c_1 B^{k+1} y = -B^{k+2} y.

Values of c_0 and c_1 that make the left side nearly equal the right are therefore approximate values of the coefficients of p_2(λ). This is an overdetermined system in two unknowns, c_0 and c_1, a solution of which may be computed by the method of least squares.

Let D = (B^k y, B^{k+1} y) be the N × 2 matrix of system (8). The least squares solution is the solution of the 2 × 2 system

D^T D (c_0, c_1)^T = -D^T B^{k+2} y.

Four inner products are required to compute D^T D. The number of multiplications is 4(N - 1).

The discussion of the power method in this section has been simplified by considering only the computation of λ_1 and λ̄_1, that is, only two nonreal eigenvalues. In practice the power method is used to compute at least four nonreal eigenvalues at a time [8, p. 17; 10, p. 18].
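A sketch of the whole estimate of section 5 follows (the 3 × 3 matrix and starting vector are invented for the example, and np.roots stands in for the quadratic formula applied to p_2):

```python
import numpy as np

def dominant_pair(B, y, k):
    """Power-method estimate of section 5: solve the overdetermined
    system (8) for c0, c1 by least squares, then take the roots of
    p2(lambda) = lambda^2 + c1*lambda + c0."""
    u = np.linalg.matrix_power(B, k) @ y     # B^k y
    v = B @ u                                # B^(k+1) y
    w = B @ v                                # B^(k+2) y
    D = np.column_stack([u, v])              # the N x 2 matrix of (8)
    c0, c1 = np.linalg.lstsq(D, -w, rcond=None)[0]
    return np.roots([1.0, c1, c0])           # roots of p2

# Illustration: the dominant pair 0.5 +/- i is recovered; the real
# eigenvalue 0.2 contributes only the low order terms of (7).
B = np.array([[0.5, -1.0, 0.0],
              [1.0,  0.5, 0.0],
              [0.0,  0.0, 0.2]])
y = np.array([1.0, 1.0, 1.0])
print(dominant_pair(B, y, k=8))   # approx [0.5+1j, 0.5-1j]
```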
6. A Reduction in the Work to Solve the Least Squares Problem. A subsystem of equations could be selected from the overdetermined system (8) at random, and the method of least squares applied to the subsystem to reduce the work, but the solution must not be grossly perturbed. In this instance the number, N, of equations, or statistical observations, is so much greater than the number, 2, of parameters, c_0 and c_1, that one can reasonably expect variations about the true value of the parameters to be uniform for a smaller number of observations than N. This is an intuitive reason why the method of least squares could be applied to a subsystem. A closer look may be more convincing.

Let v_{1i} be the i-th component of v_1; let η_i, η_i', and η_i'' be the i-th components of B^k y, B^{k+1} y, and B^{k+2} y respectively. From the equations

β_1 λ_1^k v_{1i} + β̄_1 λ̄_1^k v̄_{1i} + ⋯ = (B^k y)_i,
β_1 λ_1^{k+1} v_{1i} + β̄_1 λ̄_1^{k+1} v̄_{1i} + ⋯ = (B^{k+1} y)_i,
β_1 λ_1^{k+2} v_{1i} + β̄_1 λ̄_1^{k+2} v̄_{1i} + ⋯ = (B^{k+2} y)_i,

it follows that

(9)    η_i  = β_1 λ_1^k v_{1i} + β̄_1 λ̄_1^k v̄_{1i} + C_i,
       η_i' = β_1 λ_1^{k+1} v_{1i} + β̄_1 λ̄_1^{k+1} v̄_{1i} + C_i',
       η_i'' = β_1 λ_1^{k+2} v_{1i} + β̄_1 λ̄_1^{k+2} v̄_{1i} + C_i'',

where C_i, C_i', and C_i'' are the contributions due to low order terms. If c_0 and c_1 are, as before, the coefficients of

p_2(λ) = (λ - λ_1)(λ - λ̄_1) = c_0 + c_1 λ + λ²,

then

c_0 η_i + c_1 η_i' + η_i'' = c_0 C_i + c_1 C_i' + C_i''.

Any two equations uniquely determine c_0 and c_1. If the right sides, due to low order terms, are small, then the solution of

(10)    c_0 η_i + c_1 η_i' + η_i'' = 0,
        c_0 η_j + c_1 η_j' + η_j'' = 0,

approximates (c_0, c_1). Even though the low order terms are small in B^k y, they are not necessarily small component by component. It is a precaution against this to include more equations in the system (10) determining c_0 and c_1. Since B^k y is nearly β_1 λ_1^k v_1 + β̄_1 λ̄_1^k v̄_1, components of the low order terms are generally small, although, of course, it may happen that only one component of β_1 λ_1^k v_1 + β̄_1 λ̄_1^k v̄_1 is dominant in B^k y. In other words, a system

c_0 η_i + c_1 η_i' = -η_i'',    i ∈ I,

of fewer than N equations (I a subset of the component indices) would probably yield a satisfactory least squares solution. With a smaller system, the storage for the power method is also reduced.

7. A Test for the Size of the Subsystem. Whether the number of equations is adequate may be determined in a precise way. If there are no low order terms, the column vectors u = (η_{i_1}, ..., η_{i_ℓ})^T, u' = (η'_{i_1}, ..., η'_{i_ℓ})^T, and u'' = (η''_{i_1}, ..., η''_{i_ℓ})^T are linearly dependent. Therefore the matrix

(11)    C = (u u' u'')

has a singular value zero. In reality, the columns are linearly independent with a nonzero singular value of smallest magnitude. The larger this singular value is, the more the low order terms dominate in (9). If the system is small, it would be economical to compute the singular values as a check on accuracy. The work of computing the singular values could be applied to the solution of the least squares problem, a savings which is explained in the next section.

8. The Singular Value Decomposition. Any m × n matrix, C, may be decomposed in the form

C = U Σ V^T,

where U is an m × m orthogonal matrix, Σ is an m × n diagonal matrix, the entries of which are the singular values, and V is an n × n orthogonal matrix [4, 11]. This is called the singular value decomposition (SVD). The ratio of the largest (in magnitude) singular value to the smallest is a measure of the linear independence of the column vectors of C. The SVD thus reduces the question of linear independence to a single quantity, the ratio of the extreme singular values.

Let C = (u u' u'') be the ℓ × 3 matrix (11) resulting from the power method. The steps to compute the SVD form two units. First, C is reduced to bidiagonal form, in which the only nonzero elements lie on the main diagonal and the superdiagonal (just above the main diagonal) [4, 3, 11]. A modification of the QR algorithm, due to Golub [11, p. 387; 5], may then be used to reduce the bidiagonal matrix to diagonal form. (Other methods are possible, but this one is stable and widely accepted.)

Reduction of C to bidiagonal form may be made by Householder transformations. A Householder transformation is a symmetric and orthogonal matrix of the form P = I - 2ww^T, where w^T w = 1, which may be constructed in such a way as to transform a given vector z, whose first component must be nonzero, into Pz, each component of which, except the first, is zero. The formulas for P are:

σ = (z^T z)^{1/2};
s = z + σ e_1, where e_1 = (1, 0, ..., 0)^T;
β = s^T s / 2;
P = I - s s^T / β.

They yield Pz = -σ e_1. More details may be found in the text of Stewart [11, p. 230]. If operation means either a multiplication or a division, 4ℓ operations are required for σ, β, and the division of s by β in the formula for P.
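A sketch of the construction (the vector z is invented; production codes choose the sign of σ to avoid cancellation, a refinement omitted here):

```python
import numpy as np

def householder(z):
    """Householder transformation of section 8: P is symmetric and
    orthogonal and annihilates every component of z but the first.
    sigma = ||z||, s = z + sigma*e1, beta = s^T s / 2,
    P = I - s s^T / beta, and then P z = -sigma * e1."""
    z = np.asarray(z, dtype=float)
    sigma = np.linalg.norm(z)
    s = z.copy()
    s[0] += sigma                          # s = z + sigma * e1
    beta = (s @ s) / 2.0
    return np.eye(len(z)) - np.outer(s, s) / beta

z = np.array([3.0, 4.0, 0.0])
P = householder(z)
print(P @ z)                            # [-5.  0.  0.]
print(np.allclose(P @ P, np.eye(3)))    # orthogonal and symmetric: True
```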
The actual reduction to bidiagonal form is by a sequence of Householder transformations multiplying C on the left and on the right. Those on the left eliminate nonzero elements below the diagonal, whereas those on the right eliminate nonzero elements above the superdiagonal. Let P_0, P_1, P_2 denote the transformations on the left and Q_0 the transformation on the right. These yield, in turn, C_1 = P_0 C, C_2 = P_1 P_0 C, C_3 = P_2 P_1 P_0 C, and C_4 = P_2 P_1 P_0 C Q_0, the last of which is bidiagonal.

In order to use the results of the reduction for the solution of the least squares problem for the matrix D, which consists of the first two columns of C, it is convenient not to multiply by Q_0 before multiplying by P_0 and P_1. The first two columns of C_2 = P_1 P_0 C will then be the least squares matrix, D, reduced to upper triangular form, from which the solution of the least squares problem may be computed at no expense. Construction of Q_0 and multiplication on the right by Q_0 are negligible since Q_0 is a 3 × 3 matrix. Construction and multiplication on the left by P_0, P_1, and P_2 take 10ℓ, 8ℓ, and 6ℓ operations respectively. The total number of operations to compute C_4 is 24ℓ. The further reduction of C_4 to diagonal form is not discussed here.

9. Conclusion. Richardson's method and a generalization of Richardson's method that includes the Jacobi semi-iterative method employ Chebyshev acceleration parameters determined by the convex hull of the eigenvalues of the error propagator matrix. The Manteuffel algorithm, which estimates parameters dynamically, performs two basic operations: it computes the convex hull by the power method, which requires a least squares procedure; and it computes the best parameters from the convex hull. Computation of the best parameters is inexpensive once the eigenvalues are known. The cost of the Manteuffel algorithm is in the solution of the least squares problem. Examination of a method to reduce execution time and storage requirements for the least squares problem by using fewer components of B^k y shows that the method works if the columns of the matrix C of (11) are linearly dependent. The singular values of C may be used to test linear dependence and at the same time provide the solution of the least squares problem. This summarizes the report through the last section.

If fewer components of B^k y are used, there are fewer rows of C and the singular values are inexpensive to compute. Singular values have allowed the use of fewer components, but now fewer components allow the free use of singular values. Inexpensive singular values could be computed after each iteration to test whether new eigenvalues, and therefore improved acceleration parameters, are ready. Testing singular values displaces the old strategy of waiting a fixed number of iterations before improving acceleration parameters. The Manteuffel algorithm then falls in step with the iterative method, further enhancing convergence.
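As a closing illustration, the following sketch assembles the pieces in the spirit of this conclusion: a hypothetical driver runs a power iteration, tests the singular value ratio of the matrix C of (11) after every step, and reports eigenvalue estimates the moment the columns become nearly dependent. It is a simplified stand-in for the adaptive procedure of [10]; the matrix, starting vector, and tolerance are invented, and plain power iterates are used in place of residuals.

```python
import numpy as np

def mature_eigenvalues(history, idx, tol=1e-4):
    """Form the matrix C of (11) from a few components (idx) of the
    last three power-method vectors and test the ratio of extreme
    singular values.  When the columns are nearly dependent, solve
    the least squares problem and return eigenvalue estimates;
    until then return None."""
    if len(history) < 3:
        return None
    C = np.column_stack([r[idx] for r in history[-3:]])
    s = np.linalg.svd(C, compute_uv=False)
    if s[-1] / s[0] >= tol:
        return None                       # not yet linearly dependent
    c0, c1 = np.linalg.lstsq(C[:, :2], -C[:, 2], rcond=None)[0]
    return np.roots([1.0, c1, c0])

B = np.array([[0.5, -1.0, 0.0],
              [1.0,  0.5, 0.0],
              [0.0,  0.0, 0.2]])
r, history = np.array([1.0, 2.0, 1.0]), []
for step in range(15):
    history.append(r)
    est = mature_eigenvalues(history, idx=[0, 1, 2])
    if est is not None:
        print(step, est)    # estimates ready: approx 0.5 +/- 1j
        break
    r = B @ r
```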
REFERENCES

1. P. A. Businger and G. H. Golub, Algorithm 358: Singular value decomposition of a complex matrix, Comm. ACM 12 (1969), pp. 564-565.

2. P. Concus, G. H. Golub, and D. P. O'Leary, A generalized conjugate gradient method for the numerical solution of elliptic partial differential equations, in Sparse Matrix Computations, J. R. Bunch and D. J. Rose, eds., Academic Press, New York, 1976.

3. G. Forsythe, M. A. Malcolm, and C. B. Moler, Computer Methods for Mathematical Computations, Prentice-Hall, Englewood Cliffs, New Jersey, 1977.

4. G. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, J. SIAM Numer. Anal., Ser. B, Vol. 2, No. 2, 1965.

5. G. H. Golub and C. Reinsch, Singular value decomposition and least squares solutions, Numer. Math. 14 (1970), pp. 403-420.

6. L. W. Hageman, The Estimation of Acceleration Parameters for the Chebyshev Polynomial and the Successive Overrelaxation Iteration Methods, memorandum prepared for the USAEC by Westinghouse Electric Corporation, June 1972; available from the National Technical Information Service, U.S. Department of Commerce, 5285 Port Royal Road, Springfield, VA 22151.

7. G. Kjellberg, On the convergence of successive over-relaxation applied to a class of linear systems of equations with complex eigenvalues, Ericsson Technics, No. 2, 1958, pp. 245-258.

8. T. A. Manteuffel, An Iterative Method for Solving Nonsymmetric Linear Systems with Dynamic Estimation of Parameters, Report UIUCDCS-R-75-758, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, October 1975.

9. T. A. Manteuffel, The Tchebychev iteration for nonsymmetric linear systems, Numer. Math. 28 (1977), pp. 307-327.

10. T. A. Manteuffel, Adaptive Procedure for Estimating Parameters for the Nonsymmetric Tchebychev Iteration, Sandia Laboratories Report SAND77-8239, Livermore, CA, June 1977.

11. G. W. Stewart, Introduction to Matrix Computations, Academic Press, New York, 1973.

12. H. L. Stone, Iterative solution of implicit approximations of multidimensional partial differential equations, SIAM J. Numer. Anal. 5 (1968), pp. 530-558.

13. J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, 1965.

14. H. E. Wrigley, Accelerating the Jacobi method for solving simultaneous equations by Chebyshev extrapolation when the eigenvalues of the iteration matrix are complex, Computer Journal 6 (July 1963), pp. 169-176.

15. D. M. Young, Iterative Solution of Large Linear Systems, Academic Press, New York, 1971.