UIUCDCS-R-75-758

AN ITERATIVE METHOD FOR SOLVING NONSYMMETRIC LINEAR SYSTEMS WITH DYNAMIC ESTIMATION OF PARAMETERS

by Thomas Albert Manteuffel

October 1975

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

This work was supported in part by the Department of Computer Science and in part by the National Science Foundation under grants GJ-36393 and DCR74-23679, and was submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mathematics, 1975.

ACKNOWLEDGMENT

I would like to acknowledge the guidance and encouragement of my advisor, Professor Paul E. Saylor. His support, both financial and personal, made this thesis possible. I am grateful to the National Science Foundation for their support under grants NSF GJ-36393 and DCR74-23679. I would also like to thank Connie Slovak for her excellent typing and Mr. Stanley Zundo for preparing the drawings that appear in this thesis. Finally, I would like to acknowledge the support and encouragement of Mary Stoddard Manteuffel throughout my graduate studies.

TABLE OF CONTENTS

Chapter
1. INTRODUCTION AND PRELIMINARIES
   1.1 Introduction
   1.2 Type of System
   1.3 Type of Iteration
2. TCHEBYCHEF ITERATION IN THE COMPLEX PLANE
   2.1 The Tchebychef Polynomials
   2.2 Optimal Properties of the Tchebychef Polynomials
   2.3 Convergence of $P_n^{(j)}(\lambda)$
   2.4 The Tchebychef Iteration
3. CHOOSING OPTIMAL PARAMETERS
   3.1 The Mini-max Problem
   3.2 Restrictions
   3.3 Real Arithmetic
4. SOLVING THE MINI-MAX PROBLEM
   4.1 The Alternative Theorem
   4.2 Minimum Point of a Single Function
   4.3 Pair-wise Best Point
   4.4 Three-way Point
   4.5 The Algorithm
5. ADAPTIVE PROCEDURE
   5.1 Modified Power Method
   5.2 Procedure and Example
6. IMPLEMENTATION AND RESULTS
   6.1 Implementation
   6.2 Competing Algorithms
   6.3 Results
   6.4 Summary
LIST OF REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C
VITA

1. INTRODUCTION AND PRELIMINARIES

1.1 Introduction

In applications such as reservoir problems, reactor studies, and numerical weather forecasting, one often encounters partial differential equation boundary value problems. A standard technique for solving these problems is to approximate the solution of the differential equation at a discrete set of points by the solution of a linear system, Ax = b. In general, such systems are large, sparse, and often nonsymmetric. In this thesis I will discuss an iterative method for solving large sparse nonsymmetric linear systems. In applications, systems arising from partial differential equations may be quite large, up to 1,000,000 unknowns, with as few as 5 to 10 nonzero entries per equation. The size and sparsity of these systems motivate the use of an iterative method.
Direct methods require a prohibitive amount of storage. The iteration to be discussed is a polynomial based gradient method (Rutishauser [7]) based on the scaled and translated Tchebychef polynomials. In the following I will show that when applied to systems whose spectrum lies in the right half complex plane, the Tchebychef iteration is optimal, in a certain sense, over all polynomial based gradient methods. Further, I will show how the optimal iteration parameters associated with the Tchebychef iteration can be computed from knowledge of the spectrum of the system.

A drawback to most iterative methods is that the choice of optimal parameters depends upon prior knowledge of the eigenvalue structure of the system. In this thesis I will develop an adaptive procedure for determining the spectrum of the matrix during execution at a relatively low cost. Optimal iteration parameters can be computed dynamically with little prior knowledge of the spectrum. The adaptive Tchebychef algorithm has the following properties:

1. Little a priori knowledge of the spectrum is required.
2. The method does not depend upon any special structure of the matrix A.
3. The method is sensitive to the condition of the matrix $A$, rather than the matrix $A^T A$.
4. The method requires only one matrix-vector multiplication per iterative step.
5. The method can be used in conjunction with factorization methods.
6. The method may be used on singular systems.

Few iterative methods have been developed to treat nonsymmetric systems. None share these properties with the adaptive Tchebychef algorithm. In a variety of test problems, the Tchebychef method converged considerably faster than two competing methods. Some work on nonsymmetric systems has been done by Wrigley [25], Kjellberg [16], Kincaid [15], and Faddeev [8]. This work is an extension of work done by Diamond [3] and Wrigley [25]. Many of the ideas for this thesis came from work done by Hestenes [12], Rutishauser [7], Stiefel [7], [12], [19], Golub [10], and Varga [10], [21].

The main ideas of this thesis are as follows: Chapter 1 presents the background of polynomial based gradient methods. Section 1.2 describes the type of system to be considered. A crude region containing the spectrum of the system is determined. Section 1.3 outlines the general polynomial based gradient method and establishes criteria for choosing a sequence of polynomials upon which to base an iterative method. Chapter 2 deals with the scaled and translated Tchebychef polynomials in the complex plane. Section 2.1 describes the properties of the scaled and translated Tchebychef polynomials. It is shown that the asymptotic behavior of the Tchebychef polynomials is related to ellipses in the complex plane. Sections 2.2, 2.3, and 2.4 show that the scaled and translated Tchebychef polynomials fit the criteria established in Section 1.3. The iteration based on the Tchebychef polynomials is described in Section 2.4, where it is shown that the iteration parameters can be determined from the scaling and translating parameters c and d. The problem of choosing the parameters c and d is outlined in Chapter 3. In Section 3.1 a measure of the rate of convergence is established at each eigenvalue as a function of the parameters c and d. A mini-max problem is constructed in terms of the eigenvalues and the parameters c and d whose solution represents the "best" choice of parameters c and d. The mini-max problem is refined and reformulated in Section 3.2.
In Section 3.3 it is shown that for a real matrix A the iteration can be carried out in real arithmetic. Chapter 4 is devoted to finding the solution to the mini-max problem constructed in Chapter 3. Section 4.1 shows that the solution of the mini-max problem can be found in terms of the solution of a mini-max problem when the spectrum of the matrix A contains 1, 2, or 3 complex conjugate pairs of eigenvalues. The remainder of the chapter is devoted to solving the mini-max problem in these special cases. Section 4.5 contains the algorithm for solving the mini-max problem. Chapter 5 describes a method of estimating the eigenvalue structure of the matrix A during execution and a dynamic procedure for improving the choice of parameters c and d. Section 5.1 deals with an adaptation of the modified power method for simultaneous estimation of several eigenvalues based on the sequence of residuals (Wilkinson [24]). Section 5.2 shows how each new eigenvalue estimate can be added to previous estimates to yield an improved choice of parameters c and d. Chapter 6 gives a discussion of the implementation of the algorithm followed by a discussion of competing methods and experimental results. It is shown that the adaptive Tchebychef algorithm requires fewer iterative steps to produce convergence and half as much work per step than two competing methods: Bidiagonalization (Golub and Kahan [11]) and the method of Conjugate Gradients (Hestenes and Stiefel [12]) applied to the equivalent system $A^T A x = A^T b$. The appendix contains a listing of the algorithm, coded in FORTRAN.

1.2 Type of System

The adaptive Tchebychef algorithm is restricted to systems with eigenvalues in the right half (left half) complex plane. Such systems arise naturally from applications to partial differential equations, especially in conjunction with finite element methods (Zienkiewicz [27]). In the remainder of this thesis the matrix A will be a real valued matrix with eigenvalues, $\lambda_i$, in the open right half plane. The possibility of zero eigenvalues will be discussed in Chapter 6.

To establish bounds on the spectrum of such a matrix, let

$$M = \frac{A + A^T}{2} \quad\text{and}\quad N = \frac{A - A^T}{2}.$$

It is clear that M is symmetric, N is antisymmetric, and A = M + N. Suppose $\lambda$ is an eigenvalue of A corresponding to the eigenvector $v = x + iy$. Then,

$$\lambda = \frac{\langle Av, v\rangle}{\langle v, v\rangle} = \frac{\langle Mv, v\rangle + \langle Nv, v\rangle}{\langle v, v\rangle}.$$

Because M is symmetric, we have

$$\langle Mv, v\rangle = \langle Mx, x\rangle + i\langle My, x\rangle - i\langle Mx, y\rangle + \langle My, y\rangle = \langle Mx, x\rangle + \langle My, y\rangle.$$

Since N is antisymmetric, $\langle Nx, x\rangle = \langle Ny, y\rangle = 0$, and we have

$$\langle Nv, v\rangle = \langle Nx, x\rangle + i\langle Ny, x\rangle - i\langle Nx, y\rangle + \langle Ny, y\rangle = 2i\langle Ny, x\rangle.$$

Thus, we have

$$\mathrm{Re}(\lambda) = \frac{\langle Mx, x\rangle + \langle My, y\rangle}{\|x\|^2 + \|y\|^2}$$

and

$$|\mathrm{Im}(\lambda)| = \frac{2|\langle Ny, x\rangle|}{\|x\|^2 + \|y\|^2} \le \frac{2\|N\|\,\|x\|\,\|y\|}{\|x\|^2 + \|y\|^2} \le \|N\|.$$

Since M is symmetric, its spectrum is contained in some interval, [a,b], on the real line. Since N is antisymmetric, its spectrum lies along the imaginary axis and $\|N\|$ is equal to the spectral radius of N. If $\lambda$ is an eigenvalue of A, then

$$\mathrm{Re}(\lambda) \in [a, b], \qquad |\mathrm{Im}(\lambda)| \le \|N\|.$$

Using Gershgorin's circle theorem, bounds for the interval [a,b] and $\|N\|$ can be found in terms of the elements of the matrices M and N (Varga [22]), giving a rectangle known to contain the spectrum of the operator A (see Figure 1.1 and Figure 1.2).

[Figure 1.1 and Figure 1.2: rectangles enclosing the spectrum, for $\|N\|$ small and large relative to $|b-a|$]

Note that if $\|N\|$ is small with respect to $|b-a|$, the rectangle is like that shown in Figure 1.1. If $\|N\|$ is large with respect to $|b-a|$, then the rectangle is like that shown in Figure 1.2. Both rectangles are more closely approximated by an ellipse than a circle. The importance of this observation will be shown later. If M is positive definite, then the spectrum of A lies in the right half plane.
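Because the rectangle comes directly from the splitting A = M + N and Gershgorin's theorem, it is cheap to compute. The following is a minimal sketch (Python/NumPy; the function name is illustrative and this is not the thesis's FORTRAN listing):

```python
import numpy as np

def spectrum_rectangle(A):
    """Rectangle [a, b] x [-s, s] containing the spectrum of A,
    from the splitting A = M + N and Gershgorin's theorem."""
    M = (A + A.T) / 2.0          # symmetric part
    N = (A - A.T) / 2.0          # antisymmetric part
    # Gershgorin discs for M: centers m_ii, radii = off-diagonal row sums
    radii = np.sum(np.abs(M), axis=1) - np.abs(np.diag(M))
    a = np.min(np.diag(M) - radii)   # lower bound on Re(lambda)
    b = np.max(np.diag(M) + radii)   # upper bound on Re(lambda)
    # N has zero diagonal, so its discs are centered at 0 and the
    # spectral radius of N (= ||N||) is at most the largest row sum
    s = np.max(np.sum(np.abs(N), axis=1))  # bound on |Im(lambda)|
    return a, b, s

# Example: a small nonsymmetric, diagonally dominant matrix
A = np.array([[4.0, 1.0, 0.0],
              [-1.0, 5.0, 1.0],
              [0.0, -1.0, 4.0]])
a, b, s = spectrum_rectangle(A)
print(f"Re(lambda) in [{a}, {b}], |Im(lambda)| <= {s}")
```

For the example matrix the sketch reports $\mathrm{Re}(\lambda) \in [4, 5]$ and $|\mathrm{Im}(\lambda)| \le 2$, a crude rectangle of the kind shown in Figure 1.1.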
In particular, if A is diagonally dominant with positive real diagonal elements, then M will be positive definite. This type of matrix is often encountered in application. Of particular interest are positive definite systems perturbed by a small nonsymmetric part, yielding a system with its spectrum contained in a region like that shown in Figure 1.1. It is this type of system for which the Tchebychef algorithm has the greatest advantage over competing methods.

1.3 Type of Iteration

Suppose we wish to solve the system Ax = b. If $x_0$ is an initial guess at the solution, x, we can compute the initial residual

$$r_0 = A(x - x_0) = b - Ax_0,$$

and use this information to make a new guess, $x_1$. Again we can compute the residual

$$r_1 = A(x - x_1) = b - Ax_1.$$

Utilizing all of the information at hand we can use both $r_0$ and $r_1$ to make a new guess, $x_2$. This leads to an iteration with general step

$$x_{n+1} = x_n + \sum_{i=0}^{n} \gamma_{ni}\, r_i,$$

where the $\gamma_{ni}$'s are constants. At each step we add a linear combination of the previous residuals. Let $e_n = x - x_n$ be the error at the n-th step. From the general step we can write

$$e_{n+1} = e_n - \sum_{i=0}^{n} \gamma_{ni}\, r_i = e_n - A \sum_{i=0}^{n} \gamma_{ni}\, e_i.$$

Lemma 1.1 The error at the n-th step is given by

$$e_n = P_n(A)\, e_0,$$

where $P_n(z)$ is a polynomial of degree n such that $P_n(0) = 1$.

Proof The proof will proceed by induction. For n = 1 we have

$$e_1 = e_0 - \gamma_{00} A e_0 = (I - \gamma_{00} A) e_0,$$

so that we may define $P_1(z) = (1 - \gamma_{00} z)$, and we have $P_1(0) = 1$. Assume the conclusion is true for $i \le n$; then,

$$e_{n+1} = e_n - A \sum_{i=0}^{n} \gamma_{ni}\, e_i = e_n - \gamma_{nn} A e_n - A \sum_{i=0}^{n-1} \gamma_{ni}\, e_i = \left[(1 - \gamma_{nn} A) P_n(A) - A \sum_{i=0}^{n-1} \gamma_{ni}\, P_i(A)\right] e_0.$$

Let

$$P_{n+1}(z) = (1 - \gamma_{nn} z) P_n(z) - z \sum_{i=0}^{n-1} \gamma_{ni}\, P_i(z).$$

Then, $P_{n+1}(z)$ is a polynomial of degree n+1 and $P_{n+1}(0) = 1$. This proves the lemma.

Any sequence of polynomials, $\{P_n(z)\}$, with $P_n(0) = 1$ can be generated by choosing the constants, $\{\gamma_{ni}\}$. We would like to choose the sequence of polynomials so that

$$\|e_n\| \le \|P_n(A)\|\, \|e_0\|$$

is small. Let us examine $P_n(A)$. If A is diagonalizable, then the Jordan form of A is a diagonal matrix (Birkhoff and MacLane [1]). There exists a nonsingular S such that $A = S^{-1} J S$, where J is diagonal with the eigenvalues of A on the diagonal. Now

$$P_n(A) = P_n(S^{-1} J S) = S^{-1} P_n(J) S,$$

and since J is a diagonal matrix, $P_n(J)$ is diagonal with diagonal entries $P_n(\lambda_i)$. This yields the following result:

Theorem 1.2 If A is diagonalizable, then $\|P_n(A)\| \to 0$ as $n \to \infty$ if and only if $P_n(\lambda_i) \to 0$ as $n \to \infty$ for every eigenvalue, $\lambda_i$, of A.

Proof The proof is clear from the discussion above.

Suppose A is not diagonalizable; that is, suppose A has nonlinear elementary divisors. The Jordan form of A has nontrivial Jordan blocks. We have $A = S^{-1} J S$ where $J = \mathrm{diag}(J_1, \ldots, J_k)$ and

$$J_i = \begin{pmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{pmatrix}$$

is the Jordan block associated with the eigenvalue $\lambda_i$ of multiplicity $d_i$. In this case

$$P_n(A) = P_n(S^{-1} J S) = S^{-1} P_n(J) S,$$

and $P_n(J) = \mathrm{diag}(P_n(J_1), \ldots, P_n(J_k))$. Taking successive powers of $J_i$ we see that, in general,

$$J_i^s = \begin{pmatrix} \lambda_i^s & s\lambda_i^{s-1} & \frac{s(s-1)}{2!}\lambda_i^{s-2} & \cdots & \binom{s}{d_i-1}\lambda_i^{s-d_i+1} \\ & \lambda_i^s & s\lambda_i^{s-1} & \ddots & \vdots \\ & & \ddots & \ddots & \\ & & & & \lambda_i^s \end{pmatrix}.$$

The superdiagonal terms act like derivatives. Since $P_n(J_i)$ is a linear combination of powers of $J_i$, we have

$$P_n(J_i) = \begin{pmatrix} P_n(\lambda_i) & P_n'(\lambda_i) & \frac{1}{2!}P_n''(\lambda_i) & \cdots & \frac{1}{(d_i-1)!}P_n^{(d_i-1)}(\lambda_i) \\ & P_n(\lambda_i) & P_n'(\lambda_i) & & \vdots \\ & & \ddots & \ddots & \\ & & & & P_n(\lambda_i) \end{pmatrix}.$$

This yields the following theorem:

Theorem 1.3 If $\lambda_i$ is an eigenvalue of A with multiplicity $d_i$, then $\|P_n(A)\| \to 0$ as $n \to \infty$ if and only if $P_n^{(j)}(\lambda_i) \to 0$ as $n \to \infty$ for every $j < d_i$, for each eigenvalue $\lambda_i$.
Proof The proof is clear from the discussion above.

When looking for a sequence of polynomials to suppress the eigenvalues of A, we must find one whose derivatives also suppress the eigenvalues of A. In light of the previous discussion, we can establish three criteria upon which to choose a sequence of polynomials:

1. We must choose $P_n(z)$, among polynomials of like degree such that $P_n(0) = 1$, to be "as small as possible" on the spectrum of A.
2. If A has nonlinear elementary divisors, then we must choose $P_n(z)$ so that its derivatives are small on the spectrum of A.
3. We must choose $P_n(z)$ to have some recursive properties so that all of the previous residuals need not be stored.

In the next chapter we will see that the scaled and translated Tchebychef polynomials fit these criteria.

2. TCHEBYCHEF ITERATION IN THE COMPLEX PLANE

The Tchebychef polynomials were discovered a century ago by the Russian mathematician Tchebychef (the spelling of which has many variations). Their importance for practical computation, however, was rediscovered about thirty years ago by C. Lanczos. Since then they have found many uses in numerical analysis (Fox [9]). The definition and basic properties of the Tchebychef polynomials in the complex plane will be discussed in Section 2.1. Sections 2.2, 2.3, and 2.4 will show how the Tchebychef polynomials meet the criteria established in Section 1.3. The gradient method based on the Tchebychef polynomials is developed in Section 2.4.

2.1 The Tchebychef Polynomials

The Tchebychef polynomials are given by:

$$T_0(z) = 1, \quad T_1(z) = z, \quad T_{n+1}(z) = 2zT_n(z) - T_{n-1}(z), \quad n \ge 1.$$

They may also be written:

$$T_n(z) = \cosh(n \cosh^{-1}(z)),$$

where the branch of $\cosh^{-1}$ with positive real part is used.

Consider the map $\eta = \cosh(z)$. Let $z = x + iy$, $\eta = u + iv$. Then $\cosh(z) = \cosh(x+iy) = u + iv = \eta$. Using the expansion formula for the cosh, we have

$$\cosh(x+iy) = \cosh(x)\cos(y) + i\sinh(x)\sin(y),$$

or

$$u = \cosh(x)\cos(y), \qquad v = \sinh(x)\sin(y).$$

Suppose we fix $x > 0$ and allow y to vary. Then u and v satisfy

$$\frac{u^2}{\cosh^2(x)} + \frac{v^2}{\sinh^2(x)} = 1.$$

That is, the line x = constant is mapped onto an ellipse with semi-major axis $|\cosh(x)|$, semi-minor axis $|\sinh(x)|$, and foci at 1 and -1 (see Figure 2.1). This map has period $2\pi i$. Since $\cosh(x)$ and $\sinh(x)$ are increasing for $x > 0$, if $0 < a < b$, then the line x = a is mapped onto an ellipse inside and confocal to the ellipse that the line x = b is mapped onto (see Figure 2.1). If x = 0 we have $u = \cos(y)$, $v = 0$, and the ellipse has collapsed onto the real line segment [-1, 1].

Because of periodicity, the map $\eta = \cosh(z)$ takes the region shown in Figure 2.2 onto the entire $\eta$-plane. Each vertical line in this region is mapped onto an ellipse in the $\eta$-plane. This region is the branch of $\cosh^{-1}$ used in the definition of the Tchebychef polynomials. The function $\cosh^{-1}$ may also be written in log form:

$$\cosh^{-1}(w) = \ln(w + (w^2 - 1)^{1/2}).$$

[Figure 2.1: the map $\eta = \cosh(z)$ taking vertical lines in the z-plane onto confocal ellipses in the $\eta$-plane]

[Figure 2.2: the half-strip $x \ge 0$, $0 \le y < 2\pi$, the branch of $\cosh^{-1}$]

Care must be taken when choosing the branch of the square root. The branch chosen depends on the argument w and should be chosen so that $\cosh(\cosh^{-1}(w)) = w$. The n-th Tchebychef polynomial, $T_n(z) = \cosh(n \cosh^{-1}(z))$, maps an ellipse onto a vertical line segment in the region shown in Figure 2.2, multiplies this line segment by n, and maps the new line segment back onto another ellipse (see Figure 2.1). Since the new line segment cuts through n branches of $\cosh^{-1}$, it is wrapped around the new ellipse n times.
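The cosh form of the definition is easy to exercise numerically. A minimal sketch (Python/NumPy; it assumes NumPy's complex `arccosh`, whose principal branch has nonnegative real part, and the function names are illustrative) compares the cosh form with the three-term recurrence on points of an ellipse with foci $\pm 1$:

```python
import numpy as np

def cheb_recurrence(n, z):
    """T_n(z) by the three-term recurrence; works for complex z."""
    t_prev, t = np.ones_like(z), z
    if n == 0:
        return t_prev
    for _ in range(1, n):
        t_prev, t = t, 2 * z * t - t_prev
    return t

def cheb_cosh(n, z):
    """T_n(z) = cosh(n * acosh(z)), using the branch of acosh
    with nonnegative real part (sign flipped if necessary)."""
    w = np.arccosh(z + 0j)            # principal branch
    w = np.where(w.real < 0, -w, w)   # enforce Re(w) >= 0
    return np.cosh(n * w)

# Points on the image of the vertical line x = 1.0: an ellipse
# with semi-axes cosh(1), sinh(1) and foci at 1 and -1
y = np.linspace(0, 2 * np.pi, 7)
z = np.cosh(1.0 + 1j * y)
print(np.max(np.abs(cheb_recurrence(5, z) - cheb_cosh(5, z))))  # ~1e-13
```

Because cosh is an even function, the sign of the branch does not affect the value, and the two forms agree to rounding error.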
The degenerate ellipse, the line segment [-1, 1], is mapped onto the line segment $[0, i\pi]$, multiplied by n, and mapped back onto the line segment [-1, 1]. Since it is wrapped around n times, $T_n(z)$ has n zeros on the line segment [-1, 1].

To establish some notation, let $\mathcal{F}(d, c)$ be the family of ellipses centered at d with foci at d+c and d-c, and let $F_j(d, c)$ denote a member of $\mathcal{F}(d, c)$.

Lemma 2.1 For each member $F_j(0,1)$ of $\mathcal{F}(0,1)$, $T_n$ maps $F_j(0,1)$ onto some member $F_{j'}(0,1)$ of $\mathcal{F}(0,1)$.

Proof The proof follows from the discussion above.

Now consider the scaled Tchebychef polynomials, $T_n(z)/T_n(z_0)$. These polynomials exhibit an asymptotic behavior.

Lemma 2.2 If $C_n(z) = \frac{T_n(z)}{T_n(z_0)}$, $z_0 \notin [-1, 1]$, then for large n

$$C_n(z) \approx e^{n[\cosh^{-1}(z) - \cosh^{-1}(z_0)]}.$$

Proof From the definition of the cosh, we have

$$C_n(z) = \frac{e^{n\cosh^{-1}(z)} + e^{-n\cosh^{-1}(z)}}{e^{n\cosh^{-1}(z_0)} + e^{-n\cosh^{-1}(z_0)}}.$$

Since the branch of $\cosh^{-1}$ with positive real part is used, the result follows.

For large n, $C_n(z)$ takes on the form $\rho^n$, which motivates the following definition.

Definition Let

$$r(z) = \lim_{n \to \infty} |C_n(z)|^{1/n}$$

be the asymptotic convergence factor of $C_n(z)$ at the point z.

The asymptotic convergence factor, which will be referred to as the convergence factor, is related to the rate of convergence as defined by Young [26] and Varga [22] in that the rate of convergence equals $-\ln(r(z))$. From the lemma above we have

$$r(z) = |e^{\cosh^{-1}(z) - \cosh^{-1}(z_0)}| = e^{\mathrm{Re}(\cosh^{-1}(z)) - \mathrm{Re}(\cosh^{-1}(z_0))}.$$

There is a relationship between the convergence factor and the members of $\mathcal{F}(0,1)$, the family of ellipses with foci at 1 and -1.

Lemma 2.3 If $z_0 \in F_0(0,1)$, $z_i \in F_i(0,1)$, $z_j \in F_j(0,1)$, then

$$r(z_i) < r(z_j) \iff F_i(0,1) \subset F_j(0,1),$$
$$r(z_i) = r(z_j) \iff F_i(0,1) = F_j(0,1),$$
$$r(z) = 1 \iff z \in F_0(0,1).$$

Proof Since $r(z) = e^{\mathrm{Re}(\cosh^{-1}(z)) - \mathrm{Re}(\cosh^{-1}(z_0))}$, the result follows from the previous lemmas.

Theorem 2.4 Let $C_n(z) = \frac{T_n(z)}{T_n(z_0)}$, $z_0 \notin [-1, 1]$. If $F_0(0,1)$ is the member of the family $\mathcal{F}(0,1)$ passing through $z_0$, then

$$\lim_{n\to\infty} C_n(z) = \begin{cases} 0 & \text{if } z \text{ is inside } F_0(0,1), \\ \infty & \text{if } z \text{ is outside } F_0(0,1). \end{cases}$$

Proof From Lemma 2.3 we see that $r(z) < 1$ if z is inside $F_0(0,1)$ and $r(z) > 1$ if z is outside $F_0(0,1)$. Since $|C_n(z)| \approx r(z)^n$, the result follows.

Consider the transformation $z = \frac{d-\lambda}{c}$, $z_0 = \frac{d}{c}$, where d and c are any complex numbers. Let

$$P_n(\lambda) = C_n(z) = \frac{T_n(\frac{d-\lambda}{c})}{T_n(\frac{d}{c})}.$$

The polynomials $P_n(\lambda)$ are called the scaled and translated Tchebychef polynomials. Notice that $P_n(0) = 1$ as is required for use in a gradient method. The choice of d and c determines a family of ellipses, $\mathcal{F}(d, c)$, with foci at d+c and d-c (see Figure 2.3). Let $F_0(d,c) \in \mathcal{F}(d,c)$ be the member of the family passing through the origin. The transformation maps members of $\mathcal{F}(d, c)$ in the $\lambda$-plane onto members of $\mathcal{F}(0,1)$ in the z-plane.

[Figure 2.3: the transformation $z = (d-\lambda)/c$ mapping $F_0(d,c)$ in the $\lambda$-plane onto $F_0(0,1)$ in the z-plane]

As before, let

$$r(\lambda) = \lim_{n\to\infty} |P_n(\lambda)|^{1/n}$$

be the asymptotic convergence factor of $P_n(\lambda)$ at the point $\lambda$. We have from above

$$r(\lambda) = e^{\mathrm{Re}(\cosh^{-1}(\frac{d-\lambda}{c})) - \mathrm{Re}(\cosh^{-1}(\frac{d}{c}))}.$$

The relationship between the members of $\mathcal{F}(d, c)$ and the convergence properties of the polynomials $P_n(\lambda)$ is given in Theorem 2.5.

Theorem 2.5 If $P_n(\lambda) = \frac{T_n(\frac{d-\lambda}{c})}{T_n(\frac{d}{c})}$, then

1. $$\lim_{n\to\infty} P_n(\lambda) = \begin{cases} 0 & \text{if } \lambda \text{ is inside } F_0(d,c), \\ \infty & \text{if } \lambda \text{ is outside } F_0(d,c). \end{cases}$$

If $\lambda_i \in F_i(d,c)$, $\lambda_j \in F_j(d,c)$, then

2. $$r(\lambda_i) < r(\lambda_j) \iff F_i(d,c) \subset F_j(d,c),$$
$$r(\lambda_i) = r(\lambda_j) \iff F_i(d,c) = F_j(d,c),$$
$$r(\lambda) = 1 \iff \lambda \in F_0(d,c).$$
Proof Since the transformation maps members of $\mathcal{F}(d, c)$ in the $\lambda$-plane onto members of $\mathcal{F}(0,1)$ in the z-plane, the result follows from Theorem 2.4 and Lemma 2.3.

2.2 Optimal Properties of the Tchebychef Polynomials

The first criterion mentioned in Section 1.3 suggests that when choosing a sequence of polynomials upon which to base an iterative method, it is desirable to choose polynomials that are small on the spectrum of the matrix A. Since the spectrum of A is seldom known, it is more practical to choose polynomials that are small on a region containing the spectrum of A. If the region is bounded by a circle or an ellipse, the scaled and translated Tchebychef polynomials have certain optimal properties. Much is known of the optimal properties of the monic scaled and translated Tchebychef polynomials over all monic polynomials of like degree on regions bounded by ellipses (Hille [13], Walsh [23]). Similar results, but not as strong, can be shown for polynomials normalized at the origin.

Definition Let $S_n = \{$all polynomials, $s_n(\lambda)$, of degree n such that $s_n(0) = 1\}$. The elements of $S_n$ are said to be normalized at the origin.

Theorem 2.6 Let E be a closed and bounded infinite set in the complex plane. There exists a unique $t_n \in S_n$ such that

$$\max_{z \in E} |t_n(z)| = \min_{s_n \in S_n} \max_{z \in E} |s_n(z)|.$$

Proof Omitted (Hille [13]).

If the region E is bounded by an ellipse, a circle being a special case of an ellipse, then the maximum will occur on the boundary. Using the notation of Section 2.1, let $F_a(d, c)$ be the member of the family $\mathcal{F}(d, c)$ with semi-major axis $a > 0$. Instead of taking the maximum over the entire region we may take the maximum over the boundary, $F_a(d, c)$. The circle with center d and radius a is denoted $F_a(d, 0)$. If the spectrum of A is contained in a region that is bounded by a circle that does not include the origin in its interior, we have the following result.

Theorem 2.7 Suppose $F_a(d, 0)$ does not include the origin in its interior. Let $a < |d|$. The unique polynomial $t_n \in S_n$ such that

$$\max_{\lambda \in F_a(d,0)} |t_n(\lambda)| = \min_{s_n \in S_n} \max_{\lambda \in F_a(d,0)} |s_n(\lambda)|$$

is given by

$$t_n(\lambda) = \left(\frac{d-\lambda}{d}\right)^n.$$

Proof Suppose $q_n \in S_n$ and

$$\max_{\lambda \in F_a(d,0)} |q_n(\lambda)| < \max_{\lambda \in F_a(d,0)} |t_n(\lambda)| = \left(\frac{a}{|d|}\right)^n.$$

Then, $|q_n(\lambda)| < |t_n(\lambda)| = (\frac{a}{|d|})^n$ for every $\lambda \in F_a(d, 0)$. By Rouché's theorem the polynomial $t_n(\lambda) - q_n(\lambda)$ has the same number of zeros inside $F_a(d, 0)$ as the polynomial $t_n(\lambda)$ does. Notice that $t_n(0) - q_n(0) = 0$. Since $\lambda = 0$ is not in the interior of $F_a(d, 0)$, $t_n(\lambda) - q_n(\lambda)$ is a polynomial of degree n with n+1 roots. We can conclude that $t_n(\lambda) = q_n(\lambda)$, and the theorem is proved.

If the region is bounded by an ellipse with real foci that does not contain the origin in its interior, we have the following result due to Clayton (Wrigley [25]).

Theorem 2.8 Let $0 < c < a < d$. The unique polynomial $t_n \in S_n$ such that

$$\max_{\lambda \in F_a(d,c)} |t_n(\lambda)| = \min_{s_n \in S_n} \max_{\lambda \in F_a(d,c)} |s_n(\lambda)|$$

is given by

$$t_n(\lambda) = P_n(\lambda) = \frac{T_n(\frac{d-\lambda}{c})}{T_n(\frac{d}{c})},$$

the associated scaled and translated Tchebychef polynomial.

Proof Omitted (Wrigley [25]).

This result cannot be extended to d and c with complex values (an easy example can be constructed), but it can be shown to be asymptotically true. For large n, $P_n(\lambda)$ tends very quickly to the optimal polynomial in $S_n$.

Lemma 2.9 Suppose $F_a(d, c)$ does not contain the origin in its interior.
Let $t_n \in S_n$ be the unique polynomial such that

$$\max_{\lambda \in F_a(d,c)} |t_n(\lambda)| = \min_{s_n \in S_n} \max_{\lambda \in F_a(d,c)} |s_n(\lambda)|;$$

then,

$$\min_{\lambda \in F_a(d,c)} |P_n(\lambda)| \le \max_{\lambda \in F_a(d,c)} |t_n(\lambda)| \le \max_{\lambda \in F_a(d,c)} |P_n(\lambda)|.$$

Proof The second inequality is true by hypothesis. Suppose that

$$\max_{\lambda \in F_a(d,c)} |t_n(\lambda)| < \min_{\lambda \in F_a(d,c)} |P_n(\lambda)|;$$

then,

$$|t_n(\lambda)| < |P_n(\lambda)|$$

for every $\lambda \in F_a(d,c)$. By Rouché's theorem, the polynomial $P_n(\lambda) - t_n(\lambda)$ has the same number of zeros in the interior of $F_a(d, c)$ as $P_n(\lambda)$ does. $P_n(\lambda)$ has n zeros on the line segment joining the foci, d+c and d-c. Notice that $P_n(0) - t_n(0) = 0$. Since $\lambda = 0$ is not in the interior of $F_a(d, c)$, $P_n(\lambda) - t_n(\lambda)$ is a polynomial of degree n with n+1 zeros. We can conclude that $P_n(\lambda) = t_n(\lambda)$, and the lemma is proved.

Theorem 2.10 Suppose $F_a(d, c)$ does not include the origin in its interior. Let $t_n \in S_n$ be the unique polynomial such that

$$\max_{\lambda \in F_a(d,c)} |t_n(\lambda)| = \min_{s_n \in S_n} \max_{\lambda \in F_a(d,c)} |s_n(\lambda)|.$$

Let

$$M(s_n) = \max_{\lambda \in F_a(d,c)} |s_n(\lambda)|;$$

then,

$$\lim_{n\to\infty} [M(t_n)]^{1/n} = \lim_{n\to\infty} [M(P_n)]^{1/n}.$$

Proof Let

$$m(P_n) = \min_{\lambda \in F_a(d,c)} |P_n(\lambda)|.$$

From Lemma 2.9 we know that $m(P_n) \le M(t_n) \le M(P_n)$. It is sufficient to show that

$$\lim_{n\to\infty} [m(P_n)]^{1/n} = \lim_{n\to\infty} [M(P_n)]^{1/n}.$$

By Theorem 2.5 all points on the ellipse $F_a(d, c)$ have the same convergence factor; thus, we have

$$r(\lambda) = \lim_{n\to\infty} [m(P_n)]^{1/n} = \lim_{n\to\infty} [M(P_n)]^{1/n}$$

for every $\lambda \in F_a(d, c)$. This proves the theorem.

Because of the nature of the cosh function, the asymptotic convergence factor is achieved very quickly; thus, the scaled and translated Tchebychef polynomials tend very quickly to the optimal polynomial in $S_n$. As the focal length c approaches 0, the ellipse $F_a(d, c)$ is deformed into the circle $F_a(d, 0)$. To show that the result for circles is compatible with the result for ellipses we have the following lemma:

Lemma 2.11 If $P_n(\lambda) = \frac{T_n(\frac{d-\lambda}{c})}{T_n(\frac{d}{c})}$, then

$$\lim_{c\to 0} P_n(\lambda) = \left(\frac{d-\lambda}{d}\right)^n.$$

Proof By the definition of cosh, we have

$$P_n(\lambda) = \frac{T_n(\frac{d-\lambda}{c})}{T_n(\frac{d}{c})} = \frac{e^{n\cosh^{-1}(\frac{d-\lambda}{c})} + e^{-n\cosh^{-1}(\frac{d-\lambda}{c})}}{e^{n\cosh^{-1}(\frac{d}{c})} + e^{-n\cosh^{-1}(\frac{d}{c})}},$$

which tends to

$$e^{n[\cosh^{-1}(\frac{d-\lambda}{c}) - \cosh^{-1}(\frac{d}{c})]}$$

as c decreases. By the log form of $\cosh^{-1}$ ($\cosh^{-1}(w) = \ln(w + (w^2-1)^{1/2})$) this is

$$\left(\frac{(d-\lambda) + ((d-\lambda)^2 - c^2)^{1/2}}{d + (d^2 - c^2)^{1/2}}\right)^n,$$

which tends to $(\frac{d-\lambda}{d})^n$ as c decreases. This proves the lemma.

2.3 Convergence of $P_n^{(j)}(\lambda)$

Recall from Section 1.3 that if the matrix A has nonlinear elementary divisors, the derivatives of the sequence of polynomials must also converge to zero on the eigenvalues of A. In this section it will be shown that each derivative sequence of $P_n(\lambda)$ has the same region of convergence as the sequence $P_n(\lambda)$ does.

Theorem 2.12 Let $P_n(\lambda) = \frac{T_n(\frac{d-\lambda}{c})}{T_n(\frac{d}{c})}$. If $\lambda$ is inside $F_0(d, c)$, the member of $\mathcal{F}(d, c)$ passing through the origin, then

$$\lim_{n\to\infty} P_n^{(j)}(\lambda) = 0, \quad \text{for every } j.$$

Proof Suppose $\lambda$ is inside $F_0(d, c)$. Since the interior of $F_0(d, c)$ is an open set, there is some $\delta > 0$ such that $\Gamma = \{t : |t - \lambda| = \delta\}$ is inside $F_0(d, c)$. By Cauchy's integral formula we have

$$P_n^{(j)}(\lambda) = \frac{j!}{2\pi i} \int_\Gamma \frac{P_n(t)}{(t-\lambda)^{j+1}}\, dt.$$

Let $|P_n(\lambda_n)| = \max_{t \in \Gamma} |P_n(t)|$; then,

$$|P_n^{(j)}(\lambda)| \le \frac{j!\, |P_n(\lambda_n)|}{\delta^j}.$$

Since $\Gamma$ is in the interior of $F_0(d, c)$ we know that $|P_n(\lambda_n)| \approx r^n$ for some $r < 1$, and the theorem is proved.
One can, in fact, pick a new 5 for each n and get IP^OOI , n+l v n v ' n-l v " then for n > we have ViM ■ C n G n_1 "n+1 c t A-) n+l v c ; "n-1 c T £) n v c \T(£±) 2 | T (^) T - (^) 2 n c c n c n-1 c c T x '(5) T A~) n+l v c' n+l v cr n+1 v c ' T (-) " 2 nV \ P (X) + n "2 * T (*r c n c P 00 - n n-l x c T , (-) . n+l v c y J T (-) n+lV _ T (-) _ n+lV. p n oo n-1 We have for n > P 00 - P . 00 n ' n+1 "0 T n (|) d n c X P (x) + n " 2 It (*) ' 1 c n c T ,£) _ n+1 v c _, P 00 n C T . (S) n+1 v c ' + "t _(V n-1 cr ^_ n+l v c y _ p n 00 n-1 2 nV ° T A n+lV_ X p 00 + n n-lV _ T n + iy_ (P n- a 00 - 30 We can write Dx = (P (A) - P _ (A)) e„ n v n v n+1 K J ' "t (*) n v c T A±) n+l^cr A P n (A) e Q + T _(*) n-1 c n+lV (P . (A) - P (A)) e, v n-1 n v '' ( Notice that and that r = A e = AP (A) e_ , n n n ' Dx _ = (P '(A)- - P (A)) e^ n-1 v n-1 n We have then Dx = n 2 c f T (") 1 n c r + n n-1 c n+1 v c ' Dx . , n-1 ' T ,A ^ n+1 c _ for n < 0. For n = we have Since we have Dx Q = (P Q (A) - P 1 (A)) e Q . P Q (\) - P x (\) = 1 -• (1 - |) = \, n 1 a 1 ux o ~ I o " d r o ' The three term iteration becomes: given initial guess x_ and parameters d and c, let r o = b - A x o ' Dx o = I r o > x i - x o + ^o • For n > 0, let r = b - A x , n n ' 2 nV n-lV n Dx = y- r + 3— Dx . , n c ( d } n A) n - X n+1 c n+1 c x , = x + Dx n+1 n n The coefficients can be recursively generated and the iteration can be carried out with only x., r., Dx. in storage. i 7 l l If the spectrum of the matrix A lies in the right half plane, then it can be enclosed in an ellipse that does not contain the origin in its interior. The associated scaled and translated Tchebychef polynomials meet the criteria established in Section 1.3. They have minimal maximum modulous properties on ellipses, their derivative sequences also converge, and the iteration can be carried out by a three term recursion . The remainder of this thesis will be devoted to implementing an iteration based upon the Tchebychef polynomials. 3- CHOOSING OPTIMAL PARAMETERS The spectrum of the matrix A can be enclosed in many different ellipses. In fact, given any family of ellipses, ^(d, c), there is some member of the family that contains the spectrum of A in its interior. If the spectrum of A lies on the interior of F„(d, c), the member of the family *^(d, c) passing through the origin, then the iteration based on the associated scaled and translated Tchebychef polynomials will converge. We would like to choose ^(d, c) so that this convergence is optimal in some sense. In Section 5-1 it will be shown that the choice of parameters which yield the best rate of convergence can be found as the solution of a mini -max problem. The mini -max problem will be restricted and restated in Section 5-2. In Section 3-3 it will be shown that if the matrix is real, the iteration can be carried out in real arithmetic. 3«1 The Mini -max Problem Suppose d and c have been chosen. Let F (d, c) be the smallest member of the family f(d, c) containing the spectrum of A in the closure of its interior. There is some eigenvalue, say \ , such that \ g eF (d, c) From Section 2.1 we know that F (d, c) is associated with a convergence factor, r(\ o ), and that r(X ) = max r(\ ) , S >^i X because all of fcfae other >'i; ftwalues are inside or on (4»c) Recall that r(\) is also a function of d and c. We have r(M - |e [cosh" (— — ) -cosh" (-) 1 v c c If we use the log form of the cosh"' , then r(\) = ,d-\. 
$$r(\lambda) = \left|\frac{(d-\lambda) + ((d-\lambda)^2 - c^2)^{1/2}}{d + (d^2 - c^2)^{1/2}}\right|.$$

One way to optimize the choice of d and c is to make $r(\lambda_s)$ as small as possible. The parameters d and c will then satisfy

$$\min_{d,c} \max_{\lambda_i} r(\lambda_i) = \min_{d,c} \max_{\lambda_i} \left|\frac{(d-\lambda_i) + ((d-\lambda_i)^2 - c^2)^{1/2}}{d + (d^2 - c^2)^{1/2}}\right|.$$

A more rigorous argument which yields this same mini-max problem is as follows. With a polynomial based gradient method the error is suppressed in accordance with the equation $e_n = P_n(A)\, e_0$. The following definition of rate of convergence is used by Young [26].

Definition The rate of convergence of a polynomial based gradient method applied to the system Ax = b is

$$R(A) = -\ln\left(\lim_{n\to\infty} \|P_n(A)\|^{1/n}\right).$$

We would like to choose d and c to make R(A) as large as possible or, equivalently, to make $\lim_{n\to\infty} \|P_n(A)\|^{1/n}$ as small as possible. Let

$$M(\lambda) = e^{\cosh^{-1}(\frac{d-\lambda}{c}) - \cosh^{-1}(\frac{d}{c})}.$$

If we use the log form of $\cosh^{-1}$ this becomes

$$M(\lambda) = \frac{(d-\lambda) + ((d-\lambda)^2 - c^2)^{1/2}}{d + (d^2 - c^2)^{1/2}}.$$

From Lemma 2.2 we have

$$P_n(\lambda) \approx (M(\lambda))^n$$

for large n. Since $M(\lambda)$ is analytic in an open set containing the spectrum of A, there exists an operator M(A). The eigenvalues of M(A) are $M(\lambda_i)$ where $\lambda_i$ is an eigenvalue of A (Dunford and Schwartz [4]). We have $P_n(A) \approx (M(A))^n$ for large n, so that

$$\lim_{n\to\infty} \|P_n(A)\|^{1/n} = \lim_{n\to\infty} \|M(A)^n\|^{1/n} = \text{the spectral radius of } M(A).$$

The spectral radius of M(A) is

$$\max_{\lambda_i} |M(\lambda_i)| = \max_{\lambda_i} r(\lambda_i).$$

The choice of d and c which yields the optimal rate of convergence is the solution to the mini-max problem above. Since $r(\lambda)$ is a function of d and c as well as $\lambda$, let us write the mini-max problem as

$$\min_{d,c} \max_{\lambda_i} r(\lambda_i, d, c).$$

3.2 Restrictions

In this section we will show that if A is a real valued matrix, the mini-max problem can be restricted so that the maximum is taken over a subset of the eigenvalues and the minimum is taken over d and c such that d and $c^2$ are real.

In Section 3.1 we defined the ellipse $F_s(d, c)$ to be the smallest member of the family $\mathcal{F}(d, c)$ enclosing the spectrum of A. Since $F_s(d, c)$ is convex, the eigenvalue $\lambda_s \in F_s(d, c)$ has the property that it is on the boundary of the convex hull of the spectrum. In fact, it is a vertex of the smallest convex polygon enclosing the spectrum.

Definition Let $H = \{\lambda_i : \lambda_i$ is a vertex of the smallest convex polygon enclosing the spectrum of A$\}$. We will refer to H as the hull of the spectrum.

The elements of H completely determine the mini-max problem.

Lemma 3.1 For any d, c

$$\max_{\lambda_i} r(\lambda_i, d, c) = \max_{\lambda_i \in H} r(\lambda_i, d, c).$$

Proof Suppose there is a d and c and $\lambda_k \notin H$ such that

$$\max_{\lambda_i} r(\lambda_i, d, c) = r(\lambda_k, d, c).$$

Since $\lambda_k \notin H$, we know that $\lambda_k$ is not a vertex of the smallest convex polygon, P, enclosing the spectrum. Let $F_k(d, c)$ be the member of $\mathcal{F}(d, c)$ passing through $\lambda_k$. Since $F_k(d, c)$ is an ellipse and passes through a point that is either in the interior of P or on P between vertices, there is some vertex of P outside $F_k(d, c)$. From the definition of P, this vertex must be an eigenvalue, say $\lambda_j$. By Theorem 2.5

$$r(\lambda_j, d, c) > r(\lambda_k, d, c).$$

This contradiction proves the lemma.

If A is a real valued matrix, then the eigenvalues of A are real or appear in complex conjugate pairs. The hull, H, of the spectrum is symmetric with respect to the real line. This motivates the following theorem, the proof of which is omitted.

Theorem 3.2 If A is a real valued matrix and d and c satisfy the mini-max problem,

$$\min_{d,c} \max_{\lambda_i \in H} r(\lambda_i, d, c),$$

then the family $\mathcal{F}(d, c)$ is symmetric with respect to the real axis.

Proof Omitted.
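The convergence factor in this log form is straightforward to evaluate numerically; the only care needed is the branch of the square root, which must correspond to the branch of $\cosh^{-1}$ with positive real part, i.e. the root giving each factor the larger modulus. A minimal sketch (Python/NumPy; the function name is illustrative):

```python
import numpy as np

def conv_factor(lam, d, c2):
    """Asymptotic convergence factor r(lambda, d, c2) in the log form.
    For each factor w + (w^2 - c2)^(1/2), the square-root branch with
    the larger modulus is taken, matching Re(acosh) >= 0."""
    def big(w, s):
        return max(abs(w + s), abs(w - s))
    num = big(d - lam, np.sqrt(complex((d - lam) ** 2 - c2)))
    den = big(d, np.sqrt(complex(d ** 2 - c2)))
    return num / den
```

For example, `conv_factor(2 + 1j, 2.0, -1.0)` returns $1/(2 + \sqrt{5}) \approx 0.236$, which is the value $y_i/(x_i + (x_i^2 + y_i^2)^{1/2})$ that Chapter 4 identifies as the best possible factor for a single complex eigenvalue.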
If A is real we may restrict our search to those d and c which correspond to families of ellipses which are symmetric with respect to the real line. Such a family has foci that are either both real or are a complex conjugate pair. Since the foci are d+c and d-c, then d is real and c is either real or pure imaginary. In either case $c^2$ is real. Notice in the log form of the definition of $r(\lambda, d, c)$ in Section 3.1 that c appears only as $c^2$. Let $c2 = c^2$, and let

$$r(\lambda, d, c2) = \left|\frac{(d-\lambda) + ((d-\lambda)^2 - c2)^{1/2}}{d + (d^2 - c2)^{1/2}}\right|.$$

Since the families $\mathcal{F}(d, c)$ and $\mathcal{F}(d, -c)$ have the same foci, d+c and d-c, the parameters d and c2 uniquely determine the family $\mathcal{F}(d, c)$. For A real we may restrict d and c2 to real values. In addition we may ignore those values of d and c2 for which convergence clearly does not occur.

Definition Let $R = \{(d, c2) : 0 < d,\ c2 < d^2\}$.

Corollary 3.3 If A is a real valued matrix with eigenvalues in the right half plane the mini-max problem can be written

$$\min_{(d,c2)\in R}\ \max_{\lambda_i \in H} r(\lambda_i, d, c2).$$

Proof It is clear from the discussion above that d and c2 may be restricted to the real numbers. By hypothesis, A has eigenvalues in the right half plane. If $d < 0$, then every eigenvalue of A would be outside $F_0(d, c)$, the ellipse passing through the origin. Convergence could not occur for this choice of d. If $d > 0$ and $c2 > d^2$, then $c > d$ and $d-c < 0 < d+c$. The family $\mathcal{F}(d, c)$ has one focus on each side of the origin. The ellipse $F_0(d, c)$ is the degenerate ellipse. Since $F_0(d, c)$ has no interior, there is no region of convergence. Because A has its eigenvalues in the right half plane, there is some d and c2 for which convergence will occur. The solution of the mini-max problem is in the set R, and the lemma is proved.

Notice that for $(d, c2) \in R$ we have

$$r(\lambda, d, c2) = r(\bar\lambda, d, c2).$$

The maximum is completely determined by the eigenvalues with nonnegative imaginary part.

Definition Let $H^+ = \{\lambda_i \in H : \mathrm{Im}(\lambda_i) \ge 0\}$. We will refer to $H^+$ as the positive hull of the spectrum.

Corollary 3.4 If A is a real valued matrix with eigenvalues in the right half plane, then the mini-max problem can be written

$$\min_{(d,c2)\in R}\ \max_{\lambda_i \in H^+} r(\lambda_i, d, c2).$$

Proof The proof is clear from the discussion above.

A further reduction of the set $H^+$ is possible. We would like to find the smallest set of eigenvalues that completely determines $\max_{\lambda_i} r(\lambda_i, d, c2)$ when $(d, c2) \in R$.

Definition Let $K = \{\lambda_i \in H^+ :$ there exists $(d, c2) \in R$ such that $r(\lambda_i, d, c2) = \max_{\lambda_j} r(\lambda_j, d, c2)\}$. The elements of K will be known as key elements.

Clearly, if $(d, c2) \in R$, then

$$\max_{\lambda_i \in K} r(\lambda_i, d, c2) = \max_{\lambda_i \in H^+} r(\lambda_i, d, c2).$$

Criteria to determine when an eigenvalue is in the set K are needed.

Lemma 3.5 If $\lambda_k \in K$, then one of the following is true:

1. $\mathrm{Re}(\lambda_k) \le \mathrm{Re}(\lambda_i)$ for every $\lambda_i \in H^+$;
2. $\mathrm{Re}(\lambda_k) \ge \mathrm{Re}(\lambda_i)$ for every $\lambda_i \in H^+$;
3. there exist $\lambda_\ell, \lambda_m \in H^+$ such that there is an ellipse, $F_k(d, c)$, with $(d, c2) \in R$, passing through $\lambda_k$, $\lambda_\ell$, and $\lambda_m$, containing the spectrum of A in the closure of its interior.

Proof Each $\lambda_k \in K$ is associated with a family of ellipses $\mathcal{F}(d, c)$. Let $F_k(d, c)$ be the member of $\mathcal{F}(d, c)$ passing through $\lambda_k$. As (d, c2) is moved through the region R, $F_k(d, c)$ is continuously deformed. If $\lambda_k \in K$ then there is some point $(d_1, c2_1) \in R$ such that

$$r(\lambda_k, d_1, c2_1) = \max_{\lambda_i} r(\lambda_i, d_1, c2_1).$$

Since $r(\lambda_k, d_1, c2_1)$ is maximal we know from Theorem 2.5 that $F_k(d_1, c_1)$ contains the spectrum in the closure of its interior. Suppose $\lambda_k$ is the only eigenvalue on the ellipse $F_k(d_1, c_1)$. If $\lambda_k$ does not satisfy 1. or 2.
of the hypothesis, then there are at least two other eigenvalues in $H^+$, say $\lambda_p$ and $\lambda_q$, such that

$$\mathrm{Re}(\lambda_p) < \mathrm{Re}(\lambda_k) < \mathrm{Re}(\lambda_q).$$

Consider deforming the ellipse $F_k(d_1, c_1)$ into the degenerate ellipse, the line segment connecting $\lambda_k$ and $\bar\lambda_k$, by moving (d, c2) through R (see Figure 3.1). The eigenvalues $\lambda_p$ and $\lambda_q$ are inside $F_k(d_1, c_1)$ but outside the degenerate ellipse. One of the intermediate ellipses must have passed through another eigenvalue. As (d, c2) moves from $(d_1, c2_1)$, let $(d_2, c2_2)$ be the first point such that the ellipse $F_k(d_2, c_2)$ passes through another eigenvalue, say $\lambda_\ell$. Since it was the first, $F_k(d_2, c_2)$ still encloses the spectrum, and, from Corollary 3.4, $\lambda_\ell \in H^+$.

Suppose $\lambda_k$ and $\lambda_\ell$ are the only eigenvalues on the ellipse $F_k(d_2, c_2)$. We can move (d, c2) through R in such a way that $F_k(d, c)$ passes through $\lambda_k$ and $\lambda_\ell$. As c2 gets negatively large the foci of the ellipse $F_k(d, c)$ have large imaginary part and the ellipse is deformed into the infinite column between $\mathrm{Re}(\lambda_k)$ and $\mathrm{Re}(\lambda_\ell)$ (see Figure 3.2). If $\mathrm{Re}(\lambda_k) \le \mathrm{Re}(\lambda_\ell)$ then one of the intermediate ellipses must have passed through $\lambda_p$. If $\mathrm{Re}(\lambda_k) \ge \mathrm{Re}(\lambda_\ell)$, then one of the intermediate ellipses must have passed through $\lambda_q$. In either case, as (d, c2) moves from $(d_2, c2_2)$, let $F_k(d_3, c_3)$ be the first ellipse to pass through a third eigenvalue, say $\lambda_m$. Since it is the first, $F_k(d_3, c_3)$ still encloses the spectrum, and, from Corollary 3.4, $\lambda_m \in H^+$. This proves the lemma.

[Figure 3.1: deforming $F_k(d_1, c_1)$ into the segment joining $\lambda_k$ and $\bar\lambda_k$; Figure 3.2: deforming $F_k(d, c)$ into the infinite column between $\mathrm{Re}(\lambda_k)$ and $\mathrm{Re}(\lambda_\ell)$]

The lemma provides criteria by which certain elements of the set $H^+$ can be ignored. The implementation of these criteria will arise naturally from the algorithm to be presented at the end of Chapter 4. The results of this section are summed up in the following theorem.

Theorem 3.6 If A is a real valued matrix with eigenvalues in the right half plane, then the parameters d and c which yield the optimal rate of convergence can be found in terms of d and $c2 = c^2$ as the solution of the mini-max problem

$$\min_{(d,c2)\in R}\ \max_{\lambda_i \in K} r(\lambda_i, d, c2).$$

3.3 Real Arithmetic

In this section it will be shown that if d and c2 are real, the iteration based upon the associated scaled and translated Tchebychef polynomials can be performed in real arithmetic, even when the scaling parameter c is pure imaginary.

Theorem 3.7 If d and c2 are real, then

$$P_n(\lambda) = \frac{T_n(\frac{d-\lambda}{c})}{T_n(\frac{d}{c})}$$

is a polynomial in $\lambda$ with real coefficients.

Proof Since $c2 = c^2$, if $c2 > 0$, then d and c are both real, and $P_n(\lambda)$ has real coefficients. If $c2 < 0$, then $c = is$ where s is real. We have

$$P_n(\lambda) = \frac{T_n(\frac{d-\lambda}{is})}{T_n(\frac{d}{is})}.$$

If n is even, then $T_n(z)$ is an even polynomial. Each term raises $(\frac{d-\lambda}{is})$ to an even power, so $T_n(\frac{d-\lambda}{is})$ has real coefficients and $T_n(\frac{d}{is})$ is real. If n is odd, then $T_n(z)$ is an odd polynomial. Each term raises $(\frac{d-\lambda}{is})$ to an odd power, so i can be factored out of each term. Likewise, $T_n(\frac{d}{is})$ is pure imaginary. The quotient, $P_n(\lambda)$, has real coefficients, and the theorem is proved.

From Section 2.4, the iteration based upon the scaled and translated Tchebychef polynomials is as follows: to solve Ax = b, given $x_0$, d, c, let

$$r_0 = b - Ax_0, \qquad \Delta x_0 = \frac{1}{d}\, r_0, \qquad x_1 = x_0 + \Delta x_0.$$

For $n \ge 1$, let

$$r_n = b - Ax_n,$$

$$\Delta x_n = \frac{2}{c}\frac{T_n(\frac{d}{c})}{T_{n+1}(\frac{d}{c})}\, r_n + \frac{T_{n-1}(\frac{d}{c})}{T_{n+1}(\frac{d}{c})}\, \Delta x_{n-1},$$

$$x_{n+1} = x_n + \Delta x_n.$$

The iteration parameters are given in terms of d, c and $T_n(\frac{d}{c})$. If c is pure imaginary the computation seems to require complex arithmetic. The parameters can be generated recursively, however, in terms of d and c2.
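Theorem 3.8, which follows, derives these recursions; anticipating them, the entire iteration can be sketched in a few lines of real arithmetic. In the sketch below (Python/NumPy; a minimal illustration with hypothetical names, not the thesis's FORTRAN listing), the coefficient $\frac{2}{c}T_n(\frac{d}{c})/T_{n+1}(\frac{d}{c})$ is written $\alpha_n$ and $T_{n-1}(\frac{d}{c})/T_{n+1}(\frac{d}{c})$ is written $\beta_n$:

```python
import numpy as np

def tchebychef_iteration(A, b, d, c2, x0=None, steps=50):
    """Three-term Tchebychef iteration for Ax = b with ellipse
    parameters d and c2 = c^2 (both real).  Coefficients are
    generated by the real recursions of Theorem 3.8:
        alpha_1 = 2d / (2d^2 - c2),
        alpha_{n+1} = 1 / (d - (c2/4) alpha_n),
        beta_n = d * alpha_n - 1.
    One matrix-vector product per step; only x, r, dx are stored."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    dx = r / d                       # dx_0 = (1/d) r_0
    x = x + dx
    alpha = 2.0 * d / (2.0 * d * d - c2)   # alpha_1
    for _ in range(1, steps):
        r = b - A @ x
        beta = d * alpha - 1.0       # beta_n = T_{n-1}/T_{n+1}
        dx = alpha * r + beta * dx
        x = x + dx
        alpha = 1.0 / (d - 0.25 * c2 * alpha)   # alpha_{n+1}
    return x
```

For a symmetric positive definite test matrix with spectrum in [a, b], the classical choices $d = (a+b)/2$ and $c2 = ((b-a)/2)^2$ reproduce the familiar real Tchebychef iteration; a negative c2 corresponds to a pure imaginary c, yet the loop above never leaves real arithmetic.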
Theorem 3.8 If d and c2 are real then the iteration based upon the associated scaled and translated Tchebychef polynomials can be performed in real arithmetic.

Proof For $n \ge 1$ let

$$P1(n) = \frac{2}{c}\frac{T_n(\frac{d}{c})}{T_{n+1}(\frac{d}{c})}, \qquad P2(n) = \frac{T_{n-1}(\frac{d}{c})}{T_{n+1}(\frac{d}{c})}.$$

For n = 1 we have

$$P1(1) = \frac{2}{c}\cdot\frac{\frac{d}{c}}{2(\frac{d}{c})^2 - 1} = \frac{2d}{2d^2 - c2}.$$

Using the Tchebychef recurrence, $T_{n+1}(\frac{d}{c}) = 2\frac{d}{c}T_n(\frac{d}{c}) - T_{n-1}(\frac{d}{c})$, we have

$$P2(n) = \frac{T_{n-1}(\frac{d}{c})}{T_{n+1}(\frac{d}{c})} = \frac{2\frac{d}{c}T_n(\frac{d}{c}) - T_{n+1}(\frac{d}{c})}{T_{n+1}(\frac{d}{c})} = d\, P1(n) - 1, \quad\text{equivalently}\quad P1(n) = \frac{1}{d}(1 + P2(n)),$$

and

$$P1(n+1) = \frac{2}{c}\frac{T_{n+1}(\frac{d}{c})}{T_{n+2}(\frac{d}{c})} = \frac{2}{2d - c\,\frac{T_n(\frac{d}{c})}{T_{n+1}(\frac{d}{c})}} = \frac{2}{2d - \frac{c2}{2}P1(n)} = \frac{1}{d - \frac{c2}{4}P1(n)}.$$

Starting from $P1(1) = \frac{2d}{2d^2 - c2}$, the coefficients P1(n) and P2(n) are generated entirely in terms of d and c2, and are real whenever d and c2 are real. The iteration can be performed in real arithmetic, which proves the theorem.

4. SOLVING THE MINI-MAX PROBLEM

If the eigenvalues in the positive hull, $H^+$, are known, then, as was shown in Chapter 3, the optimal parameters can be found as the point that minimizes the maximum of a finite number of real valued functions of two real variables. Consider each function, $r(\lambda_i, d, c2)$, to be a surface above the (d, c2)-plane. Section 4.1 will show that the mini-max solution is either a local minimum of one of the surfaces, a local minimum along the intersection of two surfaces, or a point where three surfaces intersect. Section 4.2 will show that each surface has only one local minimum in R. It will be found explicitly. The local minimum along the intersection of two surfaces will be found in Section 4.3 as the root of a fifth degree polynomial. In Section 4.4 the unique intersection of three surfaces will be found, when it exists, and existence criteria will be established. An algorithm to find the mini-max solution among the possible candidates will be developed in Section 4.5. It will be assumed that $\{\lambda_i\}$ are eigenvalues of A and that $\mathrm{Re}(\lambda_i) > 0$, $\mathrm{Im}(\lambda_i) \ge 0$.

4.1 The Alternative Theorem

The following theorem will enable us to solve the mini-max problem by looking at three or fewer functions at a time.

Theorem 4.1 (Alternative Theorem) If $\{f_i(x,y)\}$ is a finite set of real valued functions of two real variables, each of which is continuous on a closed and bounded region S, and

$$M(x,y) = \max_i f_i(x,y),$$

then M(x, y) takes on a minimum at some point $(x_0, y_0)$ in the region S. If $(x_0, y_0)$ is in the interior of S, then one of the following holds:

1. The point $(x_0, y_0)$ is a local minimum of $f_i(x,y)$ for some i such that $M(x_0, y_0) = f_i(x_0, y_0)$.
2. The point $(x_0, y_0)$ is a local minimum among the locus $\{(x,y) \in S : f_i(x,y) = f_j(x,y)\}$ for some i and j such that $M(x_0, y_0) = f_i(x_0, y_0) = f_j(x_0, y_0)$.
3. The point $(x_0, y_0)$ is such that for some i, j, and k, $M(x_0, y_0) = f_i(x_0, y_0) = f_j(x_0, y_0) = f_k(x_0, y_0)$.

Proof Since M(x, y) is continuous on S, it takes on its minimum at some point $(x_0, y_0)$ in S. In the following, all neighborhoods of $(x_0, y_0)$ will be understood to be in the interior of S. If there is a neighborhood of $(x_0, y_0)$ such that $M(x,y) = f_i(x,y)$ in the neighborhood, then $(x_0, y_0)$ is a local minimum of $f_i(x, y)$ and $M(x_0, y_0) = f_i(x_0, y_0)$. Suppose that there is no neighborhood of $(x_0, y_0)$ in which the maximum, M(x,y), is determined by only one function but there is a neighborhood in which M(x, y) is determined by two functions, say $f_i(x, y)$ and $f_j(x, y)$. Since $f_i(x,y)$ and $f_j(x,y)$ are continuous, then $M(x_0, y_0) = f_i(x_0, y_0) = f_j(x_0, y_0)$. Clearly, $(x_0, y_0)$ is a local minimum in the locus $\{(x,y) \in S : f_i(x,y) = f_j(x,y)\}$. Suppose that in every neighborhood of $(x_0, y_0)$ the maximum, M(x,y), is determined by at least three functions. Since they are all continuous, there are three functions, say $f_i(x,y)$, $f_j(x,y)$, and $f_k(x,y)$,
Since they are all continuous, there are three functions, say f.(x,y), f.(x,y), and f, (x,y), 1*7 such that M(x ,y ) = ,y Q ) = r.(x Q ,y ) . f k < ... |. Biifl prov the theorem. Since the spectrum of A lies in the half plane, it can be enclosed in an ellipse symmetric with respect to the real line that does not include the origin. Equivalently, there is some point in the region R such that max r(\. , d, c2) < 1 . H ' ' In the next section it will be shown that for each \., r(\.,d, c2) > 1 i' 1 — on the boundary of R. There is, therefore, a closed and bounded subregion S c R which contains the mini -max solution in its interior, and we can apply the Alternative Theorem to the mini-max problem. k.2 Minimum Point of a Single Function In this section the one local minimum of the function r(\.,d, c2) in the region R will be found in terms of X. . Each point (d, c2) eR is associated with a family of ellipses in the \-plane whose members are the level lines of r(\, d, c2). Let F. (d, c) be the ellipse in this family passing through X. . From Theorem 2.5 we know that r(\, d, c2) takes on the same value at each \ eFj_(d, c). In particular, consider \ in Figure k.l. Since (d-X ) 2 > c2 we have (d-O +((d->0 2 -c2) 2 r(V.,d,c2) = r(\ Q ,d,c2) = — . d + (d 2 -c2) 2 If we let (d-\ ) = a2, then every point \ = x +iy on the ellipse k8 b^~~ ^-Fi.(d t .c2.) Xo/ d-c d d+c y F. (d, c) will satisfy Figure k . l a2 y a2-c2 Letting X. = x. + iy., we can write r(\.,d,c2) = (a2) + (a2-c2) d + (d 2 -c2) 2 subject to the constraint o p ,d-X.) y. - + a: a2-c2 = 1 (If the ellipse F i (d,c) is degenerate, the above constraint does not hold. If y ^ 0, the degenerate case gives a2 = 0. If y. = 0, the degenerate case gives a2 = c2. In any case, a2 varies continuously with d and c2.) is clear that if \. La in the right hair plane, we pick (d,c2) .. R such that the ellipse F. (d,c) does not contain the origin. Equivalently, there is some (d, c2) eR such that r(\j,d, c2) < 1. The following lemma will show that we need only look in the interior of R for the local minima of r(\. ,d, c2). Lemma k.2 If Re(\.) > 0, then r(\^ f 6.,c2) > 1 on the boundary of R. Proof We have R= ((d,c2)/0 < d, c2 < d 2 } . From Corollary 3.3 we know that r(\.,d, c2) > 1 for d = and c2 = d . From the constraint equation we have (d-x.) 2 < a2 , i — y 2 < a2 -c2 . 2 If c2 < (d-x.) , we can write — l 1 2 ^2 | d-x J + |((d-X,r - c2)^| 1 - v i < r(\.,d,c2) d + (d 2 -c2) 2 2 2 If (d-x.) < c2 < d , we can write | d-x. | + y. 1 1— - < r(\ i ,d,c2) d + (d 2 -(d-x.)2) 2 In either case we have lim r(\.,d,c2) > 1 , (d,c2)^oo x (d,c2)eR which proves the lemma. 50 Since r(k.,&, c2) > 1 on the boundary of R, which would cause the iteration to diverge, we are only interested in local minima that occur in the interior of R. The next two theorems give the minimum of r(\.,d, c2) in terms of X. . Theorem k.3 treats the case X. = x. +iy. , • y. 4 0. Theorem UA treats the case X. = x. . 17 i 7 ii Theorem ^ . $ If A-- = x. + iy., y. 4 °> there is only one local minimum of the function r(A..,d, c2) in the region R. It occurs at the point 2 (d, c2) = (x.,-y.)_, and at this point 2, y i r(\ ,x ,-y ± ) = - Proof In the form r(X. ,&, c2) w 2 2 n 2 X i (x i +y i^ 1 1 >2) 2 + (a2-c2) 2 * 1 d + (d 2 -c2) 2 it is clear that r(X..,d, c2) is a continuous function of d, c2, and a2, for values of d, c2 in R and continuously differentiable except where a2 = 0, a2 -c2 = . 
From the constraint equation, (*-*i) yf a2 + a2-c2 ' it is clear that a2 is a continuously differentiable function of d and c2 except where a2 = 0, a2 -c2 = . nee y i 4 and •>: - e2 > yV, we have a2 - c2 > in R. Nov.- d = x. and c2 < -y . . i J i This is the ray shown in Figure k .2 and corresponds to the degenerate ellipse through \. . C2 C2 = d' Ut.-yf) Figure I4- . 2 Any local minimum of r(X.,d, c2) must occur at a critical point. A critical point is a point at which either all directional derivatives are zero, or some directional derivative does not exist. We have shown that the points on the ray above are critical points. It will be shown that these are the only critical points so that any local minimum must lie 52 on the ray. It will then be shown that the only local minimum on this ray is at the point (x.,-y. ). First, we divide R into two regions: d > x. and d < x. . Case I: d > x. . i 2 Let c2 - pd . If p < 1 this describes a curve through R. Every point in R is on some curve c2 = pd , p < 1. It will be shown that there is only one possible critical point on this curve. a) If p > 0, then c2 > along the curve c2 = pd 2 . Let a2 1 d 2 6 = — x. y = — = -7T. We can write H c2' ' p c2 1 (P) 2 4 (P- 1 ■if 1 (yf + (7- 1 ■if The directional derivative along this curve becomes IP' IP' 2 12 1 r' (P) 2 (P-D 2 1 1 (yf f (7-D 2 r p' 2 11 (P) 2 (P-D 2 For r' = we must have P' = 0. From the constraint equation we have 2(d-x.) (d-x.) 2 (y.) 2 P' =-s P' = 2pd , 3 (P) 2 (P-D 2 so that p ' = gives d-x. or a2 = d(d-x i ) . Plugging this back into the constraint equation we 1 c2 = P d 2 , a2 = d(d-x. ) This is the only possible critical point along the curve c2 = pd% < p < 1. b) If p < 0, then c2 < along the curve c2 = pd^. We can write (■ 1 ■vf + (1- 1 ■3) 5 (■ 1 + (1- 1 ■yf Taking the directional derivative along this curve we get r' = I 1 1 ' (-p) 2 (l-3) 2 Again we have that r' = implies 3' = 0, and again the only point on the curve satisfying this is 2 2 1 X i + y i 1 5h c2 = pd 2 , a2 = d(d-x.) . v 1 2 c) If p = 0, then c2 = along the curve c2 = pd . We can write r(\,,d,c2) = (a2 i' ' ' d Taking the directional derivative along the curve we get 1 a2' . 1 (a2) 2 1 d ,2 2 d (a2) 2 The constraint equation for c2 = becomes (d-x. ) 2 + y 2 = a2 , so a2' = 2 (d-x. ) l Substituting this in the equation above we get r' = ^— (d(d-x.) -a2) . (a2) 2 d 2 If r' = 0, then a2 = d(d-x. ) . Using the constraint equation for c2 = 0, we have d(d-x.) = (d-x.) 2 + y 2 , or 2 x. + y. d = , x. l c2 = , a2 = d(d-x.) , as the only possible critical point along the curve c2 = 0. p On each curve c2 = pd , p < 1, there is only one possible critical point. The locus of all of these possible critical points is the curve c2 = d(d-ri) , where and on this curve 2 2 X i +y i 71 = — x— l a2 = d(d-x. ) Figure 4.3 shows the curve c2 = d(d-T)) . The possible critical point on any curve c2 - pd 2 is where the two curves intersect. Along the curve c2 = d(d-r)), d > x. we can write l 1 • 1 (d(d-X.)) 2 + (d(d-X.) - dtd-T])) 2 r(X..,d,c2) = i 1—j d + (d 2 -d(d-T])) 2 1 1 (d-x^ + (r\-x ± ) 1 1 ,2 , s2 a + (ti) 56 c2 = d(d-77) c2 = pd' Figure k.3 The directional derivative along the curve is 1 1 1 tA 2 2 ^ r(d +n ) r' = (d-x.) 1 1 I^d-x.) 2 + (t^x.) 2 ) ,2 1 1 (d + t\ ) 111 1 1 1 2 11 1 1 (d 2 (d% 2 ) - (d-X.) 2 ((d-X.) 2 + (T 1 -X i ) 2 )) (d 2 +T) 2 ) 2 (d-X.) 2 d 2 Since d > d-x. and r\ > t\-x.. , then Ill 1 _ 1 d 2 (d - (d-x.) r ((d-x.r H (r,-x. Wo can conclude that r' > for d > x. 
and that there are no criti- points in the subregion d > x. . Case IT : d < x. . I p As in Case I we look along the curves c2 = pd . Taking directional derivatives along the curves we again find that r' = implies a2 = d(d-x. ) . Since a2 > for d 4- x. and d < x. in this i r i i subregion, this condition is never met. We can conclude that there are no critical points in the region d < x . . We have now isolated the possible local minima to the line d = x. • From the constraint equation we see that if d = x. , then i i' a2 = 2 2 2 y. + c2 for x. > c2 > -y. , i l i 2 for -y. ! > c2 . 2 2 Case I: x. > c2 > -y. , d = x. . We can write 1 (y i + c2) + y i I 2 ^2 r(\,,d,c2) - x. + (x. - c2)' Taking the derivative with respect to c2 we get 1 | 1 (x ± +( X 2 -c2) 2 ) + | 1 ((y^ +C 2) 2 + y. l i' (y? +c2) 2 (x 2 -c2) 2 r' = = 1 (x. + (x 2 - c2) 2 ) 2 58 Since each term is positive, then r' > 0, and r(\.,d, c2) is increasing 2 as c2 increases from c2 = -y. along the line d = x. . i l 2 Case II: c2 < -v., d = x. . i' l We can write 1 r(\,,d,<£) = 1' ' _L x. + (x. -c2) 2 1 v 1 Taking the derivative with respect to c2 we have 1 1 2 x (* ± + ( x i ~ c2 ) ) " g 1 (~ c2 ) (-c2) 2 (x 2 -c2) 2 r ' = 1 1 (x. + (x 2 - c2) 2 ) 2 l l 1 1 2 2 2 £" (x. -c2) (x. + (x7 -c2) ) + c2 I I I 2(x. + (x 2 -c2) 2 ) 2 (x? -c2) 2 (-c2) 2 l i l 1 / 2 o\ 2 2 x. (x. - c2) + x. li l 1 1 1 2(x. + (x? -c2) 2 ) 2 (x. -c2) 2 (-c2) 2 x. i 1 11 2(x. + (x 2 -c2) 2 )(x.-c2) 2 (-c2) 2 nee each beno is positive, then r' < , and L,c as c2 increases toward c2 = -y. alotv, t.he line d = • We can conclude that the only local, minimum of r(\, . 2 occurs at the point (d, c2) = (x.,-y.) and at this point r(\ i ,x i ,-yp x i + ^ x 'i + y i^ 2 This proves the theorem. Theorem h .4 If \. = x., then there is only one local minimum of the function r(\.,d, c2) in the region R. It is at the point (d, c2) = (x.,0), and at this point r(\,,x,,0) = . i' i Proof Since X- = x. , we have a2 = (d-x.) except for the degenerate 2 case. The degenerate case occurs when c2 > (d-x.) , so that X. = x. is _ x ' 11 2 between the foci, d+e and d-c. If c2 ;> (d-x.) , then a2 = c2. We can write ^ |d-x. | + ((d-x.) 2 -c2) 2 r(\.,d,c2) d + (d 2 -c2) 2 - (c2) P ? d + (d~ -c2) for (d-x. ) > c2 , l ' for (d-x. ) < c2 . i' — In this form it is clear that r(\.,d, c2) is continuous in R and continuously differentiable except where 6o c2 = "(d-x.) , and where d = x. for c2 < 1 As in the proof of Theorem k.3, it will be shown that the only critical points of the function r(\.,d, c2) in R are those described above. Divide R into subregions as shown in Figure k.k. Region I is 2 where c2 > (d-x.) , which corresponds to the degenerate ellipse. In 2 this region we have a2 = c2. Region II is where c2 < (d-x.) , d > x., 2 and Region III is where c2 < (d-x.) , d < x. . In these regions we have a2 = (d-x.) 2 . v l C2 = d 2 c2 = (d-x L ) J Figure k.k Case l: c2 > (d-x.) 2 . In this subrty.ion we have r(\«,d,c2) = d + (d 2 -c2) 2 Taking the partial derivative with respect to c2, we get 1 1 \ -i- (d + (d 2 -c2) 2 ) + \ 1 - (c2) 2 ., (c2)" (d"-c2)" 1 (d + (d 2 -c2) 2 ) 2 1 1 1 (d + (d 2 -c2) 2 )(d 2 -c2) 2 + c2 2 III (d + (d 2 -c2) 2 ) 2 (c2) 2 (d 2 -c2) 2 Since each of the terms is positive, we have r' > in this region which implies that there are no critical points in this region. Case II: (d-x.) > c2, d > x. i l Case III: (d-x. )^ > c2, d < x. . v x ' i These two cases will be treated together. In these regions we have a2 = (d-x.) . 
As in the proof of Theorem k.3, consider the curves c2 = pd 2 , p < 1. Let 62 By the same argument used in Theorem h.3, the directional derivative 2 along the curve c2 = pd is zero only if P' = 0. We have P' = 2x. (d-x.) l l pd 3 Clearly, p 1 / in either subregion II or subregion III, so we may conclude that there are no critical points in subregion II or subregion III, We have isolated the possible local minima to the curve c2 = (d-x.) , i and the ray d = x., c2 < . l Along the curve c2 = (d-x.)^ we have r(\,,d,c2) = d-x. 1 l d + (d 2 - (d-x. ) 2 ) 2 If we take the directional derivative along this curve, we find - (d + (d 2 - (d-x. ) 2 ) 2 ) + (d-x.)(l + X. r> (d 2 -(d-x.) 2 ) 2 for d < x., - (d + (d 2 -(d-x.) 2 ) 2 ) 2 1 ,2^2 (d + (d^ -(d-x.) 2 ) 2 - (d- Xi )(l + 1 1 ) (d 2 -(d-x.) 2 ) 2 v. 1 (d + (d 2 - (d- Xi ) 2 ) 2 ) 2 for d > x., l f -x. 2nS Wj 2 ,2x2 - (d i (d- .(tij)T)(4'.((Ui)') X. 1 V 1 1 >x2 Wj 2 ,_, ,2 N 2 (d + (d 2 -(d-x.) 2 ) 2 )(d 2 -(d-x.) 2 ) ; for d < x., for d > x. Since all of the terms are positive we have and r' < 0, for d < x., ' i' r' > 0, for d > x. . ' l The only local minimum along the curve c2 = (d-x. ) occurs at the point (d,c2) = (x.,0). On the ray d = x . , c2 < we have r(\ i ,d,c2) :-c2)' i 2 2 x. + (x. -c2) li Taking the directional derivative along the ray we have 1 1 2 M 2, 1, „ N 2 -1 l-^y (x. + (x^-c2)") - |(-c2) (-C2) 2 r' = (x 2 -c2) 2 1 (x. + (x 2 -c2) 2 ) 2 X. 1 1 1 1 ' (x. + (x 2 -c2) 2 )( x 2_ c2 ) 2 (. C 2) 2 6k Clearly, we have r' < along the ray, so that r(\.,d, c2) decreases as c2 increases to c2 = 0. We can conclude that the point (d, c2) = (x.,0) is the only local minimum of the function r(\.,d, c2) in the region R. At this point we have r(\.,x.,0) = 0, and the theorem is proved. k.~5 Pair-wise Best Point In this section it will he shown that the locus of points for which two surfaces intersect is a continuous curve through the d, c2-plane . In general, the minimum along this curve can be found in terms of the root of a fifth degree polynomial in a certain interval. The coefficients of this polynomial will be given. It can also be shown that this minimum is the only local minimum along the curve, but it will not be proved here. A special case will be treated in which the minimum can be found in terms of the root of a third degree polynomial. Consider all points (d, c2) €R such that r(\ ,d,c2) = r(\ ,,d,c2) . From Theorem 2.5 we know that \. and \. are on the same member of the i 3 family x. ; then, we have A > 0, B > 0, T > 0. For S = we have d = B , and the ellipse equation becomes a2 + a2-c2 Solving for c2, we get a2(a2 - (A 2 +T 2 )) c2 = — s s — '— (a2 -A^) Suppose that y. 4- Y-j that is, S / 0. Let p = — . The ellipse equations 1 J Cc_ become 66 a \ 2 2 d_X i } y i o and (d-x.) y. + tA- = c2 . P 3-1 Solving for £, we get (x.-x. )(2d-(x.-x. )) P = (x.-x.,(2d-(x.- Xl ,, + (y.-y.)(y. +y .) In terms of A, B, S ; and T, this is (d-B) (d-(B +^)) Substituting this back into the ellipse equation, we get (A \ 2 2 (d-x ) y c2 = — + — - P 0-1 -(d-(B-A)) 2 | (T-S) 2 (d-B) |S S^ < a -( B + T» (d-(B + f)) ST (d " (B+ ~ }) ffd fBA xx2 , A(d-B)(T-S) 2 jj-^ ((d-(B-A)) + — ) (d-(B + f^))(d-(B - A|))( d -( B - 4 )} (d^Bl Notice that so that c2 l-(B - A|))(d-(B - A§)) = § (d-(B - A|))(d-(B - A§)) , a2 = (d-(B - A|))(d-(B - A|)) . For S / we can express c2 and a2 in terms of d. For S = we have d = B and can express c2 in terms of a2. To insure that these parameters describe an ellipse, we must have a2 > , and a2 -c2 > . Lemma k- . 
5 The above equations for c2 and a2 describe the parameters of an ellipse passing through \. and \. if d < B when S < , d > B when S > , and d = B, a2 > A 2 when S = . Proof We have seen that d = B when 3=0. For S = we have a2(a2 - (A 2 +T?)) (a2 -A 2 ) 2 a2 -A 68 2 2 For a2 > A we have a2 - c2 > 0. Notice that as a2 approaches A, c2 approaches -co and when a2 gets large positively, c2 also gets large positively; thus, c2 takes on all possible values. When S / we have (a-(B +S) c2 ■ a£ — r^Bi — ' so that (d-(B +S)) a2-c2 = a2 - a2 ^ ST A = a2 TaiBy If S < and a2 > 0, then we must have (d-B) < 0. If S > and a2 > 0, then we must have (d-B) > 0. For a2 we have a2 = (d-(B - A|))(d-(B - A|)) . Since A, B, T, > 0, then if S > and d > B, we have (d-B) + a| > , (d-B) + a| > , so a2 > 0. If S < and d < B, then we have (d-B) + a| < , (d-B) + a| < , so a2 > 0. This proves the lemma. A geometric interpretation of this result Figure h.5. If S > 0, dun y . ■ .v. and any ellipse throuj for S > , z < for S < . Theorem k.6 If S £ 0, the point at which r(\.,d, c2) is minimal along the locus of points for which r(\.,d, c2) = r(X..,d, c2) can be found as the root of the polynomial 5 , _ h- . _ 3 , _ 2 p l z + P 2 Z + P 3 Z + P 4 Z + P 5 Z + p 6 in the interval = 70 iy ii iy i -»»-*-x B d>B S>0 d0 S < s = o Figure k.6 71 .A) S > (-A, for The coefficients are Pl = (2B - A(| + |))(2B + ^- A(|h |)) , p 2 = (2E + ^ - A(| + |))((2AB +ST)(| 4 |) + 1+A 2 ) + B 2 (2B - A(| + |)) 4 B(B 2 -A 2 ) , P3 = hA k - WB(| + |) + A 2 GT((^ + ^ ) - 5(| + |)) 2 ,2 f A 2 B 2 (^ + % + 3) , S T p^ = AST((B - A|)(B - 3A|) + (B - A|)(B - 3a|) ) , p 5 = -3A'ST(2B - A(| + |)) , p 6 = -3A 3 ST(B 2 -A 2 ) Proof We will refer to r(\.,d, c2) as r(z). We can write i (z - SZ) i ((z + A|)(z + A|)) 2 4- (( z + A |)(z + A|)(l - ^-)) 2 r ( z ) = , . _ ? (z + A|)(z + a|)(z - ^) 2 (z+B) + ((z + B) 2 2 -1 A_) Since A, B, T > 0, then r(z) is a continuously differentiable function 72 of z if z > when S > and z <-0 when S < 0. Let p for c2 / 0. We can write a2 c2 > 7 = c2 r r(z) = < 1 1 (p) 2 + (p-i) 2 1 1 for c2 > , (yf + (7-D 2 i l (-P) 2 + (i-p) 2 l i for c2 < , (-7) 2 + (1-7) 2 ^ with a removable singularity at c2 = 0. Taking the derivative with respect to z we have for c2 > r'(z) = 1 ft 1 1 p] 2 1 + 2 1 (P) 2 (P-D 2 1 1 ((7) 2 +(7-D 2 )' 1 1 ((yf + (7-D 2 ) 2 2 12 1 2 2 (7) (7-1) 1 1 ((3) 2 + (3-l) 2 ) 1 1 (P) + (P-D 2 1 1 (yf "4 (7-D 2 P' (pro-D- (7) (7-1)' r(z) / p' (P) (P-D (7) (7-D' and for c2 < we have r'(z) r(z) -P' (-P)"(l-P)" (_ 7 )"(i- 7 )' with q removable singula In 1 we hn •B) 2 7 = (z + A§)(z 4 A§)(z - fO ST fl 1 - A ~stT » (z + B) 2 z - (z + A|)(z f A|)(z - |£) " (z + a|)(z + A|)(z - §0 Q(z) (z + a|)(z 4 a|)(z - |^) where Q(z) is a quadratic. Taking derivatives, we have P' ST A , ST X 2 » ((z + B) 2 +2(z+b)z)(z + A|)(z + a|)(z - ^) 7 = (z + A|)(z + A§)(z + f) O STs , «S W ST> - z(z + B)^((z + A±)(z 4 A£) 4 (z 4 A±)(z - ^L) + (z 4 A§)(z - |i)) C(z)(z+B) (z +a|)(z 4 A |)(z - ^) 7^ where c(z) is a cubic. From here on we will assume S > 0. The arguments are identical for S < 0. Plugging the above expressions into the equation for r'(z), we get r (z) = - /_ i 2 (^) 5 + A ' 1 (z - f)(zf \ (Q(z)) 2 (z + a|)(z + A|) ST This expression has a singularity at z = j which corresponds to c2 = To see that this is a removable singularity, notice that ^,ST N /ST Wt3 ST w ST AT w ST ASx n /ST N ST, ra STx 2 Q(-) = ~(B + T ) , so that ( m f + r _l^ . o We want to show that r'(z) has a root in the interval (0,A) At z = we have C(0) = -ABST, Q(0) = AST . 
Taking the limit, we have At ■ A we have p p C(A) - ,V "/;' ' (2(A 2 -ST) - A(AiB)) , Q(A) = A (ST(A + B) 2 - (T+S) 2 (A 2 -ST)) , so that (A 2 -ST) P' i (Q(A))^A 2 (S + T)^ I / i r(A) (A; ffSTs2 (2 (A -ST) - A(A+B)) 2 (A 2 -ST) T A) h | \ (Q(a)) 2 r(A) (ST)" / 1 (2(A 2 -ST) - A(A+B)] Op i (A -ST) \ g p i (ST(A+B) " - (S+T) (A~-ST)) We would like to show that r'(A) > 0. Removing the terms known to be positive, we would like to show that ^1 L 4 (2(A 2 -ST) - A(A+B)) (A 2 -ST) I \ (ST(A+B)^ - (S+T) 2 (A" : -ST))^ which is equivalent to 1 A(A-fB) - 2(A 2 -ST) (ST(A+B) 2 - (S+T) 2 (A 2 -ST) ) 2 (A 2 -ST) (A 2 -ST) 76 If (A -ST) > 0, this is equivalent to A(A+B) - 2(A 2 -ST) > (ST(A+B) 2 - (S+T) 2 (A 2 -ST) ) 2 If we square both sides we have the equivalent expression A 2 (A+B) 2 - U(A+B)(A 2 -ST) + l4-(A 2 -ST) 2 > ST(A+B) 2 - (S+T) 2 (A 2 -ST) , which is equivalent to (A -ST)(^(A 2 -ST) - U(A+B) + (B+k) d + (S+T) 2 ) > , which is equivalent to (A 2 -ST)((B-A) 2 + (T-S) 2 ) >0 , which is clearly true. p If (A -ST) < 0, we must show 1 A(A+B) - 2(A 2 -ST) < (ST(A+B) 2 - (S+T) 2 (A 2 -ST) f . Since A(A+B) -2(A 2 -ST) > when (A^-ST) < 0, then squaring both sides and rearranging as before gives the equivalent expression (A 2 -ST)((B-A) 2 + (T-S) 2 ) <0 , which is clearly true. Since r'(0) = -co, and r'(A) > 0, and r(z) is continuously differentiable, we can conclude that r(z) has a minimum in the interval (0,A). At this point we must have r'(z) = 0. From the expression for r'(z) we must have 77 or " r SM = o (QU)) 2 (z + a|)( z + a| (^ 5 (Q( Z )) 5 = > A (z+a|)(z+a|) Squaring both sides, we must have ^Q(z)(z + A|)(z + A§) = C(z) 2 Since C(z) is a cubic and Q(z) is a quadratic, this gives rise to a sixth degree polynomial. One of the roots of this polynomial is at ST z = -jr i which corresponds to c2 = 0, an extraneous root. Factoring out this root, we get the fifth degree polynomial whose coefficients were given in the hypothesis. This polynomial must have a root at the point z e(C,A) at which r'(z) = 0. For S < the interval is (-A, 0). 'This completes the proof of the theorem. It can be shown with arguments based on the derivatives of the fifth degree polynomial that this is the only Local minimum of r(z) We next turn our attention to the case S = 0. The following notation will be convenient. Let y = a2; then, we can write d = B . r A 2 . ■ l J Theorem h.^ If S = 0, the point at which r(\.,d, c2) is minimal along the locus of points for which r(\.,d, c2) = r(X.,d, c2) can be found as the root of the polynomial q x r + ^y + q^y + q^ = 2 2 in the interval (A , B ). The coefficients are: q x - (B 2 + T 2 ) , 2 2 4 2 q^ = 3AV , q^ = -A if B 2 (A 2 + T 2 ) Proof We will refer to r(\.,d,c2) as r(y) . We can write r(y) . iz=L 1 l B 2.y(y-(A 2 + T 2 ))\ (y-A 2 ) 2 2 Notice that for y e (A ,B ), r(y) is a continuously differentiable a2 function of y. If, as in the proof of Theorem h.G, we let p = — ^ , d 2 / 7 = — for c2 f 0, we have r r(y) ] 1 T " \_ 1 (7) 2 (7-D : ' , r'(y) ■ < r(y) 11 11 (-^) 2 (i-3) 2 i-yfd-y) 2 for c2 > , with a removable singularity at c2 = 0. In terms of y we have 3 = (y-A 2 ) (y-(A 2 +T 2 )) ' B 2 (y-A 2 ) y(y-(A 2 + T 2 )) 3-1 = (y-(A 2 ^T 2 )) a = -(y 2 -(A 2 +B 2 + T 2 )y-fA 2 B 2 ) y(y-(A 2 + T 2 )) Taking the derivative we have P' ■r 2 (y-(A^4- T 2 )) 2 7' = B 2 (y 2 -2A 2 y +A 2 (A 2 +T 2 )) (y(y-(A 2 + T 2 ))) 2 8o Combining the equations above, we have r . 
( y ) = iM I ( B(y 2 -2A 2 y+A 2 (A 2 + T 2 )) _ , (y - (A 2 +T 2 ))(y-A 2 ) 2 \y(-(y 2 - (A 2 + B 2 + T 2 )y +A 2 B 2 )) 2 We would like to show that r'(y) has a root in the interval 2 2 (A , B ) . We can write t M \ r ( A ^) i- 1 i ^B- A ^^ lim r'(y) = -A^_ . lim . (.(_)) y-^A^ y-A^ « (y-A 2 ) 2 If we let y = B , we have r . (B 2 ) = £i£l 1 / b((b 2 -a 2 ) 2 +a 2 t 2 _ T (B 2 - (A 2 +T 2 ))(B 2 -A 2 ) 2 \ B 2 (B 2 T 2 ) 2 r(B 2 ) 1 / (B 2 -A 2 )(B 2 -(A 2 +T 2 )) 2 B T (B 2 -(A 2 + T 2 ))(B 2 -A 2 ) 2 ' 1 r(B g ) (B 2 -A 2 ) 2 2 B 2 T Since lim r'(y) = -«, r'(B ) > 0, and r'(y) is continuously y-A 2 2 2 differentiable in the interval (A ,B ), then r(y) must have a minimum in 2 the interval (A' , B' K From the equation '(y) we see thm re have -2A 2 y +A P (A 2 4 T 2 )) = T y(-(y 2 - (A 2 +B 2 +T 2 )y +A 2 B 2 )) 2 If we square both sides of this equation and rearrange terms, we have o p a fourth degree polynomial in y. This polynomial has a root at y = A +T , which corresponds to c2 = 0. Factoring out this extraneous root we have the third degree polynomial whose coefficients were given in the hypothesis Any point y such that r'(y) = will be a root of this polynomial. If then 3 2 Q(y) - q x y + q^y + q^y + q^ , Q 1 (y) = 3q x y + ^y + q 5 Taking the discriminant of Q'(y), we have P h. h. li P P P l*q| - 12q 1 q = 36A^B - 36A B (B + r T) = -36a^b 2 t 2 < o . We may conclude that Q'( z ) has no real roots; thus, Q(z) has only one real root, the one known to give the minimum of r(y) . This root is the only local minimum of r(v), and the theorem is proved. The point at which the minimum occurs will be referred to as the pair-wise best point, and the associated ellipse through X. and X. will be referred to as the pair-wise best ellipse. 82 k.k Three-way Point If the functions r(\ ,d,c2), r(\ ,d,c2), and r(X ,d,c2) take on the same value at some point (d, c2) eE, then there is an ellipse, a member of the family ^£(d, c), passing through X., X., and \. In this section it will be shown that if such an ellipse exists, it is unique. An existence criterion and parameters of the ellipse will be found in terms of X. = x. +iy., X. = x. +iy., and X = x. + iy . We will 111JJJ KKK assume that < x. < x. < x. and < y.,y.,y, • J. J K. 1 J K Lemma k . 8 If there exists an ellipse symmetric with respect to the real axis passing through the distinct points \., X., and X , then it 1 j K. is unique. (A general ellipse is determined by 5 points.) Proof The general equation for an ellipse symmetric with respect to the real axis is 2 2 Ax + By + Cx = 1 Suppose the system u A Ya 2 A l A \ l i \ X. *k v W has more than one solution, say A. M and I W IA K\ M W \w VI then the vector IA IA VI -1 / I 7 / is a solution for any w. Suppose p / 0; then, for u> = -B /p we have X. y i X. i 2 x . 2 X . *k 2 *k W A \ h\ w VI which implies that Az + Cz - 1 = has three distinct roots implies that We can conclude that ,3 = 0. This in turn / 2 x. l o J 2 y. x. 9 ± i 2 M ^k *k / W l°\ W or that az + yz - 8^ has three distinct roots. We can conclude that a = 3 = 7 = 0, which proves the lemma. 
Theorem k.9 There exists an ellipse symmetric with respect to the real axis passing through X., X., and K if and only if 1 j K ( v x i )(y k- y i } < ^i^yH 5 • If the ellipse exists, it is determined by the parameters , 2, 2 2x 2, 2 2 N 2, 2 2 U d _ 1 (y i (x r x k ) + y j ( V x i } + y k( x i- x j» ( (y^(x -^ +y?(vxi) + yk (x i- x i }) ' a2 = d' 2 (y i X i X k (X i- X k ) + TO^i 1 + ViVV*^ ^.rV +y ^ ( v x i } + 4 ( v x P ) op a2 ii (y j (x 3"> ) + y j (x fc" x i? +y ^v x 3 )} (x i -x.)(x.-x k )( V x i ) Proof If an ellipse is to pass through the three points X., X., and X, , 1 J K. then it must be a member of the family of ellipses passing through X. and X, . Consider the arc of the ellipse connecting X. and X, as we k ik range over all possible ellipses through X. and X (see Figure ^.7) • This arc sweeps out the region bounded by x = x., x = x., and above the ± J limiting ellipse. For y. ^ y,, the limiting ellipse is a parabola with X K the equation x = Py 2 + Q , - X PSE where and Figure k.l V x i 2 2 y k~ y i y k- y i/ For y. = y it is the line X K. y = y ± = y k The point X. is on one of these ellipses if and only if it is in this region, that is, if and only if x. < Py. + Q for y k - y. > , x. > Fy + Q J J for y k - y ± < , v . > y. for y k - y k = . (See Figure k .8) 86 Equivalently, we have ( V X i ) 2 2 (x -x ) < (y.-y,) for y - y > , J x (y^-y?) J k x (V*^ 2 2 (x j- x i ) > 7-2-2: (y j- y i> for y k - y i < ° > (y k -y i ) < (y^) for y k - y± = , which can be written (x r x i )(y k~ y ? ) < ( v x i )(y r y i } for all cases. To find the ellipse parameters d, c2, and a2 for an ellipse through \., \., and \, we require that they satisfy the three constraint 1 3 x equations: (a ^ 2 2 (d-x. ) y } a2 + a2 - c2 = ' (a \ 2 2 (d-x ) y. pi a + a — - 1 y a2 a2 - c2 ' 2 2 (d-X k) y k ^ j a2 + a2 - c2 = " Assume that \. , \., and \_ satisfy the existence criterion. Then, \. must lie in the region described above (see Figure ^.8). We can establish three cases: Case . Cf y k -y. , then y .-.v. Case II. It' Y k ~y i < ! n y i" y K ' ie in. If y k -y i ■ 0, then y,-y k y.-y. > 0. *i •Xi CASE I CASE n CASE HI Figure 4.8 Case I. We know from Section k.J that if y. -y. 4 0, we can use J k i constraint equations 1) and 5) to get an expression for a2 in terms of d. We have a2 = (d - ( \ +x, V x i y k +y i s , , . A +X i V x i y k" y i 2 y, -y. ^k J i ■))(d - yi + y- J k ' i )) (d-P 13 )(d-Q 13 ) Likewise, since y. -y. / 0, we can use constraint equations l) and 2) 88 to get x.+x. x.-x. y.+y. x.+x. x.-x. y.-y. a2 = (d- (^-V^( d - ^"^y^ = (d-P 12 )(d -Q^) . Setting the two right-hand sides equal and solving for d, we get (d -P 13 )(d -Q 15 ) = (d -P 12 )(d -Q^) , and ^3^3 " P 12 Q 12 Plugging in the proper expressions, we get 2, 2 2, 2/2 2v 2, 2 2, d _ l yj^j-V +y j ( V X i } +y k ( V X j ) " y^x -x^ + yj(x k -x.) + y k (x.-x.) (Notice that a positive denominator is equivalent to the existence criterion. ) If we write d = — — and plug this back into the equation d K a2 = (d-P 13 )(d-Q 15 ) = d 2 - (P 13 +Q 13 )d+P 13 Q 15 , we set a2 . a 2 - I (p 15 +Ql5 ) I + P^ ^(P 13+ Q I3 )j-gP 13 Q 13 K 2K the proper expressions ind rearrani Li ms, wo gi a2 m i _ fojVV*^ ' ^V*!* y ? (x fV ' y j ( W ' y k' l- x j 5 From Section U o we know that since y / y. we can also use constraint K 1 equations l) and 3) to get 2 2 V x i i y k" y i (d-(JL-i + i^— i)) 2 2 X-x. c2 = a2 L_i_ X, +x. (y k- y i } a2 f 1 — — ( 2d ( VX .). 
( ^)) Again using d = — — , we have 1 J 2 K' (y k" y i )K c2 = a2 I 1 - K ((x k -x i )j -(x^-x^)K) which yields c2-a2fl ^V**' IJ^VV + y k ' + v i- X.ell at that point. That is to say, the associated ellipse through X. contains the positive hull in the closure of its interior. From Theorem k .3 we know that the only local minimum of r(\.,d, c2) occurs J at (d, c2) = (x.,-y.). The associated ellipse is the degenerate ellipse J J with foci at X. and X.. If H contains more than one eigenvalue, say J J K ^ X., then K must lie outside the degenerate ellipse. Thus, we have r(W-yj> > r(X.,x.,-y^ , which proves the lemma. We now know that, the mini -max solution is either a pair-wise best point or a three-way point. Tlieorem l4-.ll A pair-wise best point whose associated ellipse contains the positive hull, H + , in the closure of its interior is the mini-max solution. If no such pair-wise best point exists, then the mini-max solution can be found among the three-way points whose associated ellipses contain the positive hull. Proof Suppose the pair-wise best point associated with X. and X , call it (d., ,c2., ), is such that the associated ellipse passing through X. ' jk' ok J and \, contains H + . Then, we have k ' 92 r( W c V = ™* + r( W c V min max r(\.,d,c2) (d, c2)eR A..eH + X min max{r(\.,d,c2),r(\,,d,c2)] (d,c2)eR 3 K = r( VV c2 ^ ? which proves the first result. The last equality is a result of Lemma J+.10 and the Alternative Theorem. The mini -max problem over two functions must have a pair-wise best point for its solution. Suppose that no pair-wise best point qualifies. From the Alternative Theorem and Lemma 4.10, we know that the mini -max solution is a three-way point whose associated ellipse contains the positive hull. This proves the theorem. To find the mini-max solution, one must systematically take each pair of eigenvalues in the positive hull and find the pair-wise best point. If the associated ellipse contains the positive hull, then the point is the mini-max solution. If no pair-wise best point qualifies, one must take each combination of three eigenvalues and look for a three- way point. If it exists and its associated ellipse contains the positive hull, the point is a candidate. The mini-max solution is the three-way candidate that yields the smallest convergence factor, r(\, d, c2) . Notice that those eigenvalues in H 4 " which are involved in some combination of three eigenvalues that produced a three-way candidate are exactly the key elements described in Section .. . [n bl ' - 3e be mini -max solution, the key « t,s are determined. Thus, we may discard those eigenvalues in H" f which are not key elements Reducing the number of eigenvalues in If' redu tie number- combinations of three that must be tried in subsequent searches for a mini -max solution. 9k 5. ADAPTIVE PROCEDURE The algorithm for finding optimal iteration parameters developed in Chapter 2 and Chapter 3 depends upon knowledge of the spectrum of the matrix A. In practice, however, little is known about the spectrum of A. In this chapter a procedure will be developed for estimating the hull of the spectrum from information acquired during the iteration, allowing dynamic improvement of the iteration parameters. Section 5.1 will show how eigenvalue estimates can be extracted from the sequence of residuals. Section 5-2 will show how these estimates can be used to build an approximate hull of the eigenvalues of A. 
5-1 Modified Power Method Suppose that the iteration parameters d and c2 have been chosen from some prior knowledge of the matrix A. After n steps of the Stiefel Iteration based upon d and e2, the error can be expressed as e = P (A)e_ . n n Multiplying this expression by A, we can write the residual as r - P (A)r n . n n Suppose that r can be written as a linear combination of the eigenvectors of A; then, we have r_ = Z (a. v. + |m | > jm. | for i J d, s . 1 d — ' s ' i ' ' ' Then, for large n the residual can be written as n, v _n, v n, v _n, N r = m (a,v, ) +m, (a,v, ) +m (a v ) +m (a v ) +€ , n add d v d d s v s s s v s s n where c is small relative to the other terms. The residual is nearly n J a linear combination of the eigenvectors associated with the dominant convergence factors. Likewise, we have 96 n+1, v —n+1, v n+1, N —n+1, >, r • = m, (a, v..) +m, (a,v,) +m (a v ) +m (a v ) +e . n+1 d dd' d v dd y s s s' s ss y n+1 n+2, s -n+2, \ n+2, N -n+2, v r = m (a,vj +m, (a,v,) +m (a v ) +m (a v ) +e „ , n+2 d dd d dd s s s s s s n+2 ' n+3 / n _n+3 / \ n+3 / \ _n+3 / \ r = m , (a,v,) +m, (a,v,) +m (a v ) +m (a v ) +e , . n+3 d d d' d dd s ss s ss n+3 n+4, v _n+4, s n+4, N _n+4, N r ,,, = m, (Q 2 -c2) 2 1 d +(d 2 -c2) 2 This function maps the members of the family *£(d, c) onto circles centered at the origin (Section 2.1). The radius of the circle is !m(\) j = r(\,d, c2) , the convergence factor associated with the ellipse. The region of convergence in the \-plane, shown in Figure 5-2, is mapped injectively onto the region in the m-plane shown in Figure 5.2. Since the branch 100 X-PLANE m- PLANE Figure 5.2 of cosh" " with positive real part is used, we have |M(\) cosh" (— : — ) -cosh" (-) ) \ c V C > e ■cosh (-) <§)♦ ((f) 2 -ifi c2 d +(d-c2) Let g = d+(d 2 -c2) 2 ; thou, we can write with the restriction ^ = d - - (^ ( -) , - g If poor separation of the eigenvalues of M(A) causes the modified power method to yield an eigenvalue approximation m such that Iml < |C2 g then it must be discarded. This map is conformal, but it is not linear. If m lies in the convex hull of [m. ] , X does not necessarily lie in the convex hull of [X. } . An underestimation of the imaginary part of m. may cause an over estimation of the real part of d-\. . This warping is slight, however, and shows little effect in practice. 5.2 Procedure and Example A four step procedure for dynamic improvement of iteration parameters is outlined below. 1. Choose initial d and c2. The initial choice of d and c2 must be based upon prior knowledge of the matrix A. In Section 1.2, crude bounds were established which gave rectangular regions known to contain the spectrum of A. These regions could be used to choose initial d and c2. If the system were scaled so that each diagonal term of A had the value 1, then d = 1, c2 = 0, representing circles centered at 1, could be used. 102 2. Iterate. Perform a predetermined number of steps of the Stiefel iteration based upon d and c2, and store the last five residuals. The number of steps should be large enough to insure separation for the modified power method. 3- Get eigenvalue estimates. Using the modified power method described in Section 5«1> obtain eigenvalue estimates. Add the new eigenvalue estimates to any previous eigenvalue estimates and form the positive hull. k. Choose new d and c2. Using the algorithm outlined in Section h.^, find the optimal d and c2 for this set of eigenvalue estimates. Repeat 2, 3, and k. These steps will be referred to as a cycle . 
To illustrate how this procedure might work, suppose the hull of the eigenvalues of A is as shown in Figure 5-3 • Suppose d and c2 have been chosen as indicated in Figure ^.h. Several members of ^(d, c) are also shown in Figure ^.k. It is clear that X and X are the dominant and subdominant eigenvalues for this choice of d and c2. After a sufficient number of steps of the Stiefel iteration, the modified power method will give eigenvalue estimates p and p shown in Figure 5«5« If we let d+c and d-c also be eigenvalue estimates, then the approximate hull is as shown in Figure 5-5. A choice of d and c2 based upon this approximate hull yields the d and c shown in Figure 5-6. Some members of the new family *%-(&, c) are shown in Figure 5-6, and it The eigenvalue estimates were chosen for illustrative purposes. The values of d and c2 were computed from these values, and the figures were plotted by the IBM CALCOMP Plotter at the University of Illinois. 4 3.. 2.. \1 1 — 5 — t — & -1 .. -2.. -3.. -4.. [7 8 9 10 Figure 5.3 Figure 5X 101+ Figure 5 '5 Figure 5.6 is clear that X r and \ f are the dominant and inant eigenvalues for this choice of d and c2. After a sufficient number of steps of the Gtiefel iteration, the modified power method will give eigenvalue approximations p an co U) UJ UJ q: K INPUT VARIABLES: A, b, x, d, C2 N, ND.ICYCLE IMAX, ERBNO COMPUTE FIRST RESIDUAL r = b-Ax DX Dx = -±-r a PERFORM ICYCLE STEPS OF STIEFEL ITERATION, SAVING THE LAST FOUR RESIDUALS i TEST TO HALT STOP ADAPTIVE PROCEDURE F d AND C2 HAVE CHANGED THEN RESTART STIEFEL ELSE RESUME STIEFEL Figure 6 . 1 The initial choice of d and c: must also be par.. i a common statement. This choice can be based on rough bo ■ those established in Section 1.2. A poor choice of d and c2 will cause the first cycle to be spent in obtaining eigenvalue estima 1 (Section 5 .2). Care should be taken to avoid a choice that causes overflow before the end of the cycle, at which time the program will adjust itself. The positive hull of the eigenvalue estimates is stored in the doubly linked list CH(25,2) with link array ICH(25,2). The foci d+c and d-c associated with the initial choice of d and c2 are used as initial eigenvalue estimates. For that reason d and c2 should be chosen so that d+c and d-c lie inside the convex hull of the eigenvalues of A. If the user wishes to supply a priori eigenvalues, he must initialize CH and ICH in the subroutine IWIT (Appendix A) . The number of iterative steps per cycle must be set by the input parameter ICYCLE. In practice, little difference was found for values of ICYCLE from 15 to 50. The value 20 was used in most of the results that follow. The other input parameters are IMAX, the maximum number of iterations allowed, and EKBND, the acceptable bound on the i -norm of the residual vector. Since the hull of the eigenvalues of A is found by the adaptive procedure, a bound on the modulus of the smallest eigenvalue is available. This may be incorporated in the error bound by replacing EKBND by EKBND* (D-DSQRT(A2)) in the test to halt in subroutine TCHEB. The adaptive procedure, called from subroutine TCHEB, is broken into four parts (see subroutine ADAPT, Appendix A). The first 110 part, subroutine LSTSQ,, involves' finding the least squares solution to the system where the columns of S are the previous residuals (Section 5.1). Subroutine LSTSQ converts this system to the equivalent system T T S S, i * Q, n = - S R . . 
4X4 4X1 4X1 Since the residuals may become small, this system is multiplied by a T T constant so that S S(l, l) = 1.0. The matrix S S is symmetric, so it can be computed by taking 10 inner products. Likewise, 4 inner products are T required to compute -S R. The normalized system is, in general, less stable than the large system, but results have shown that this instability does not warrant the extra work involved in solving the large system. The normalized system may be solved in many ways. If one T eigenvalue of M(A) dominates strongly, however, the matrix S S may be singular (Section 5.1), in which case a least squares solution is desired. For that reason a Bidiagonalization method with reorthogonalization was used to solve the 4 x4 system (Golub and Kahan [11]). The subroutines for the Bidiagonalization method appear in Appendix B. Subroutine EVS finds the roots of the polynomial associated with the least squares solution Q and converts them into eigenvalue estimates (Section 5*1) • Any root finding routine may be used to find the roots of the fourth degree polynomial. A library routine, RSSR, from the University of Illinois F0RTU0I Library was used by the author. A listing of RSSR appears in Appendix C. The subroutine HULL adds the new eigenvalue estimai the previous eigenvalue estimates and forms the positive hull (Sectio.-. [f none of the new eigenvalue estimates are members of the po hull, control is returned to the subroutine TCIIEB, and the iteration is resumed. The subroutine ELLIP finds the best ellipse enclosing the positive hull (Section ^.5)« If the best ellipse is a three-way ellipse, this subroutine also determines the key elements of the positive hull and discards the others (Section 3-2 and Section k.k). Finding a pair-wise best ellipse involves finding the root of a fifth degree polynomial in a certain interval (Section h.J>) . This was done by the routine ZEROIN, which appears in Shampine and Allen [l8]. A version adapted for the purposes here appears in Appendix C. 6.2 Competing Algorithms Two competing iterative methods for solving the nonsymmetric linear system Ax = b are the Bidiagonalization method (Golub and Kahan [11], Paige [17]) and the method of Conjugate Gradients applied to the T T equivalent system, A Ax = A b (Hestenes and Stiefel [12]). The Tchebychef algorithm requires more storage than the other two in that four previous residuals must be retained for the adaptive procedure. The other methods require more work per iterative step. Table 6.1 shows a comparison of the work per iterative step and storage requirements of the three methods. 112 Table 6.1 mul/step add/step STORAGE TCHEB (ND+2.7)N* (nd+3-7)n (ND+9)N BIDIAG (2ND+7)N (2ND+5)N (nd+5)n CG on A T A (2ND+6)N (2ND+6)N (ND+5)N Hero N is the dimension of the system, and ND is the number of columns of length N required to store the matrix A. If A is a 5-point difference operator, then ND = 5* Finite element matrices usually require more storage. For instance, when cubic shape functions are used in two dimensions, we may have ND = 33 and in three dimensions we may have ND = 135 (Zienkiewicz [27]) • As the storage required for the matrix A becomes greater, the extra storage required by the Tchebychef algorithm becomes less of a factor, but the work required by the other methods remains twice as much as that required by the Tchebychef algorithm. 
Both the Bidiagonalization method and the method of Conjugate T T Gradients on the equivalent system, A Ax = A b, are sensitive to the T condition of A A, whereas the Tchebychef method is sensitive to the condition of A. Because of this advantage and the advantage of less work per iterative step, the Tchebychef method was considerably faster than the other methods on a series of test problems. In all of the tests that were run, the Bidiagonalization method was slightly better than the method of Conjugate Gradients. For that reason results are shown for the Bidiagonalization method only. The adaptive procedure in the Tchebychef algorithm requires l^N multiplications and additions (Section 6.1). If an adaptive procedure is carried out every 20 steps (ICYCLE = 20) then the average overhead is .7N per step. Results The convergence properties of the Tehebychef algoritlu nd only upon the spectrum of the matrix A and not upon any special zero structure. It is desirable, however, to test the algorithm on easily constructable systems whose spectrums have known properties. Consider the second order linear differential operator on a a rectangular domain: - |- (A (x,y) A) . ± ( A (x,y) A) + B _(x,y) A + B p (x,y) |- + Q(x,y) , dx J- dx dy d dy L dx d dy where the functions A , A p , B, , B p , and Q are continuous and dif ferentiable This operator can be broken into two parts: - |- (A (x,y) ±) - A. (A (x,y) f ) + Q(x,y) , dx L dx dy d dy and yx,y) I + B 2 (x,y) A. With suitable boundary conditions, the first part is a self-adjoint operator while the second is not (Dunford and Schwartz [^]). Accordingly, if we approximate this operator by a finite difference operator on a regular grid, the first part gives rise to a symmetric matrix, and the second part gives rise to a nonsymmetric matrix. Let h be the grid size and (x.,y.) be the grid points with the standard left-to-right, down-to-up ordering of node points. Using central differences and Dirichlet boundary conditions, let A be the 5-point difference matrix associated with the differential operator. The matrix A can be written in two parts: A = M+N, corresponding to the two parts of the differential operator. The matrix M is symmetric and, if k is the number of grid points per column, can "be written in block form as 111* M \ 1 M2 , 'k-1 'k-1 \ The M. 's are tridiagonal and the C. 's are diagonal. If l is the number of gridpoints per row, we have and M. = 1 li 2i 2i '2i "3i " b £-li ' Vii a n '2i n where a i,j -7? lA i (x i "l' 3 ^ " \ (x i 4 ?' y j } + A 2 (x.,y. -§) + A 2 (x.,y. + §) } + Q(x.,y.) , 1a/ h x 1 A / h \ C ij = ^2 A 2 ( V y j + 2> The matrix N is nonsymmetric and can be written in block form as N = N F 1 1 F N F 2 2 2 -F N 3 3 k-l ■ F. N. k k Again, the N. ' s are tridiagonal and the F. 's are diagonal. We have N. r a li 1 d£1 d 21 3i "£-li *-d . n6 and F. = l "2i where d. . = -prr- B- (x.,y.) , f iJ " 2h %^'j) a) Nonsingular 5-point Difference Matrices If A (x,y), A (x,y) > and Q(x,y) > 0, then M is a positive definite matrix. If B- (x,y) and B p (x,y) are constant functions, then N is antisymmetric (Varga [22]). From Section 1.2 we know that the eigenvalues of A = M+N lie in the right half plane. In particular, let A 1 (x,y) = Ag(x,y) = 1 , Q(x,y) = , B 1 (x,y) = B 2 (x,y) = P , on the square region < x < 41, < y < kl , with grid length h = 1. Then, A is a 1600 xl600 matrix and L17 M, = -1 ' . '-1 ■ 1 ' k c. 1 -1 -1 N. l 2 F. 
= 118 The eigenvalues of the positive definite matrix M lie in the open interval (0,8), while the eigenvalues of the antisymmetric matrix N lie in the interval (-23i,2(3i) along the imaginary axis. If X is an eigenvalue of A, it is contained in the rectangular region Re(\) e (0,8) , Im(\) e (-23,23) . The Tchebychef algorithm was tested against the two competing methods for values of 3 ranging from .1 to 40 (see Table 6.2). Figures 6.2(a) -6.10(a) show the hull of the eigenvalue estimates computed by the Tchebychef algorithm and the best ellipse enclosing the approximate hull. For small 3 the matrix A is nearly symmetric which yields a hull close to the real axis (see Figure 6.2(a)). For large 3 the matrix A is nearly antisymmetric which yields a nearly verticle spectrum (see Figure 6.10(a)). Table 6.2 Figure # (3 J.XJ.J- 1 d OJ.O.X 1 c ■d -iiiaj. 1 C Convergence Factor Rate of Congergence ICYC 6.2 .1 4.0 3.87 1+.69 k.66 .9075 .04215 20 6.3 .4 4.0 3.87 If. 01 3.87 .9502 .02217 40 6.4 .8 k.O 3.9^ 3.31 ' -9737 .01155 20 6.5 2 k.O 3.93 3.09i .9571 .01906 20 6.6 4 k.O 4.05 8.86i .9558 .01964 20 6-7 8 k.O 15i 3.85 l8.44i • 9324 .03039 20 6.8 10 k.O l4.l4i 4.05 20.l8i .9494 .02254 20 6.9 20 k.O 31.62i k.2k 40.391 • 9595 .02024 20 6.10 4o k.O 75.00i 5.28 85.281 .9769 .01015 20 2 \i -ll -2._ 1 2 § S 5 I 5 = % 9 'lO Figure 6.2(a) Best ellipse for 3 = .1 100.0 150.0 200.0 250.0 300.0 350.0 HORKt NUMBER OF TCHEB ITERATIONS UOO.O = k-0 128 Figures 6.2(b) -6.10(b) show a comparison of the error suppression of the two methods, Tchebychef and Bidiagonalization, versus the work required. Error suppression was measured by the log of the relative error; that is, by tog.(||e ||/||eJ|) where ||e || is the I - th norm of the error vector at the n step. Work was measured in terms of the number of Tchebychef iterations. Recall that Bidiagonalization required about wice as much work per iterative step. The initial choice of parameters d and c2 was based on the rectangle known to contain the spectrum of A. In some cases a rather poor choice was used to show that the adaptive procedure would still work. Table 6.2 contains the initial parameters as well as the computed o optimal parameters for each test. (Recall that c2 = c .) The convergence factor associated with the best ellipse and rate of convergence are also given in Table 6.2. (The rate of convergence is -tog. of the convergence factor. ) T For A nearly summetric, the condition of A A is significantly worse than the condition of A. Since Bidiagonalization is sensitive to m the condition of A A, it did rather poorly for nearly symmetric A (see Figures 6.2(b) -6.4(b)). For large p the two methods were more comparable, but the Tchebychef method still held a significant advantage (see Figure 6.10(b)). For these tests the mesh space used was h = 1. The symmetric part of A is multiplied by a factor of l/h , and the antisymmetric part, associated with the first order terms, is multiplied by a factor of l/h. For smaller h the matrix would be more nearly symmetric, the type of system for which the Tchebychef algorithm holds the greatest advantage. In some tests the region of convergence associated with the initial choice of parameters d and c2 did not include the entire spectrum. The error grew in the first cycle. Eigenvalue estimates were extracted, however, and the solution vector was set back to the initial guess. No progress was made toward the solution, but better values for d and c2 were found (see Figures 6.6, 6.8, 6.9). 
In the test with 3 = k (see Figure 6.6) two cycles were required before enough information was obtained to choose a d and c2 that produced convergence. In the tests in which the region of convergence associated with the initial choice of parameters d and c2 contained the entire spectrum, the very rapid initial drop in error was due to the suppression of eigenvalues near the center of the region. The asymptotic rate was achieved after this initial drop (see Figures 6.2, 6.3, 6.k, 6.5, 6.7). b) Singular Problems Suppose the matrix A has eigenvalues in the open right half plane except for an eigenvalue at X = 0. The Tchebychef algorithm can be applied to this case. Since the scaled and translated Tchebychef polynomials are all normalized at the origin (P (0) = l), the component of the initial guess in the direction of the null space of the operator A will not be altered by the Tchebychef algorithm. Indeed, if the parameters d and c2 are chosen to fit the convex hull of nonzero eigenvalues, then a solution will be found containing the component of the null space present in the initial guess. The error in the direction of the nonzero eigen- vectors will be suppressed as usual. A problem would arise, however, if the adaptive procedure were to yield an approximation to the zero eigenvalue . In this case the value 130 nearly zero would be admitted to the approximate hull and cause a choice of d and c2 that would yield a very slow rate of convergence. However, the eigenvalue estimates are based upon the successive residuals, and if the target vector b is in the range of A, then the residuals are orthogonal to the null space of the matrix A. Since the residuals contain no eigenvectors associated with the zero eigenvalue, the adaptive procedure will not yield an approximation to the zero eigenvalue. To test the Tchebychef algorithm on a singular system, consider the differential operator above with Neuman boundary conditions. If Q,(x, y) = 0, then the constant function is a solution of the homogeneous problem (Mx, y ) T-) - t- (M x >y) r-) + B i (x > y) r- + B ? (x > y) f-W>y> = 1 dx dy d ay L dx e dy/ If we approximate this operator on the same grid as in part a), the matrix A will be the same except that the diagonal must be altered so that the constant vector is in the null space of A. As before let A 1 (x,y) = A 2 (x,y) = 1, B x (x,y) = B 2 (x,y) = (3 . Tests were run with 3=1 and 3 = 10. Figures 6.11(a) and 6.12(a) show the approximate hull of the nonzero eigenvalues and the best ellipse enclosing it. Figures 6.11(b) and 6.12(b) show the error suppression of the two methods, Tchebychef and Bidiagonalization, versus the work required. Error was measured orthogonal to the null space of A. Figure 6.11(a) Best ellipse for 3=1 100. 150.0 200.0 250. 300. WORK: NUMBER OF TCHEB ITERATIONS 350.0 400.0 Figure 6.11(b) Work required for 3=1 132 Figure 6.12(a) Best ellipse for p = 10 0.0 60.0 100.0 IH.0 200.0 250.0 300.0 HOnKt NUN8CR OF 7CHEB JIEROTJONS 3S0.0 uoo.o Figure 6.12(b) Work required for $ = 10 Starting values for the parameters are listed in fable I .' aa ./° ►-I.O. »— 1 1 u.3r \(.7) n.o\ CM 1 i 1 1 1 — — i 1 1 0.0 20.0 40.0 60.0 80.0 100. NUMBER OF ITEWTTJ0N3 120.0 140.0 160.0 Figure 6.1^(b) Work required for indicated value of a and (3 = k of Figures 6.13(a) and 6.14(a) with the ellipses of Figures 6.3(a) and 6.6(a). It is clear from Figures 6.13(b) and 6.14(b) that the conditions of the systems have been greatly improved. 
Compare the rates of convergence for the same matrices in Table 6.2 and Table 6.4 Figure # 3 a Factor Convergence ICYCLE 13 . .4 .3 -7897 .1025 20 13 .4 .7 .6185 .2086 20 13 .4 1.0 .4371 -359^ 20 Ik k .3 .2479 .6057 20 14 4 .7 .5371 -2699 20 14 4 l.O .8319 .0799 " 20 Little is known about the effect of the parameter a upon the spectrum of (A+B(a))~ A for nonsymmetric A (for symmetric A see Dupont [5])- It is apparent from the results that the condition of the matrix (A+B(a))~ A is very susceptible to the choice of a. Notice how the ellipses in Figures 6.13(a) and 6.14(a) differ with different a. The accuracy of these ellipses is questionable, however, as convergence occurred too quickly to gain adequate information about the eigenvalue structure . The greatly improved condition and rapid convergence achieved with SIP more than justifies the extra work per step and presents the explanation of the role of a as an interesting open question. 138 d) Nonhomogeneous Region In order to test the algorithm on a more difficult system, the region < x < k±, < y < kl was broken up into subregions (see Figure 6.15). The functions A 1 (x,y), A g (x,y), B 1 (x,y), B (x,y) we: given values as follows: ;re Region i \ A 1 (x,y) = 10 A 2 (x,y) = 10 B 1 (x,y) = B 2 (x,y) = Region B A ] _(x,y) = 10 A 2 (x,y) = 10 B 1 (x,y) = 10 B 2 (x,y) = 10 Region C A ] _(x,y) = 100 A 2 (x,y) = 1 B 1 (x,y) = 10 B 2 (x,y) = Region D A 1 (x,y) = 1 A 2 (x,y) = 100 B x (x,y) = B 2 (x,y) - 10 A grid with mesh space h = 1 and Dirichlet boundary conditions were used to generate the associated 5-point difference operator, A, of dimension l600. The adaptive Tchebychef algorithm was used on this system, both by itself and with SIP. The results are shown in Figure 6.l6. Figure 6.16(b) shows the approximate hull of the matrix A. Figure 6.16(a) shows the approximate hull of the matrix (A+B(a))~ A with a = -5- Figure 6.16(c) shows the error suppression of the Tchebychef 41 40 35 30 B 25 CO < 2° I 15 10 D L 5 10 15 20 25 30 35 40 41 X- AXIS Figure 6.15 llj-0 1_ -1 o £~~~~~* *- 10CL -1001 100 100 500 ot Figure 6.16(a) Spectrum of (A+B(a)) _1 A Figure 6.16(b) Spectrum of A 60.0 90.0 120.0 150.0 ItO.O 218.0 WORK! NUMBER OF 1CHEB HERAT] ONS THO.O 210.0 300. Figure 6.16(c) Work required 1M method, alone and with SIP, versus work. Hie work is measured in the number of Tchebychef iterations; the number of iterations wit] is multiplied by a factor of 12/7- Notice that the entire spectrum of A is transformed into a spectrum closely bunched about \ = 1. The condition of the system is greatly improved. The convergence factors and rates of convergence of the two systems are given in Table 6.5« • Table 6.5 Initial ■d c 1 Optimal *& c 1 Convergence Factor Rate of Convergence ICYCLE TCHEB ko 201.61 198.11 .998^ .00070 20 TCHEB +SIP 1 .7072 .6509 .78I+O .IO56 20 In a problem such as this, as much is known about the spectrum of (A+B(a))~ A as is known about the spectrum of the matrix A. The initial choice of parameters, as well as the computed optimal choice of parameters, is given in Table 6.5. In both cases the adaptive procedure was able to recover from poor initial parameters and produce good estimates of the optimal parameters. The choice of a = -5 was quite arbitrary. With the value a = 1.0, the matrix (A+B(a))~ A had an eigenvalue with negative real part, The improvement in the condition of the system with a = .5 is significant enough to warrant further study into the role of a in this factorization. 
ll+2 6.k Summary The adaptive Tchebchef algorithm has qualities that promote its use on nonsymmetric systems whose eigenvalues lie in the right half plane. The method does not depend upon any special structure of the matrix. It can be used on finite element matrices and difference matrices, as well as in conjunction with factorization methods. The Tchebychef algorithm requires less than half of the work per iterative step that is required by the two competing methods: T Bidiagonalization and Conjugate Gradients on A A. The additional storage required by the adaptive procedure of the Tchebychef algorithm becomes less of a factor as the number of diagonals needed to store the matrix A increases. In the tests that were run, the Tchebychef algorithm was considerably faster than the competing methods. This is due, in part, to the fact that it is sensitive to the condition of A, whereas the T competing methods are sensitive to the condition of A A. This advantage was especially apparent on nearly symmetric systems. Another quality of the Tchebychef method is stability. The Tchebychef algorithm makes steady, predictable progress toward the solution. At each step the error is multiplied by a factor that is no greater than the convergence factor. This factor is determined by the choice of parameters d and c2 and the spectrum of the matrix. Any round-off error will be multiplied by this factor on the next step. The competing methods, on the other hand, depend upon orthogonality and A-orthogonality T (A A-orthogonality) . The rate of error reduction of these methods is unpredictable at best, and error reduction may become very slow if orthogonality breaks down (Hestenes and Stiefel [12]). The adaptive procedure was shown to be able to produce go estimates of the optimal iteration parameters with virtually no a priori knowledge of the spectrum of the matrix A. Although the stability of the adaptive procedure could not be guaranteed (Section 5-1), in all of the tests the procedure behaved as expected. All of the eigenvalue estimates lay inside the true hull of the eigenvalues, and, after good parameters were found, further eigenvalue estimates lay near the center of the approximate hull as predicted. This may be due in part to the absence of nonlinear elementary divisors of large dimension in the test matrices. Further refinement of the adaptive procedure may be necessary in the presence of nonlinear elementary divisors of large dimension. In addition to finding good iteration parameters, the adaptive procedure provides information about the spectrum of the matrix that may be useful to the user. One such use is in the choice of the error bound (Section 6.1) . The adaptive Tchebychef algorithm showed great promise when used in conjunction with SIP. Although the properties of the parameter a are not understood, the significant improvement in the condition of the systems tested showed the potential of this factorization method . The Tchebychef algorithm could be used in conjunction with other factorization methods as well. Y. J. Kim [Ik] has shown that some finite element matrices can be factored in much the same way as the 5-point difference operators are factored by SIP. It has also been suggested that the Tchebychef algorithm could be used in conjunction with Symmetric Successive Over Relaxation. In each of these factorizations, the adaptive property makes the Tchebychef method attractive because little is known about the spectrum of the equivalent system. IV, LIST OF REFERENCES [1] Birkhoff, G. and S. 
Maclane, A Survey of Modern Algebra , MacMillan, New York, 1953- [2] Bracha, A., "A Symmetric Factorization Procedure for the Solution of Elliptic Boundary Value Problems, " Digital Computer Laboratory Reports , Rep. No. 440, University of 111., April 1971. [3] Diamond, M. A., "An Economical Algorithm for the Solution of Elliptic Difference Equations Independent of User-Supplied Parameters," Ph.D. Dissertation, Department of Computer Science, University of 111., 1972. [k] Dunford, N. and J. L. Schwartz, Linear Operators , Interscience Publishers, New York, 1958. [51 Dupont, T., R. R. Kendall, and H. H. Rachford Jr., "An Approximate Factorization Procedure for Solving Self-Adjoint Elliptic Difference Equations," SIAM J. Numer. Anal. , Vol. 5, Sept. 1968, p. 558. [6] Dupont, T., "A Factorization Procedure for the Solution of Elliptic Difference Equations," SIAM J. Numer. Anal. , Vol. 5, Dec. 1968, p. 753. [7] Engeli, M., T. H. Ginsburg, H. Rutishauser, and E. L. Stiefel, "Refined Iterative Methods for Computation of the Solution and Eigenvalues of Self-Adjoint Boundary Value Problems," Mitteilungen aus dem Institute fur Angewandte Mathematik , No . 8, 1959"! [8] Faddeev, D. K. and U. N. Faddeeva, Computational Methods of Linear Algebra, W. H. Freeman & Co., San Francisco, 1963 . ' [9] Fox, L. and I. B. Parker, Chebyshev Polynomials in Numerical Analysis , London, Oxford University Press, 1960". [10] Go lub, G. H. and R. S. Varga, "Chebyshev Semi-iterative Methods, Successive Over Relaxation Iterative Methods and Second Order Richardson Iterative Methods, " Numerische Mathematik , Vol . 3, 1961, p. 114-7. [11] Go lub, G. and W. Kahan, "Calculating the Singular Values and Pseudo- Inverse of a Matrix," SIAM J. Numer. Anal. , Vol. 2, 1965. [12] Hestenes, M. R. and E. L. Stiefel, "Methods of Conjugate Gradients for Solving Linear Systems," N.B.S. J. of Res., Vol. k-9, 1952, p. 409- 146 [13] Hille, E., Analytic Function Theory, Vol. II , Ginn & Co., Boston, 1962, Ch. 16, pp. 264-274. [14] Kim, Y. J., "An Efficient Iterative Procedure for Use with the Finite Element Method," Ph.D. Dissertation, Department of Computer Science, University of 111., UIUCDCS-R- 73-600, August 1973- [15] Kincaid, D. R., "On Complex Second-Degree Iterative Methods," Center for Numerical Analysis, University of Texas, Austin, 1968. [16] Kjellberg, C, "On the Convergence of Successive Over Relaxation Applied to a Class of Linear Systems of Equations with Complex Eigenvalues," Ericsson Technics , No. 2, 1958. [17] Paige, C. C, "Bidiagonalization of Matrices and Solution of Linear Equations," SIAM J. Numer Anal. , Vol. 11, March 1974, p. 197. [18] Shampine, L. F. and R. C. Allen Jr., Numerical Computing: An Introduction , W. B. Saunders Co., Philadelphia, 1973 • [19] Stiefel, E. L., "Kernel Polynomials in Linear Algebra and Their Applications," U.S. N.B.S. Applied Math Series , Vol. 1+9, Jan. 1958, p. 1. [20] Stone, H. L., "Iterative Solutions of Implicit Approximations of Multidimensional Partial Differential Equations, " SIAM J. Numer. Anal., Vol. 5, Sept. 1968, p. 530. [21] Varga, R. S., "A Comparison of Successive Over Relaxation and Semi- Iterative Methods Using Chebyshev Polynomials," SIAM J. Numer Anal ., Vol. 5, 1957, P. 39- [22] Varga, R. S., Matrix Iterative Analysis , Prentice-Hall, Englewood Cliffs, New Jersey, 1962. [23] Walsh, J. L., Interpolation and Approximation by Rational Functions in the Complex Domain , Revised Edition, Colloquium Publications, Vol. 20, A. M.S., Providence, R.I., 1956. [24] Wilkinson, J. 
H., The Algebraic Eigenvalue Problem , Clarendon Press, Oxford, 1965. [25] Wrigley, H. E., "Accelerating the Jacobi Method for Solving Simultaneous Equations by Chebyshev Extrapolation when the Eigenvalues of the Iteration Matrix are Complex," Computer Journal , Vol. 6, July 1963, p. 169. [26] Young, D., Iterative Solution of Large Linear Systems , Academic Press, New York and London, 1971 [27] Zienkiewicz, 0. C, The Finite Element Method in Engineering Science , McGraw-Hill, London, New York, 1971. 1.1*7 APPENDIX A 148 C SUBRGUTTNE TCHE B < A, X , B, R , DX, XLA ST f P LA ST t S tCH , C 1 ERBND, ICYCLE, IMAX) C THIS SUBROUTINE SOLVES THE SYSTEM AX=B, RETURNING THE * C SOtUTICN, X. ^HE PCSI T IVE HULL CF tj-e EIGENVALUES OF A IS * C DETERMINED DYNAMICALLY AND RETURNED IN CH. * c * C * C THE USE* MUST SUPPLY THE SLBPCLTINE NSYMAX (Y ,X , A ) , * C WHICH PREFORMS THE MATRIX VECTG D MULT I PLL IC ATIGN : * C Y - AX. * C * C THE INPUT PARAMETERS APE: * C * C A(N,NC) THE MATRIX * C X(N) THE INITIAL GEUSS * C B(N) THE TARGET VECTOR * C * C EPENC THE ACCEPTABLE EFROR * C ICYCLE THE NUNBER CF ITERATIVE STEPS EETWEEN * C ADAPTIVE PPCCEDURES * C IT^AX THE MAXIMUM NUMBER OF IThRATINS ALLOWED * C * C CCMMON /APEAl/ N,NC : * C N T HE DIMENSION * C NC THE NUMBER CF COLUMNS NEECEC TC STCRE * C THE MATRIX A(N,NC) * C * C CCMMON /APEA2/ D,C2,A2,FC : * C D THE CENTER CF THE ELLIPSE * C C2 THE FCCAl LENGTH SCUAPED * C * c * C * C OTFER VARIABLES A«E: * C * C P (N) THE PESIDUAL * C DXCNJ THE CHANGE IN X AT EACH STEP * C XLAST(N) X AT THE START OF THE CYCLE * C RLAST(N) P AT THE START OF THE CYCLE * C S,2) LINK LISTING OF CH * C * C A2 PEAL AXIS OF THE BEST ELLIPSE * C CF CCNVERGENCE FACTOR OF THE * c c c c c C c c c c c c c c c c c c c c RQ P 1 n 2 PEST ELL IPSf FEMDU: NCPM C* RfcSICUAL RESIOU AT THk STAPT OF THE CYCLt IltPATlUN FAPAKETtPS * * * ft * * SURCCU T INE TCFFB(A,X,R,P,DX,XLAST,rLAST,S,CH, 1 EFBNC, ICYCLE, I"AX ) TMPIICI T REAL*fi( A-HfC-Z ) fC^CN /A9EAL/ N,ND COKNCN /AREA?/ D,C2,A2,FC COMMON /AP6A3/ I CM , I F I o ST , I F o EE CINEf^ TCN A(N l KOI l X(N)fE(M|ft(N) l DX(Nh 1 XL AST(N) ,PLAST(N) ,S(N,4) DI^ENSICN C"(25,2) , ICM 25, 2) BEGIN INITIAL IZE Cl-t ICI-.lFlPSTf IFPEE,A2 t FC CALL TNTT(CH) INITIALIZE ISTEP !STE° = CALCULATE ~FE INITIAL RESIDUAL 6 CALL N^YMAX( C ,X ,A) DO 8 1=1, N MI) = B(I) - R(I) C CCNTTNUF OUTPUT THE INITIAL PESIDU CALL r.L"fPL T (P,PSD,IS"> LP) SAVE TNI~IAL X ANC R PC = PSO CALL LASTX( X,P , PSO, > LAST, PL AST, PC) BEGIN STTEFfcl ITEFATICN TNITTALTZF OX AND P AP AMtT C r - S PI AND P2 LC DC 15 I = 1 ,N DX(I ) = "(I ) /D lb CCNTTNUE D l = 2.D0/D F2 = O.CO 150 c C *AI N LCCP 20 DO 40 I = l.ICYCLE C C STCRE LAST FCUP PESIOUALS IN S IFd.GT.ICYCLE-4) GC TO 25 GO TO 3C C THEN STCRE R 25 IC0L = ICYCLE-H-1 CO 26 K=l,N S(K f ICCL)=P(K» 26 CONTINUE C C CC^PUTE NEW X 30 00 32 J=1,N XCJ) = XU ) ♦ 0X(J ) 32 CONTINUE C C CGPFUTE NEW P CALL NSYMAX(P,X,A) DO 34 J=1,N R(J) = B(J ) - F(J ) 34 CONTINUE C C COMPUTE NEW PARANETEPS P2 = C2*P1/(4.D0*D - C2*Pl) PI ^= (l.CO ♦ P2)/D C C COMPUTE NEW OX OC 36 J=1,N CX( J) ■ P1*R(J ) ♦ P2*CX(J) 36 CONTINUE C C UPDATE ISTEP C C OUTPUT RESIDU ISTEP = ISTEP+1 : RESIDU CALL OUTPUT(P,R SO, ISTEP) 40 CCNTJNUE C C END NAIN LCCF C C END STIEFEL C C TEST ^C HALT TF(PSD.LT.EPBNQ) GO TO <30 IF( I r TEF.GE.IMAX) GC TO <3C c c f. c c c c c <*b 5C pl-GIN ADAPTIVE FRi.CECLPE CALL ADA°tE ITERATION PARAMETERS C ANO C2, * C T HE EXPECTED CONVERGENCE FACTCP, FC, ANC TFE NCPM OF P. 
* Q ********#*********************************»>:*****************#** IHPLTCT" REAL*8( A-H t C-2l CC^CN /APEAl/ N,ND CCHNCN /ARFA2/ C,C2,A2 t FC DINFNSJCN 2=PLAST(T) CONTTNUE PSC=PO GC tc 3 UPDATE XLAST AND PLAST PO = RSC DC 25 1=1, N XLASTU )=X(U RL AST(I )=P( I) CONTINUE ***** S * * ***** c c c c c C SUBPCUTINE INFPC(X,Y, FPCCN) ********************************************************** T HTS S'JBPOUTINE FINDS T HE INNEF PRODUCT CF THE TWO * VECTCP^: PRCD = * ********************************************************** IMPLICIT REAL*8( A-H,C-Z) DIMENSICN X(N) ,Y(N) BEGIN FRCD = O.CO DO 10 1=1, N PPCD = 1C CCNTINLE RETUPN ENC PPOO + X( I)*Y( I ) SlRPCUriNE TMT~E VALUES D*Ci C-C ANC TEE LLLIPM IS * C THE LINE BETWEEN THEM. * INFLICT"' PEAL*8 ( A-H,C-Z) r.OKMCN /AREA2/ C,C2,A2,FC Cf/VPCN /APEA3/ ICF, IFIPST, IFREE OIMENSICN CH<25,2) ,ICH(25,2) BECIN C c c c c c INITIALIZE TCH DO 5 1=1,25 IfH(I, I) 5 TCH(T ,2) I-L H-i 10 20 INITIALIZE CF,TFIPST, IFPEE, A2tFC IFIRST = 1 !F(C2.LE.0.0DC) GC TQ 10 ^C TC 20 THEN IF°EE = 2 ICM1 ,1) = 1 !CH(l,2) = 1 CHI 1,1) = D CHIi.2) = CSQRK-C2) A2 = C.COC CC TO 30 ELSE (C2.GT.0) IFPEE = 3 TCF(l t l) = TCH (I,?) = ICH(2, 1) - ICH(2,2) = CH(1,1) = CH( 1,2) = CF(2,1 ) = CH(2,2) = A2 = C2 1 2 1 2 0-DSCPTCC2) 0.000 D40SCRT (C2) C.ODC 30 FC = C.COC PETUPN END ±5k SL'BRCL C**#****** ** C THIS S C BASED ON T C VALUES, CE C MA T PEX S. £*********** IMPLTC CCMKIN COMMON DTMENS OIMEN^ C C C C C C c c c c c c c c TINE ADAPT(S,R ,CH, ****************** UBPOUTINE FINDS Th HE CCNVEX HULL CF NEGATED FROM THE R ****************** TT REAL*8( A-H,0-Z) /AREA1/ N,ND /APEA2/ C,C2,A2,F ICN S(N,4) ,R(N) TCN CH(25 y 2)tQ<4), IFLAG) ******************************** E BEST ITERATION PARAMETERS * A SET CF APPRCXIMATE EIGEN- * ESIDUAL VECTORS STORED IN THE * ******************************** 10 FINC THE LEAST SQUARE SO CALL LSTSC(S,C,R) EV<4,2) LUTION TO SC*R=C FINC T FE EIGENVALUE APPROXIMATIONS CALL EVS(C,EV,IRT) IF NO NEW EIGENVALUES RESUME STIEFEL ITERATION IF( IPT.EC.O) GO TC 5 GC TO IG THEN IFLAG = GC TO 2C ACC THE NEW APPPCXIMATIC AND FORM THE NEW CONVEX CALL HULL DIPENSTCN EV ( A, 2) , Q ( 4 ) BEGIN C c c c c c c c c c c c c c FINC PCCTS ***ANY LIBRARY ROUTINE MAY BE USED TO FIND THE PCCTS. ***7HE PCCTS SHOULD BE STUPED IN EV<4,2) FROM THE FRONT. ***FV(I,1) IS THE FEAL PART AND EV(I,2) THE INAGINAFY PART ***TFE VARIABLE I»T INDICATES THE NUMER CF &CCTS FCUND. *** P C SR APPEARS IN APPENDIX C. 10 AA( 1, 1) = l.CDO DC 10 T=l ,4 AA(H-1,1) = CONTINUE IOEG = 4 C( I ) 14 16 CALL RSSR(AA,P00TS»IDEG,1C, 15, 1. CD-4, 1. CD-6 ,DDI IPT = 4-IDEG CO 14 T=1,IPT FV( 1,1 ) = RCCTS(l,5-i ) EV( I ,2) = POCTS(2,5-I) CCN T INUE TRANSFORM THE ROOTS TO EIGENVALUES CF A G = D+DSCPT(C*D-C2) K=IRT H = CC 20 T=i,K T 1 = EV(I ,l)*EV(I,i)+EV(I,2)*EV(I,2» IF(Tl.LT.CABS(C2)/(G*G) ) GO TC 16 GC tc 13 THEN DISCARD RCCT IRT=IFT-! GC TC 20 EL C E FIND EIGENVALUE ESTIMATE c c le 20 22 24 30 ICC 200 30C 40 M =T I ♦ l FVUI,l) = C-.5CO*EVU,l)*(C2/(Tl*<")*G) EV(II ,2I»-.5CC*EV( I,2)*(C2/(Ti*G)-G) CONTINUE OITPLT NFU FIGENVALUE ESTIMATE^ TF< IFT.FC.O) CO TO 22 GC TC 24 THEN WTTE(6, 300) GC T C 40 El TF( GC TO 65 !CH(K ,2) ICF(K, 1) TEST. LE.C. CDC) GC ~0 7C THEN LNLINK CH(K f -J ICH(TCH( K,IJ,2) = TCH(TCH(K,2), 1) = TCH( K f 2) = IFREE TF^EE = K K = !CH(K,1 ) !F(ICH(K, D.EC.K) K = ICH(K,2) HC '0 60 EL^F *CVE tc NEXT K K= JCHlK f 2) GO to 60 9C PE^L'SN END l6o SUBPCUTINE ELLIP(CH) C THIS SUBROUTINE FINOS THE OPTIMAL ITERATION PARAMETERS * C FOP THE FCSITTVE HULL CH. 
IT RETAINS ONLY The KEY ELEMENTS * C IN THE HLLL AND OLTPLTS THESE, * C++****** 4 **+ 4+ 4*** *+*4+**X*+1 +*++*+++++*+++*+++++++++++++++++ IMPLICIT REAL*8< A-H,C-Z) COMMON /AREA2/ 0,C2 f A2,FC CCM^CN /AREA3/ ICF, I F IRST , IFREE DI ME f^SI CN CH (25,2) ,ICH( 25,2 ) ,ICCL( 25) C C EEGIN C C FIRST SEE TF ANY PAIPfclSE BEST IS THE CPTIHUP J = TFI&ST 10 IF( ICH( J, 2) .EC. J ) GC TO 30 K = J 15 XFUCMKt21.EQ.KJ GO TO 25 K = ICH(K,2) CALL TWOPT(CH,J,K,IFLAG) IF(IFLAG.EO.l) GO TO 15 C TFFN TWOPT FAILEC C C TEST TO SEE IF THIS PAIPWISE C BEST IS CPTIML. I = IFIPST C It IFCI.EC.J .CP. I.EC.K) GL TO 18 GC TC 2C C tj-EN SKIP TEST 16 GC TC 22 C 2 TEST=(C-CH( 1,1) )* IF(TEST.GT.l.0C-8 ) GC TC 51 GC TC 52 T EEN TES T FAILS GC TC 45 IF( ICH( I, 2) .EO. I ) GO TO 53 GC TC 58 THEN TEST WCRKS ICOL(J) = I ICCL(K) = 1 162 ICCL(L) = i IF(FC.LT.TFC) GCTO 54 GOTO 56 C THEN THIS POINT BEST SO FAR 54 TFC = FC TC = D TC2 = C2 TA2 = A2 56 GC TO 45 G C UPDATE I 58 I = ICF(I,2) GC TC 46 C END OF TEST C 6C K = TCH(K,2) GC TO 40 65 J = ICH(J,2) GO TO 35 C ENC OF SEA D CF C C T HE BEST THREE VALUE FCINT FIT HAS BEEN FCUNC 70 C = T C C? = TC2 A2 = TA2 FC = TFC C C SAVE ONLY KEY ELEMENTS I = IFIPST 8C K = 7CHU ,2 ) IF( ICGLd I. ECO) GO TO 82 GO TO 84 C THEN LNLINK CH(I,-) 82 !CH( TCH( 1,11 ,2) " ICH(I,2) ICHI ICHI I,2)tl) = ICFU, 1) ICH(I ,2) = IFPEE TF°EE = I 84 IFCK.EC.T ) GC TC SO T = K GC TO 80 C C OL^PLT NEW PARAMETERS ANC POSITIVE HULL 90 WOTTE(6,600) WFTTE(6,7C0) D,C2»A2,FC VPT^Et 6,8C0) I = IFIRST 95 WPTTE(6,900) CH ( I , I ) , CH ( I ,2 ) IF( ICH(I ,2). EC. I ) GC TO 99 c c T = ICH(I GC to 93 2) 500 60C 700 80C 900 FORMAT^ : FCKA'C FCPVAT( FC^MAK 012. 5, 3X, Ft)»MAT( t THF THE THE C = BtST . • ) 3X,'C2 = ■ ,012.5, 3>, 'Ai = • CPT IML IS A PAIFWlSfc NEW PAPA^FTEFS ARE : • ) •,D12.5, 'FC = ', C12.5) POSITIVE HILL:* ) FC»M T C •, 3X, 012.5, ' ♦I*',D12.5) 9S P F " I P N ENC 164 SUBROUTINE TWOPT(CH, J,K,TFLAG) C THIS SLBPCUTINE FINDS THE OPTIMUM ITERATION PARAMETERS * C FOP THE TWO EIGENVALUES: CH(J,-) ANC CH(K,-) * IMPLICIT REAL*8< A-H,0-Z> COMMON /APEA2/ C,C2tA2,FC CTKEf^SICN CM25,2) C C C C C c c c c SET THE CONSTANTS /* = (CH(Ktl)-CH(J, 1 ))/2.0CC B = (CH -CH ( L , 1 ) ) 1 *CF(L, 1)*CH(K, 1I*(CE(L ,1I-CH(K, II) 2 +CH< J, I )*CH2I+CSQCTU2-C2i ) / ( D+OCQPT 1=X*DSCRT(X*X+Z*Z*Z) Y2=X-0S0P"'(X*X*Z*Z*Z ) Y=DSTGMOABS (Yl )**(l .C0/3.C0 ),Yl ) + CS IGN( CABS < Y2 ) ** (l.DO/1. DC) ,Y2) A2=Y-M8*e*A*A )/( B*B*T C2=(A2*( A2-T*T-A*A|)/ FC = (OSOPT( A2H-DSCR *T) (A2-A*A) T(A2-C2))/(C+CSCFT(C*C-C2)) OE T LRN END SUBCClTJNfc F T f- T MA f H,S,T f IF LAG J THI<: SUePOJ T TNE FINDS T HE OPTIMAL !TL»A1ION PA°AMFTtRS * FOP TWC CC^PLEX EIGENVALUES. The EEST VALLF UF L IS FOUND * AS T HE PCO T OF A FIFTH OEGREE FCLYNCMlAL MTF COEFICrENTS * PliPE FTFTF DECREE PCLYNCMIAL "1 = -A*7/<: 02 = -A*S/T cj = $*~/A *4 = -9 Pl*fil*<»l« , 2.D0*F2*P3-4.D0*R4J+P2*IP2+R3-4.0C*P41 1 + P4*(4.D0*r.4-2.C0*P3) P2 *P 1 *< 4. DO *£4* , Alt PI Al = DSCFTUl) C C TEST Al IFUULT.EBNC .OF, ISTEP. EC. 
C     (FROM THE MODIFIED POWER METHOD ROUTINE OF SECTION 5.1)
C
C     TEST A1
      IF (A1.LT.EBND .OR. ISTEP.EQ.4) GO TO 32
      GO TO 38
C     THEN SOLUTION IS X = X-G*W
   32 G = B1*Z/(B1*W-T)
      DO 34 I=1,M
         X(I) = X(I)-G*W(I)
   34 CONTINUE
      GO TO 100
C     ELSE RE-ORTHOGONALIZE
   38 K = ISTEP+1
      DO 40 I=1,M
         V(I,K) = V(I,K)/A1
   40 CONTINUE
      DO 42 I=1,ISTEP
         CALL INPRO (V(1,K),V(1,I),P(I),M)
   42 CONTINUE
      DO 44 I=1,ISTEP
      DO 44 J=1,M
         V(J,K) = V(J,K)-P(I)*V(J,I)
   44 CONTINUE

C**************************************************************
C     FORMS  Y = (A-TRANSPOSE)*X                              *
C**************************************************************
      IMPLICIT REAL*8(A-H,O-Z)
      DIMENSION A(N,M),X(N),Y(M)
C     BEGIN
      DO 20 K=1,M
         Y(K) = 0.0D0
         DO 10 I=1,N
            Y(K) = Y(K)+A(I,K)*X(I)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END

APPENDIX C

      SUBROUTINE RSSR (A,ROOTS,DEGREE,M,MMAX,DELTA,EPSIL,D)
C     FINDS THE ROOTS OF THE POLYNOMIAL WITH COEFFICIENTS A
C     BY REPEATED ROOT-SQUARING.
      DIMENSION A(51,3),IA(51,3),ROOTS(2,50),ROMOD(50),
     1   NONRT(25),MNONRT(25)
      REAL*8 A,ROOTS,D,ROMOD,DELTA,EPSIL
      INTEGER DEGREE
      NCUR = DEGREE
    7 NL = NCUR
      CALL ROOTSQ (A,IA,NCUR,M)
      CALL REALRT (A,IA,M,NCUR,DELTA,EPSIL,ROMOD,IA,
     1   NONRT,MNONRT,NCO,ROOTS)
      IF (NCO.EQ.0) GO TO 12
      CALL COMPRT (A,IA,ROMOD,ROOTS,M,MNONRT,NONRT,
     1   NCO,DELTA,EPSIL,NCUR)
      IF (NCUR.EQ.0) GO TO 12
      IF (NCUR.EQ.NL) M = M+1
      IF (M.LE.MMAX) GO TO 7
      GO TO 13
   12 CALL RECON (ROOTS,A,NCUR,DEGREE)
   13 DEGREE = NCUR
      RETURN
      END

      SUBROUTINE ROOTSQ (A,IA,NCUR,MM)
C     APPLIES MM ROOT-SQUARING STEPS TO THE POLYNOMIAL WITH
C     COEFFICIENTS A; THE EXPONENTS ARE CARRIED SEPARATELY
C     IN IA TO AVOID OVERFLOW.
      DIMENSION A(51,3),IA(51,3)
      REAL*8 A,X
      N1 = NCUR+1
      DO 1 J=1,N1
         A(J,3) = A(J,1)
         IA(J,3) = 0
         CALL SCAL (A(J,3),IA(J,3))
    1 CONTINUE
      DO 9 M=1,MM
         DO 2 J=1,N1
            A(J,2) = A(J,3)
            IA(J,2) = IA(J,3)
            A(J,3) = 0.0
    2    CONTINUE
         DO 6 J=1,N1
            KM = MIN0(N1-J,J-1)
            IF (KM.EQ.0) GO TO 5
            DO 4 L=1,KM
               JL = J-L
               JLP = J+L
               X = A(JL,2)*A(JLP,2)
               LR = MOD(L,2)
               IF (LR.EQ.1) X = -X
               IX = IA(JL,2)+IA(JLP,2)
               CALL ADD (A(J,3),IA(J,3),X,IX)
    4       CONTINUE
            A(J,3) = 2.0*A(J,3)
    5       X = A(J,2)**2
            IX = IA(J,2)+IA(J,2)
            CALL ADD (A(J,3),IA(J,3),X,IX)
            CALL SCAL (A(J,3),IA(J,3))
    6    CONTINUE
    9 CONTINUE
      RETURN
      END

C     (FROM SUBROUTINE REALRT)
   20 NCUR = NCUR-1
      GO TO 21
   21 W = -W
      IF (W.GT.0.0) GO TO 14
      NCO = NCO+1
      NONRT(NCO) = KL
   22 CONTINUE
      RETURN
      END

      SUBROUTINE COMPRT (A,IA,ROMOD,ROOTS,M,MNONRT,NONRT,
     1   NCO,DELTA,EPSIL,NCUR)
C     FINDS THE COMPLEX ROOTS BELONGING TO THE MODULI ROMOD.
      DIMENSION A(51),IA(51),ROMOD(50),ROOTS(2,50),SR(48,3),
     1   ISR(48,3),SROMOD(47),SROOTS(2,47),MNONRT(25),NONRT(25),
     2   NSONRT(23),MSNORT(23)
      REAL*8 A,ROMOD,ROOTS,SR,SROMOD,SROOTS,W,Q,DELTA,EPSIL
      DO 23 I=1,NCO
         JA = NONRT(I)
         I1 = MNONRT(I)/2
         DO 11 J=1,I1
            CALL SUBRES (A,IA,NCUR,SR,ISR,ROMOD(JA))
            NSCUR = NCUR-3+4/NCUR
            J2 = NSCUR
            IF (NSCUR.NE.1) GO TO 9
            SROOTS(1,1) = -SR(2,1)/SR(1,1)
            NSCUR = 0
            GO TO 11
    9       CALL ROOTSQ (SR,ISR,NSCUR,M)
            CALL REALRT (SR,ISR,M,NSCUR,DELTA,EPSIL,SROMOD,ISR,
     1         NSONRT,MSNORT,NSCO,SROOTS)
   11    CONTINUE
         LL = J2-NSCUR
         L = 1
   12    L2 = J2+1-L
         IF (LL.EQ.0) SROOTS(1,L2) = 0.0
         IF (DABS(SROOTS(1,L2)).GE.2.0) GO TO 16
         W = SROOTS(1,L2)*ROMOD(JA)
         Q = ROMOD(JA)*ROMOD(JA)
         CALL TEST (A,SR,Q,NCUR,W,ROMOD(JA),EPSIL,K)
   16    CONTINUE
   23 CONTINUE
      RETURN
      END
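ROOTSQ carries out Graeffe root-squaring with each coefficient split into a REAL*8 mantissa and an integer exponent (the SCAL/ADD pair) so that the rapidly growing coefficients remain representable. Stripped of that bookkeeping, a single squaring step is the following sketch; the name GRAEFF and the plain REAL*8 interface are assumptions made here for illustration:

      SUBROUTINE GRAEFF (A,B,N)
C     ONE GRAEFFE ROOT-SQUARING STEP.  A(1),...,A(N+1) HOLD
C     THE COEFFICIENTS OF A DEGREE N POLYNOMIAL, HIGH ORDER
C     FIRST.  ON RETURN THE MAGNITUDES OF B(1),...,B(N+1) ARE
C     THOSE OF THE COEFFICIENTS OF THE POLYNOMIAL WHOSE ROOTS
C     ARE THE SQUARES OF THE ROOTS OF A; ONLY THE MAGNITUDES
C     ARE NEEDED TO ESTIMATE ROOT MODULI.
      IMPLICIT REAL*8(A-H,O-Z)
      DIMENSION A(1),B(1)
      N1 = N+1
      DO 30 J=1,N1
         S = A(J)*A(J)
         KM = MIN0(N1-J,J-1)
         IF (KM.EQ.0) GO TO 20
         DO 10 L=1,KM
            T = 2.0D0*A(J-L)*A(J+L)
            IF (MOD(L,2).EQ.1) T = -T
            S = S+T
   10    CONTINUE
   20    B(J) = S
   30 CONTINUE
      RETURN
      END

If the root moduli are well separated, then after M such steps the modulus of the J-th largest root is approximately DABS(B(J+1)/B(J))**(1.0D0/2.0D0**M); ratios of this kind supply the moduli from which REALRT and COMPRT recover the roots themselves.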
      SUBROUTINE TEST (A,B,Q,N,W,ROMOD,EPSIL,K)
C     SETS K = 1 IF THE CANDIDATE ROOT OF MODULUS ROMOD
C     PASSES THE ACCURACY TEST.
      DIMENSION A(51),B(3),T(2),E(2)
      REAL*8 A,B,E,DIF,Q,W,ROMOD,T,EPSIL
      T(1) = 0.0
      T(2) = 0.0
      N1 = N+1
      E(1) = ROMOD+2.0*EPSIL*ROMOD
      E(2) = ROMOD+EPSIL*ROMOD
      DO 7 I=1,N1
         T(1) = T(1)*E(1)+DABS(A(I))
         T(2) = T(2)*E(2)+DABS(A(I))
    7 CONTINUE
      DIF = T(1)-T(2)
      K = 0
      IF (Q.NE.0.0) GO TO 18
      IF (DIF.GE.DABS(B(3))) K = 1
      RETURN
   18 IF (DIF**2.GE.(Q*B(2)**2+W*B(2)*B(3)+B(3)**2)) K = 1
      RETURN
      END

      SUBROUTINE SUBRES (A,IA,N,SR,ISR,ROMOD)
C     FORMS A REDUCED POLYNOMIAL, SCALED BY THE ROOT MODULUS
C     ROMOD, FROM WHICH THE COMPLEX ROOTS OF THAT MODULUS
C     ARE FOUND.
      DIMENSION A(51),IA(51),SR(48),ISR(48),C(51),B(48,3)
      REAL*8 A,SR,ROMOD,T,C,B
      N1 = N+1
      T = 1.0
      DO 1 I=1,N1
         J = N1-I+1
         C(J) = A(J)*T
         T = T*ROMOD
    1 CONTINUE
      IF (N.LE.4) GO TO 16
      N2 = N-2
      DO 3 I=1,N2
         B(I,1) = 0.0
    3 CONTINUE
      B(1,2) = C(1)
      DO 8 I=2,N2
         B(1,3) = C(I)-B(1,1)
         DO 5 J=2,I
            B(J,3) = -B(J-1,2)-B(J,1)
    5    CONTINUE
         DO 7 J=1,I
            IF (J.NE.I) B(J,1) = B(J,2)
            B(J,2) = B(J,3)
    7    CONTINUE
    8 CONTINUE
      SR(N2) = C(N)-B(1,3)
      SR(N2-1) = -C(2)
      RETURN
   16 SR(1) = C(1)
      SR(2) = -C(2)
      IF (N.EQ.2) RETURN
      SR(3) = C(3)-C(1)-C(5)
      RETURN
      END

      SUBROUTINE RECON (ROOTS,A,NCUR,DEGREE)
C     (FRAGMENT)  DEFLATES THE COEFFICIENTS A BY THE ROOTS
C     ALREADY FOUND
      DO 1 I=1,NCUR
         A(I) = A(I)-XN
    1 CONTINUE
      RETURN
      END

C     (FRAGMENT OF A ZERO-FINDING ROUTINE)
      IC = 0
      ACBS = DABS(B-C)
      A = C
      FA = F(A)
      FB = F(B)
      FC = FA
      KOUNT = 2
      FX = DMAX1(DABS(FB),DABS(FC))
      IF (DABS(FC).GE.DABS(FB)) GO TO 2
C     INTERCHANGE B AND C SO THAT DABS(F(B)).LE.DABS(F(C))
      A = B
      FA = FB
      B = C
      FB = FC
      C = A
      FC = FA
    2 CMB = 0.5D0*(C-B)

BIBLIOGRAPHIC DATA SHEET

Report No.:      UIUCDCS-R-75-758
Title and Subtitle:  AN ITERATIVE METHOD FOR SOLVING NONSYMMETRIC
                 LINEAR SYSTEMS WITH DYNAMIC ESTIMATION OF PARAMETERS
Report Date:     October 1975
Author:          Thomas Albert Manteuffel
Performing Organization Name and Address:
                 Department of Computer Science
                 University of Illinois at Urbana-Champaign
                 Urbana, Illinois 61801
Contract/Grant Nos.:  NSF GJ-36393, NSF DCR74-23679
Sponsoring Organization Name and Address:
                 National Science Foundation
                 Washington, D.C.

Abstract:
The subject of this thesis is an iterative method for solving large,
sparse linear systems, based upon the scaled and translated Tchebychef
polynomials.  It is shown that such an iteration is optimal, in a
certain sense, over all polynomial based gradient methods.  Further, an
algorithm is developed by which the best scaling and translating
factors may be found from knowledge of the eigenvalues of the linear
system.  Since the eigenvalues of the linear system are seldom known, a
dynamic procedure is developed to improve the choice of scaling and
translating parameters from knowledge obtained during iteration.
Finally, experimental results are described which show this method to
be of potential value on a large class of problems, especially in
conjunction with factorization methods.

Key Words and Descriptors:  Linear, Iterative, Nonsymmetric, Dynamic,
Tchebychef

Security Class (This Report):  UNCLASSIFIED
Security Class (This Page):    UNCLASSIFIED