UIUCDCS-R-75-772
COO-2383-0027

On Newton-Iterative Methods for the Solution
of Systems of Nonlinear Equations

by

Andrew H. Sherman*

December 1975

Department of Computer Science
University of Illinois at Urbana-Champaign
Urbana, Illinois 61801

* This work was supported in part by the Office of Naval Research under grant N-0014-67-A-0097-0016 and by the U. S. Energy Research and Development Administration under grant US AEC AT(11-1) 2383.

1. Introduction

Let R be a subset of R^N and let F: R -> R^N be a nonlinear function. We wish to consider computing a solution x* of the system of nonlinear equations

    F(x) = 0,                                                    (1.1)

where the Jacobian matrix F'(x) = (∂F/∂x)(x) is sparse, symmetric, and positive definite. Although our results apply in general, nonlinear systems arising in the numerical solution of partial differential equations are of particular interest.

A very important method for the solution of (1.1) is Newton's method (cf. Ortega and Rheinboldt [11]). For a given initial guess x^(0), we generate a sequence of iterates x^(1), x^(2), x^(3), ... by

    x^(k+1) = x^(k) - δ^(k),                                     (1.2)

where δ^(k) is the solution to the sparse linear system

    F'(x^(k)) δ^(k) = F(x^(k)).                                  (1.3)

Under suitable assumptions on x^(0) and F, the sequence is well defined and converges quadratically to x*, i.e.,
there exists c < 1 such that ||x^(k+1) - x*|| <= c^(2^k) ||x^(k) - x*||.†,††

† Throughout this paper we use ||·|| to denote the Euclidean norm.
†† In the notation of Ortega and Rheinboldt [11], this condition defines R-quadratic convergence. This is slightly weaker than the more standard definition of Q-quadratic convergence, i.e., there exists c' < 1 such that ||x^(k+1) - x*|| <= c' ||x^(k) - x*||^2.

A straightforward implementation of Newton's method could use sparse symmetric Gaussian elimination (cf. Eisenstat, Schultz, and Sherman [8], Sherman [13]) to solve the linear systems which arise. Unfortunately, this would be inefficient computationally because at each step it would require the computation of F(x^(k)) and F'(x^(k)) and the exact solution of (1.3). Also, a large amount of storage would be required for the factorization of F'(x^(k)).

A number of modifications have been proposed to reduce some of the costs of Newton's method, notably the simplified Newton's or chord method and a variety of Newton-iterative methods (cf. Ortega and Rheinboldt [11]). For certain problems, these methods may require less storage or work than Newton's method. However, computational difficulties are typical in the solution of nonlinear systems, and each of these methods has its disadvantages. The choice of method for a specific nonlinear system (1.1) depends quite heavily on the properties of F and on the storage and time constraints imposed by the computational environment.

The simplified Newton's method replaces F'(x^(k)) with F'(x^(0)) in (1.3). This avoids the computation of F'(x^(k)) at each step and reduces the cost of solving (1.3) for k >= 1, since, if F'(x^(0)) is factored into the product U^T D U at the first step, each succeeding step only requires forward- and back-solving three triangular systems. However, the chord method converges only linearly† and requires the same amount of storage as Newton's method.

The Newton-iterative methods use a linear iterative method (such as SOR, cf.
Ortega and Rheinboldt [11], p. 215) to approximate the solution of (1.3) instead of solving it exactly. The sequence of iterates generated by such methods depends upon the particular iterative method chosen and the criteria used to stop the inner iteration. The advantages of Newton-iterative methods are that they require only a small amount of storage and that, when x^(k) is near x*, they can take advantage of the fact that zero is a good initial approximation to δ^(k). The problem with Newton-iterative methods is that their rate of convergence is essentially determined by the rate of convergence of the inner (linear) iterative method applied to the linearized problem (1.3). Thus they encounter all of the usual difficulties with linear iterative methods relating to parameter estimation, starting guesses, and stopping criteria.

† The iterates x^(k) converge linearly to x* if there exists c < 1 such that ||x^(k+1) - x*|| <= c ||x^(k) - x*||.

In this paper we present the Newton-Richardson methods (cf. Eisenstat, Schultz, and Sherman [7], Sherman [13]), a class of Newton-iterative methods which use a Richardson-D'Jakonov iteration (cf. Richardson [12], D'Jakonov [4]) to approximate the solution of (1.3) at each step. The Newton-Richardson methods require the same amount of storage as Newton's method or the chord method, but if 2^k steps of the inner iteration are taken at the k-th step, they will converge quadratically with less work per step (on average) than with Newton's method. And as compared to most other Newton-iterative methods, the Newton-Richardson methods have the advantage that their rate of convergence is essentially independent of N.

In the next section we describe a model semilinear partial differential equation which will be used to motivate the Newton-Richardson methods. In Section 3 we discuss the implementation of Newton's method and show how to derive Newton-Richardson methods for the model problem.
In Section 4, we examine the general theory of Newton-iterative methods and state several local convergence results. In Section 5 we consider the ramifications of the theory for the model problem, and finally, in the appendix we prove the results stated in Section 4.

2. A Model Problem

In this section we describe a model problem arising in the numerical solution of semilinear partial differential equations and show how to apply Newton's method to it. In the next section, we will motivate the Newton-Richardson methods by analyzing the costs of Newton's method and certain of its modifications for this model problem.

Let D be the unit square (0,1) x (0,1) and let ∂D denote the boundary of D. The model problem we wish to solve is the nonlinear Poisson equation

    -Δv = g(v)  in D,                                            (2.1)
     v = 0      on ∂D,

where g satisfies g_v(v) <= λ̄ < Λ (where Λ is the fundamental eigenvalue of -Δ on D), so that v(x,y) exists and is unique (cf. Ortega and Rheinboldt [11], pp. 110, 116).

To obtain an approximate solution to (2.1), the continuous problem must be reduced to a discrete form. Several techniques are available to do this, and we consider the use of five-point finite difference approximations. We superimpose on D a uniform square n x n mesh D_h, with boundary ∂D_h, and replace the differential operator -Δ in (2.1) with the standard five-point difference approximation on D_h. (Here h = 1/(n+1) is the vertical or horizontal distance between adjacent mesh points.) Letting V_ij be the approximation to v(ih, jh), we obtain the system of equations:

    -V_{i-1,j} - V_{i,j-1} - V_{i,j+1} - V_{i+1,j} + 4V_{ij} = h^2 g(V_{ij})   for (ih, jh) ∈ D_h,      (2.2)
     V_{ij} = 0                                                                for (ih, jh) ∈ ∂D_h.
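As a concrete illustration, the discrete system (2.2) can be assembled directly. The sketch below uses plain nested lists on an n x n interior mesh with zero boundary values; the nonlinearity g(v) = sin(v) is a hypothetical choice (it satisfies g'(v) <= 1 but is not taken from the paper).

```python
import math

# A sketch of the five-point discretization (2.2) on an n x n interior
# mesh.  g(v) = sin(v) is a made-up placeholder nonlinearity.

n = 3
h = 1.0 / (n + 1)

def g(v):
    return math.sin(v)

def apply_A(V):
    # Apply the five-point difference operator; off-mesh neighbors are
    # boundary points and contribute 0.
    def at(i, j):
        return V[i][j] if 0 <= i < n and 0 <= j < n else 0.0
    return [[4.0 * V[i][j] - at(i - 1, j) - at(i + 1, j)
             - at(i, j - 1) - at(i, j + 1)
             for j in range(n)] for i in range(n)]

def residual(V):
    # The nonlinear residual A V - G(V), with G(V)_{ij} = h^2 g(V_{ij}).
    AV = apply_A(V)
    return [[AV[i][j] - h * h * g(V[i][j]) for j in range(n)]
            for i in range(n)]

ones = [[1.0] * n for _ in range(n)]
A_ones = apply_A(ones)
```

Applying A to the all-ones grid gives 0 at mesh points whose four neighbors are all interior, and a positive value near the boundary, as the stencil predicts.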
Letting N = n^2 and ordering the mesh points with the natural, or row-by-row, ordering, we may write these equations more concisely as an N x N system of nonlinear equations

    AV = G(V),                                                   (2.3)

in which V = {V_ij: (ih, jh) ∈ D_h}, G(V) = {h^2 g(V_ij): (ih, jh) ∈ D_h}, and A is the block tridiagonal matrix

        [  B  -I          ]
    A = [ -I   B  -I      ]
        [      ·   ·   ·  ]
        [         -I   B  ]

where I is the n x n identity matrix and B is the n x n tridiagonal matrix

        [  4  -1          ]
    B = [ -1   4  -1      ]
        [      ·   ·   ·  ]
        [         -1   4  ]

Under the conditions imposed on (2.1), this system of equations has a unique solution V which is a close approximation to v(ih, jh) (cf. Ortega and Rheinboldt [11], p. 112).

Newton's method may be applied in a straightforward way to solve (2.3) for V. Given an initial guess V^(0), we generate a sequence of successive approximations V^(1), V^(2), V^(3), ... to V by

    V^(k+1) = V^(k) - δ^(k),

where δ^(k) is the solution of the system of linear equations

    J(V^(k)) δ^(k) = AV^(k) - G(V^(k))                           (2.4)

and J(V^(k)) = A - (∂G/∂V)(V^(k)). Under the assumptions made above, J(V^(k)) is an N x N sparse, symmetric, positive definite matrix having the same zero structure as A (i.e., (J(V^(k)))_ij ≠ 0 if and only if A_ij ≠ 0). If V^(0) is sufficiently close to V, then Newton's method converges quadratically to the solution of (2.3) independent of h (cf. Greenspan and Parter [10]), and only O(log log N) Newton iterations are required to reduce the initial error ||V^(0) - V|| by a factor of h^2 ≈ 1/N, the discretization error.†

† Throughout this paper we use log(x) to denote log_2(x).

3. Newton-Richardson Methods for the Model Problem

As we saw in the last section, each step of Newton's method for the model problem requires the solution of a sparse symmetric linear system. Thus Newton's method really amounts to the solution of a sequence of related linear systems, and its costs are directly related to the manner in which these systems are solved. In this section we consider a number of approaches to solving the systems (2.4), and we introduce Newton-Richardson methods for the model problem (cf.
Eisenstat, Schultz, and Sherman [7], Sherman [13]).

In the standard implementation of Newton's method for the model problem, the linear systems which arise can be solved with sparse symmetric Gaussian elimination (cf. Eisenstat, Schultz, and Sherman [7, 8], Sherman [13]). Each solution requires O(N log N) storage locations for the factors of J(V^(k)), O(N^(3/2)) arithmetic operations to factor J(V^(k)), and O(N log N) arithmetic operations to solve the resulting triangular systems (cf. Sherman [13]). For V^(0) sufficiently close to V, Newton's method converges quadratically, and the initial error ||V^(0) - V|| can be reduced by a factor of h^2 ≈ 1/N in O(log log N) steps at a total cost of O(N log N) storage locations and O(N^(3/2) log log N) arithmetic operations.

One of the reasons that Newton's method requires so many arithmetic operations is that it does not exploit the fact that the iterates V^(k) are converging quadratically to V. The sequence of systems (2.4) is solved as if the systems were unrelated, when, in fact, their solutions rapidly approach zero as V^(k) converges to V. To exploit this situation, we could use an iterative method to approximate the solution of (2.4) at each step. Most Newton-iterative methods require only O(N) storage locations for the model problem, and some require fewer arithmetic operations than Newton's method. Unfortunately, however, for the model problem the rates of convergence of most Newton-iterative methods depend on the mesh spacing h (see Section 5) in such a way that they have higher asymptotic arithmetic costs than Newton's method. For example, even though Newton-SOR converges linearly, the rate of convergence depends on h, and the method requires O(N^(1/2) log N) iterations to reduce the error by a factor of h^2 (see Section 5). If the rate of convergence were independent of h, then only O(log N) iterations would be required.
We now present the Newton-Richardson methods for the model problem. These methods require more storage than most other Newton-iterative methods (O(N log N) locations), but they converge at rates which are independent of h. The methods are based on the principles embodied in the Strongly Implicit Procedures developed by Stone [14], Dupont [6], Diamond [3], and others. The basic idea is to use sparse Gaussian elimination to obtain δ^(0) at the first step and to compute δ^(k) at succeeding steps with a Richardson-D'Jakonov iteration (cf. Richardson [12], D'Jakonov [4]) involving the factorization of J(V^(0)).

In particular, suppose that at the (k+1)-st step (k >= 1) we wish to solve the system

    J(V^(k)) δ^(k) = b^(k).                                      (3.1)

Since we have already factored J(V^(0)) at the first step, the system J(V^(0)) w = d can be solved with O(N log N) arithmetic operations. We can then solve (3.1) by using a Richardson-D'Jakonov iteration to generate a sequence of iterates δ_0^(k), δ_1^(k), δ_2^(k), ... which converges to δ^(k). (The idea here is similar to that of iterative improvement, cf. Forsythe and Moler [9], pp. 49-54.)† For δ_0^(k) = 0 and a fixed real constant γ, the Richardson-D'Jakonov iterates satisfy

    δ_{i+1}^(k) = δ_i^(k) - γ e_i^(k),

where e_i^(k) is the solution to the linear system

    J(V^(0)) e_i^(k) = J(V^(k)) δ_i^(k) - b^(k).

Each step of the iteration requires O(N log N) arithmetic operations, and the iterates will converge linearly to δ^(k) if

    ρ = ||I - γ J(V^(0))^{-1} J(V^(k))|| < 1

(cf. Young [15], p. 77). To maximize the rate of convergence, we choose γ to minimize ρ. If all the eigenvalues of J(V^(0))^{-1} J(V^(k)) lie in the interval [α, β], this is accomplished by choosing

    γ = 2/(α + β),

yielding ρ = (β - α)/(α + β).

† It is also possible to use a scaled conjugate gradient iteration here (cf. Douglas and Dupont [5]), but the analysis is more complicated, and, in any case, the results are improved by at most a constant factor.
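A scalar sketch of this inner iteration may help fix the notation: with diagonal matrices, the solve against the factored J(V^(0)) is a division, and the interval [α, β] and the optimal γ = 2/(α + β) can be read off directly. All numerical values below are made up for illustration.

```python
# A toy Richardson-D'Jakonov iteration for (3.1), with diagonal
# matrices so the "factorization" solve is trivial.  All data invented.

B_diag = [2.0, 3.0, 4.0]          # stands in for the factored J(V^(0))
J_diag = [2.2, 2.7, 4.4]          # stands in for J(V^(k))
b = [1.0, -2.0, 0.5]              # right-hand side b^(k)

# Eigenvalues of B^{-1} J, the optimal parameter gamma = 2/(alpha+beta),
# and the contraction factor rho = (beta-alpha)/(alpha+beta) < 1.
eigs = [J_diag[i] / B_diag[i] for i in range(3)]
alpha, beta = min(eigs), max(eigs)
gamma = 2.0 / (alpha + beta)
rho = (beta - alpha) / (alpha + beta)

delta = [0.0, 0.0, 0.0]           # delta_0 = 0
for _ in range(60):
    # e_i solves B e_i = J delta_i - b; then delta_{i+1} = delta_i - gamma e_i
    e = [(J_diag[i] * delta[i] - b[i]) / B_diag[i] for i in range(3)]
    delta = [delta[i] - gamma * e[i] for i in range(3)]

exact = [b[i] / J_diag[i] for i in range(3)]
```

For these values rho is about 0.1, so each sweep reduces the error by roughly a factor of ten and the iterates rapidly approach the exact solution of J delta = b.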
For the model problem, ρ can be made independent of h by choosing V^(0) near enough to V (see Section 5), so the Richardson-D'Jakonov iteration will converge linearly, independent of the mesh size.†

† It has been shown by several authors (cf. Young [15]) that the rate of convergence is increased by a constant factor by using Tchebychev acceleration to choose a sequence of parameters {γ_k} instead of the single parameter γ.

Different Newton-Richardson methods are obtained by varying the number of Richardson-D'Jakonov iterates computed at each step of the outer iteration. Of particular interest are the case in which 2^k iterates are computed at the k-th step and the case in which just one iterate is computed at each step. In Section 5 we will show that if V^(0) is sufficiently close to V, then the first of these methods converges quadratically to V, while the second, which reduces to a relaxed chord method, converges linearly. However, in either case the solution of the model problem requires a total of O(N(log N)^2) arithmetic operations in addition to the cost of the initial factorization of J(V^(0)) (which we treat as preprocessing).

4. Convergence Results

In this section we present an analysis of the local convergence properties of Newton-iterative methods. We will show that it is possible to adjust the rate of convergence of such methods by varying the number of inner iterations taken at each step and that by taking 2^k inner iterations at the k-th step, quadratic convergence is obtained. Since the Newton-Richardson methods described in the last section are a particular instance of Newton-iterative methods, the convergence results for them follow immediately. The proofs of the results described here are contained in the appendix.

We begin by restating our problem in a more precise and formal way. If R is a bounded subset of R^N and F: R -> R^N is a nonlinear function, we consider the solution of

    F(x) = 0,                                                    (4.1)

where we assume that a solution x* exists.
We also assume that there is an r_0 > 0 such that if S_0 = {x: ||x - x*|| < r_0}, then

    (i)   F is differentiable in S_0;
    (ii)  F'(x) = (∂F/∂x)(x) is nonsingular at x*;
    (iii) there is an L < +∞ such that for x ∈ S_0,
          ||F'(x) - F'(x*)|| <= L ||x - x*||.

It can be shown that the discretized model problem of Section 2 satisfies (i) - (iii) (cf. Greenspan and Parter [10]), so that we may directly apply the theory developed here. In the next section we will consider the model problem as a practical illustration of the theory.

From assumptions (i) - (iii), it follows that there is a 0 < r_1 <= r_0 such that F' is continuous and nonsingular in S_1 = {x: ||x - x*|| < r_1} (cf. Ortega and Rheinboldt [11], p. 46). Thus Newton's method can be used to solve (4.1) for x*. For a given x^(0) ∈ S_1, we generate the sequence of iterates x^(1), x^(2), x^(3), ... by

    x^(k+1) = x^(k) - δ^(k),                                     (4.2)

where

    F'(x^(k)) δ^(k) = F(x^(k)).                                  (4.3)

It can be shown that for some 0 < r_2 <= r_1, Newton's method converges quadratically to x* for x^(0) ∈ S_2 = {x: ||x - x*|| < r_2} (cf. Ortega and Rheinboldt [11], p. 312).

Newton-Richardson methods are a specific instance of general Newton-iterative methods in which a linear iterative method is used to approximate the solution of (4.3) at each step. A common way of generating Newton-iterative methods involves splitting F'(x) by writing it as

    F'(x) = B(x) - C(x),                                         (4.4)

where

    (iv) B(x) is continuous at x*, and
    (v)  B(x*) is nonsingular.

Furthermore, if the splitting (4.4) is to lead to a computationally efficient method, it must be possible to quickly solve the system

    B(x) w = d.                                                  (4.5)

It follows from (iv) - (v) that there is a 0 < r_3 <= r_2 such that B is continuous and nonsingular in S_3 = {x: ||x - x*|| < r_3} (cf. Ortega and Rheinboldt [11], p. 46); and if we let

    H(x) = B(x)^{-1} C(x) = I - B(x)^{-1} F'(x),

it then follows that H is continuous in S_3.
The Newton-Richardson methods are obtained with the splitting (4.4) in which B and C are defined by

    B(x) = (1/γ) F'(x^(0)),    C(x) = B(x) - F'(x),

where γ is a real constant satisfying 0 < γ < 2 and x^(0) ∈ S_3. Clearly this splitting satisfies (iv) and (v), and once F'(x^(0)) has been factored at the first step, the system (4.5) can be solved quickly.

At each step of a Newton-iterative method, we use the linear iteration to generate a sequence of inner iterates δ_0^(k), δ_1^(k), δ_2^(k), ... in which δ_0^(k) = 0 and

    δ_{i+1}^(k) = δ_i^(k) - e_i^(k),

where

    B(x^(k)) e_i^(k) = F'(x^(k)) δ_i^(k) - F(x^(k)).

For any positive integer m, let

    A_m(x) = (I + H(x) + H(x)^2 + ... + H(x)^{m-1}) B(x)^{-1}
           = (I - H(x)^m)(I - H(x))^{-1} B(x)^{-1}
           = (I - H(x)^m) F'(x)^{-1}.

Then, letting B_k = B(x^(k)) and H_k = H(x^(k)), we have

    δ_{i+1}^(k) = δ_i^(k) - B_k^{-1} (F'(x^(k)) δ_i^(k) - F(x^(k)))
                = H_k δ_i^(k) + B_k^{-1} F(x^(k))
                = H_k^{i+1} δ_0^(k) + (I + H_k + ... + H_k^i) B_k^{-1} F(x^(k))
                = (I + H_k + ... + H_k^i) B_k^{-1} F(x^(k))
                = A_{i+1}(x^(k)) F(x^(k)).

At the k-th Newton-iterative step, we define

    δ^(k) = δ_{m_k}^(k)

for some positive integer m_k. Letting

    G_m(x) = x - A_m(x) F(x),

we have

    x^(k+1) = G_{m_k}(x^(k)).                                    (4.6)

If ||H(x*)|| < 1, then under assumptions (i) - (iii) it can be shown that for some 0 < r_4 <= r_3, the outer iterates {x^(k)} defined by (4.6) exist and converge to x* for x^(0) ∈ S_4 = {x: ||x - x*|| < r_4} (cf. Ortega and Rheinboldt [11], pp. 350-351).†

† It actually suffices that the spectral radius of H(x*) satisfy ρ(H(x*)) < 1, but in order to avoid problems in relating norms to the spectral radius, we give all our results in terms of the Euclidean norm. (If H(x*) is symmetric, then of course we have ||H(x*)|| = ρ(H(x*)).)

The rate at which the Newton-iterative iterates converge to x* depends mainly on ||H(x*)|| and the sequence {m_k}.
To obtain our main convergence result, we make two further assumptions on the splitting (4.4):

    (vi)  for some constant λ, 0 < λ < 1, ||H(x*)|| <= λ;
    (vii) there are an L_1 < +∞ and a 0 < r_5 <= r_4 such that for
          x ∈ S_5 = {x: ||x - x*|| < r_5}, ||B(x) - B(x*)|| <= L_1 ||x - x*||.

We now state our main convergence result, deferring its proof to the appendix.

Theorem 4.1: Let F satisfy (i) - (iii) and let F'(x) = B(x) - C(x) be a splitting which satisfies (iv) - (vii). Let m_0, m_1, m_2, ... be a sequence of positive integers, and define

    m̄ = max [{1, m_0} ∪ {(m_k - Σ_{i=0}^{k-1} m_i): k = 1, 2, 3, ...}].

Then, if m̄ < +∞, there are a 0 < r <= r_5 (depending on λ, H, and F) and a positive constant c (depending on λ and m̄) with λ < c < 1 such that for x^(0) ∈ S = {x: ||x - x*|| < r}, the sequence of iterates defined by (4.6) satisfies

    ||x^(k+1) - x*|| <= c^{m_k} ||x^(k) - x*||.

While the restriction on m̄ in Theorem 4.1 may seem contrived, it simply reflects the fact that Newton-iterative methods cannot be expected to exhibit super-quadratic convergence. Hence choosing m_k > 2^k is pointless. If we choose m_k = 1 or m_k = 2^k, then m̄ = 1 in the theorem, and we have the following corollaries.

Corollary 4.2: Let F satisfy (i) - (iii) and let F'(x) = B(x) - C(x) be a splitting which satisfies (iv) - (vii). Define the sequence {m_k: k = 0, 1, 2, ...} by m_k = 1. Then there are a 0 < r <= r_5 (depending on λ, H, and F) and a positive constant c (depending on λ) with λ < c < 1 such that for x^(0) ∈ S = {x: ||x - x*|| < r}, the sequence of iterates defined by (4.6) converges linearly, i.e.,

    ||x^(k+1) - x*|| <= c ||x^(k) - x*||.

Corollary 4.3: Let F satisfy (i) - (iii) and let F'(x) = B(x) - C(x) be a splitting which satisfies (iv) - (vii). Define the sequence {m_k: k = 0, 1, 2, ...} by m_k = 2^k.
Then there are a 0 < r <= r_5 (depending on λ, H, and F) and a positive constant c (depending on λ) with λ < c < 1 such that for x^(0) ∈ S = {x: ||x - x*|| < r}, the sequence of iterates defined by (4.6) converges quadratically, i.e.,

    ||x^(k+1) - x*|| <= c^(2^k) ||x^(k) - x*||.

5. Ramifications of the Theory for the Model Problem

Corollary 4.2 expresses the previously known result that a Newton-iterative method converges linearly for m_k = 1 (cf. Ortega and Rheinboldt [11], p. 350). The real import of Theorem 4.1 is the rather surprising result of Corollary 4.3, i.e., that we can obtain R-quadratic convergence with a Newton-iterative method by choosing m_k = 2^k. Since Newton's method itself is only quadratically convergent (cf. Ortega and Rheinboldt [11], p. 312), this is the fastest convergence that can be expected for any Newton-iterative method.

A problem with the result of Theorem 4.1 is that c depends on λ: specifically, λ < c < 1. To reduce the initial error ||x^(0) - x*|| by a factor of K < 1 requires ℓ outer iterations, where

    K ≈ Π_{k=0}^{ℓ-1} c^{m_k}.                                   (5.1)

Taking logarithms, we obtain

    log K ≈ (Σ_{k=0}^{ℓ-1} m_k)(log c),

so that ℓ satisfies

    Σ_{k=0}^{ℓ-1} m_k ≈ log K / log c <= log K / log λ.

As an illustration, we consider the model problem of Section 2. For Newton-SOR methods it can be shown that even with the optimal SOR parameter ω, log λ ≈ -2πh (cf. Young [15], p. 190). Reducing the initial error by a factor of the discretization error h^2 requires ℓ iterations, where ℓ satisfies

    Σ_{k=0}^{ℓ-1} m_k ≈ log h^2 / log λ ≈ (-log N)/(-2πh) = O(N^(1/2) log N).

Thus if m_k = 2^k, we have

    Σ_{k=0}^{ℓ-1} m_k = 2^ℓ - 1 = O(N^(1/2) log N),

so that ℓ = O(log N). And if m_k = 1, then we have

    Σ_{k=0}^{ℓ-1} m_k = ℓ = O(N^(1/2) log N).

In either case, though, since each inner SOR iteration requires O(N) operations (cf. Young [15]), a total of O(N^(3/2) log N) iteration time is required, as compared to the O(N^(3/2) log log N) time required by Newton's method.
For Newton-Richardson methods, however, the situation is quite different. We have

    ||H(x*)|| = ||B(x*)^{-1} C(x*)||
              = ||γ F'(x^(0))^{-1} [(1/γ) F'(x^(0)) - F'(x*)]||
              = ||I - γ F'(x^(0))^{-1} F'(x*)||.

Since F'(x) is continuous in S_0, we may make ||H(x*)|| arbitrarily close to |1 - γ| by choosing ||x^(0) - x*|| small enough. In particular, we can always make

    ||H(x*)|| <= (1 + |1 - γ|)/2 < 1.

For the model problem, then, the constants λ and c of Theorem 4.1 are independent of h for Newton-Richardson methods. Thus to reduce the initial error by a factor of h^2 requires ℓ outer iterations, where

    Σ_{k=0}^{ℓ-1} m_k ≈ log h^2 / log c = -log N / log c

and log c is a constant independent of N. Hence if m_k = 2^k, we have

    Σ_{k=0}^{ℓ-1} m_k = 2^ℓ - 1 ≈ -log N / log c,

so that ℓ = O(log log N). And if m_k = 1, then we have

    Σ_{k=0}^{ℓ-1} m_k = ℓ ≈ -log N / log c = O(log N).

This leads to the following result on the cost of solving the model problem with Newton-Richardson methods (for a proof, see Eisenstat, Schultz, and Sherman [7], Sherman [13]).

Theorem 5.1: Consider the use of Newton-Richardson methods to solve the model semilinear problem presented in Section 2, and assume that V^(0) and γ are chosen so that Theorem 4.1 holds for c independent of h. Then, ignoring the costs of preprocessing and the factorization at the first step, we may reduce the initial error ||V^(0) - V|| by a factor of h^2 in O(N(log N)^2) iteration time by choosing either m_k = 2^k or m_k = 1.

From Theorems 4.1 and 5.1 we conclude that the choice of m_k for Newton-Richardson methods should be based on the relative costs of function and derivative evaluations. For the model problem, choosing m_k = 2^k requires O(log log N) function evaluations and O(log log N) derivative evaluations, while choosing m_k = 1 requires O(log N) function evaluations and only one derivative evaluation (i.e., F'(V^(0))). Since function evaluations and derivative evaluations are equally costly for the model problem ((∂G/∂V)(V^(k)) is a diagonal matrix), it seems best to choose m_k = 2^k.
However, we have observed experimentally that for other problems it may be more efficient computationally to choose m_k = 1 (cf. Sherman [13]). The reason for this is that derivative evaluations are often far more costly than function evaluations for problems more general than the model problem.

6. Conclusion

In this paper we have given a general analysis of the local rates of convergence of Newton-iterative methods for the solution of systems of nonlinear equations. In addition, we have presented the Newton-Richardson methods, a particular class of Newton-iterative methods which appear to have computational advantages over both Newton's method and other Newton-iterative methods for sparse nonlinear systems. In practice these methods have performed quite well (cf. Eisenstat, Schultz and Sherman [7], Sherman [13]), although they are by no means good methods for all problems. It is clear that the choice of a method for a particular nonlinear system depends heavily on the nonlinear operator involved and the context in which it arises.

In closing, it is worth pointing out that the ideas underlying the Newton-Richardson methods have important applications in many other settings. Already, a number of authors (e.g., Bank [1], Concus and Golub [2], Douglas and Dupont [5]) have used SIP-like iterations to solve linear systems arising in the numerical solution of partial differential equations, and in the future similar techniques will no doubt be applied to a wide variety of nonlinear and time-dependent problems. In fact, it would seem that the idea of solving difficult problems by iterating on the solutions of nearby, easily-solved problems is one which should prove fruitful whenever it can be applied in a natural way.

Appendix: Proofs of Convergence Results

In this appendix we shall prove Theorem 4.1.
We will make frequent use of the assumptions and results of Section 4, and the reader is advised to refer to that section, particularly for the statements of assumptions (i) - (vii). Much of our analysis follows that of Ortega and Rheinboldt [11], pp. 350-352.

Lemma A.1: Let F satisfy (i) - (iii) and let F'(x) = B(x) - C(x) be a splitting which satisfies (iv) - (vi). Then there is a 0 < r_6 <= r_4 such that for x ∈ S_6 = {x: ||x - x*|| < r_6},

    ||H(x)|| <= λ,

and for any positive integer m,

    ||H(x)^m - H(x*)^m|| <= m λ^{m-1} ||H(x) - H(x*)||.

Proof: Since H is continuous in S_4, we can choose r_6 <= r_4 so that ||H(x)|| <= λ for x ∈ S_6. The second part of the lemma follows by induction. It holds trivially for m = 1; and if it holds for m = k, then

    ||H(x)^{k+1} - H(x*)^{k+1}||
        = ||H(x)^k (H(x) - H(x*)) + (H(x)^k - H(x*)^k) H(x*)||
        <= λ^k ||H(x) - H(x*)|| + k λ^{k-1} ||H(x) - H(x*)|| ||H(x*)||
        <= (k+1) λ^k ||H(x) - H(x*)||.  □

Lemma A.2: Let F satisfy (i) - (iii) and let F'(x) = B(x) - C(x) be a splitting which satisfies (iv) - (vi). Then there are positive constants c_1 < +∞ and c_2 < +∞ and a 0 < r_7 <= r_6 such that for any positive integer m and x ∈ S_7 = {x: ||x - x*|| < r_7},

    ||G_m(x) - x*|| <= c_1 ||x - x*||^2 + c_2 ||H(x) - H(x*)|| ||x - x*|| + λ^m ||x - x*||.

Proof: From (iv) - (vi) and Lemma A.1, it follows that we may choose r_7
+ A m_1 )| * 1-A » we can bound the terms of (A.l) individually. From the mean value theorem we obtain ll-A (x)[F(x) - F(x*) - F'(x )(x - x )]|| < I A (x) I |x - x | sup |F»(x + t(x - x )) - F' (x ) I m~ " " _ , _ ~ ~~ ~ QStSl $ |A (x) I |x - x I sup L |t(x - x ) m ~ 0$t*l * 6L/(1 - A) i |x - x I | 2 . The second terra of (A.l) is bounded by 25 I |A m (x)[F'(x) - F«(x*)](x - x*)|| $ ll A m (x)| | ||F'(x) - F'(x*)| | ||x - x*| | $ PL/(1 - X) | |x - x | | 2 . Lemma A.l now yields | | [ A (x) F« (x) - A (x*) F' (x*) ] (x - x*) | | = | | (H(x*) m - H(x) m ) (x - x*j | | I ' m ~ ~ m~ ~ *• «• ~ ~ ~ ~ ' * mX* 1 " 1 | |H(x) - H(x*)| | ||x - x*||. Since X < 1 , we can choose c = c (X) < + °° so that the set {mX } is bounded by c . Letting c = c (X, r , 8, L) = 2BL/(l - X) and combining terms, we obtain ii * i i ii ^ I i ? 11/. /^>iiii * i i TTlii * i i G (x) - x S a. x - x r + cj H(x) - H(x x - x + X x - x . □ II m - - " 1 ' ~ ~ 2. - ~ i>i' r - 11 M - ~ ' ' Assumption (vii) is not the only restriction which could be used to prove Theorem U.l, but it is one which the Newton-Richardson methods satisfy trivially since B(x) is a constant function. From assumptions (iii) and (vii), it follows that C(x) also satisfies a Lipschitz condition. Thus, since we have already noted that B is continuous and nonsingular and B is uniformly bounded in S 7 , it follows that there are an L ] < + °° and a < r <: min(r,_, r„) such that for x e S Q = {x: x - x < r D } o 5 7 -.(j-ii~-i" H | |H(x) - H(x ) | | £ L | |x - x | |. Using this result, we now prove Theorem k.l. Proof of Theorem k.l : Let 8, c , c , and r be defined as in Lemma A. 2 and define c(t) = ((c x + c 2 L ] _)t + X 1 ) 1/m . We select r < r ft so that c(r) % ii (°) *n <(c - X + X ) X - X II. m. „ „ , , m But 0 -X* t (o 1 *e 2 L 1 ) ||x (J) -x*|| 2 + » J ||xo Signature c r Date December, 1975 FOR AEC USE ONLY 7. AEC CONTRACT ADMINISTRATOR'S COMMENTS, IF ANY, ON ABOVE ANNOUNCEMENT AND DISTRIBUTION RECOMMENDATION: 8. 
Abstract: In this paper we consider the local rates of convergence of Newton-iterative methods for the solution of systems of nonlinear equations. We show that under certain conditions on the inner, linear iterative method, Newton-iterative methods can be made to converge quadratically by computing a sufficient number of inner iterates at each step. As an example of this phenomenon, we examine the Newton-Richardson methods, which use an inner Richardson-D'Jakonov iteration.

Key words: Nonlinear equations; Newton-iterative methods; Newton-Richardson method; Iterative methods.