Local convergence analysis of the Gauss-Newton-Kurchatov method
Ioannis K. Argyros, Stepan Shakhno

TL;DR
This paper analyzes the local convergence of the Gauss-Newton-Kurchatov method for nonlinear least squares problems, improving convergence region and accuracy over previous results through refined estimates and weaker hypotheses.
Contribution
It provides an enhanced convergence analysis of the Gauss-Newton-Kurchatov method, extending the convergence region and improving solution accuracy under weaker assumptions.
Findings
Extended convergence region compared to previous results
Finer error estimates and solution localization
Numerical examples confirm theoretical improvements
Ioannis K. Argyros1, Stepan Shakhno2
1Department of Mathematics, Cameron University,
Lawton, USA, OK 73505;
2Department of Theory of Optimal Processes,
Ivan Franko National University of Lviv,
Lviv, Ukraine, 79000;
Abstract
**Abstract.** We present a local convergence analysis of the Gauss-Newton-Kurchatov method for solving nonlinear least squares problems with a decomposition of the operator. Instead of computing the full Jacobian, the method uses the sum of the derivative of the differentiable part of the operator and the divided difference of the nondifferentiable part. A theorem establishing the convergence conditions, the radius of convergence, and the convergence order of the proposed method was proved in (Shakhno 2017). However, the radius of convergence is small in general, which limits the choice of initial points. Using tighter estimates on the distances, under weaker hypotheses (Argyros et al. 2013), we provide an analysis of the Gauss-Newton-Kurchatov method with the following advantages over the corresponding results (Shakhno 2017): an extended convergence region; finer error estimates on the distances; and at least as precise information on the location of the solution. The numerical examples illustrate the theoretical results.
**Keywords:** Gauss-Newton-Kurchatov method, local convergence, Fréchet-derivative, Lipschitz / center-Lipschitz condition, convergence domain.
**AMS Classification:** 65F20, 65G99, 65H10, 49M15
1 Introduction
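Let us consider the problem of finding an approximate solution of the nonlinear least squares problem
$$\min_{x \in \mathbb{R}^p} f(x) := \frac{1}{2} F(x)^T F(x), \tag{1}$$
where the residual function $F: \mathbb{R}^p \to \mathbb{R}^m$, $m \ge p$, is nonlinear in $x$, $F$ is continuously differentiable, and $D$ is an open convex set in $\mathbb{R}^p$.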
A large number of problems in applied mathematics and engineering are solved by finding the solutions of problem (1). Examples include solving overdetermined systems of nonlinear equations, estimating parameters of physical processes from measurement results, and constructing nonlinear regression models for engineering problems and dynamic systems. The solution methods used are iterative: starting from one or several initial approximations, a sequence is constructed that converges to a solution of problem (1).
Problem (1) is usually solved by known methods of the Gauss-Newton type (Dennis et al. 1996; Ortega et al. 1970; Argyros 2008; Shakhno 2001), whose iterative formulas involve derivatives of the residual function. In practice, however, computing these derivatives can be difficult. In that case, we can use iterative-difference methods (Argyros 2008; Ren et al. 2010, 2011; Shakhno et al. 1999, 2005), which do not require the matrix of derivatives and are often not inferior to the Gauss-Newton method in order of convergence or number of iterations. Sometimes, however, the nonlinear function consists of a differentiable part and a non-differentiable part. Then a nonlinear least squares problem arises
$$\min_{x \in \mathbb{R}^p} f(x) := \frac{1}{2} \big(F(x) + G(x)\big)^T \big(F(x) + G(x)\big), \tag{2}$$
where the residual function $F + G: \mathbb{R}^p \to \mathbb{R}^m$, $m \ge p$, is nonlinear in $x$, $F$ is continuously differentiable, $G$ is a continuous function whose differentiability is, in general, not assumed, and $D$ is an open convex set in $\mathbb{R}^p$. Although iterative-difference methods can be applied to the nonlinear problem (2), it is also possible to construct iterative methods that take the decomposition of the residual function into account. For nonlinear equations, such methods (Shakhno et al. 2014, 2011; Shakhno 2016; Cătinaş 1994; Iakymchuk et al. 2016) were constructed as combinations of the Newton method (Dennis et al. 1996; Ortega et al. 1970; Argyros 2008; Deuflhard 2004) and the iterative-difference methods of chord (secant) and Kurchatov type (Dennis et al. 1996; Ortega et al. 1970; Shakhno 2006, 2007; Argyros 2008; Ren et al. 2010, 2011; Shakhno et al. 2005).
In the paper (Shakhno 2017), we proposed a method for solving the nonlinear least squares problem (2) with a non-differentiable operator, constructed on the basis of the Gauss-Newton method (Dennis et al. 1996; Ortega et al. 1970) and the Kurchatov type method (Shakhno et al. 2011, 2005; Ren 2011). We studied its local convergence under Lipschitz conditions and showed its effectiveness in comparison with other methods on test problems.
2 Preliminaries
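To solve problem (2), we consider the Gauss-Newton-Kurchatov method (Shakhno 2017):
$$x_{n+1} = x_n - (A_n^T A_n)^{-1} A_n^T \big(F(x_n) + G(x_n)\big), \quad A_n = F'(x_n) + G(2x_n - x_{n-1}, x_{n-1}), \quad n = 0, 1, \ldots, \tag{3}$$
where $F'(x_n)$ is the Jacobian matrix of $F$ at $x_n$; $G(2x_n - x_{n-1}, x_{n-1})$ is the first-order divided difference of the function $G$ (Ulm 1967) at the points $2x_n - x_{n-1}$ and $x_{n-1}$; and $x_0$, $x_{-1}$ are initial approximations. Method (3) is a combination of the Gauss-Newton method (Dennis et al. 1996; Ortega et al. 1970) and the Kurchatov type method (Shakhno et al. 2011, 2005; Ren 2011).
If $m = p$, method (3) reduces to the Newton-Kurchatov method for solving the nonlinear equation $F(x) + G(x) = 0$ (Shakhno et al. 2016, 2015; Hernández-Verón 2017; Iakymchuk et al. 2016):
$$x_{n+1} = x_n - \big[F'(x_n) + G(2x_n - x_{n-1}, x_{n-1})\big]^{-1} \big(F(x_n) + G(x_n)\big), \quad n = 0, 1, \ldots. \tag{4}$$
Setting $A_n = F'(x_n) + G(x_n, x_{n-1})$ in (3), we obtain a combination of the Gauss-Newton method (Dennis et al. 1996; Ortega et al. 1970) and the Secant type method (Ren et al. 2010; Shakhno et al. 2005) of the form (Shakhno et al. 2017)
$$x_{n+1} = x_n - (A_n^T A_n)^{-1} A_n^T \big(F(x_n) + G(x_n)\big), \quad A_n = F'(x_n) + G(x_n, x_{n-1}), \quad n = 0, 1, \ldots. \tag{5}$$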
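The following is a minimal Python sketch of how the Gauss-Newton-Kurchatov iteration might be implemented, not the authors' code. It assumes the update $x_{n+1} = x_n - (A_n^T A_n)^{-1} A_n^T (F(x_n) + G(x_n))$ with $A_n = F'(x_n) + G(2x_n - x_{n-1}, x_{n-1})$, and it assumes the usual componentwise definition of the first-order divided difference. All function names, the stopping rule, the small nudge used to avoid a zero denominator, and the test system `F_ex`, `G_ex` are illustrative assumptions and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Func = Callable[[Sequence[float]], Vector]


def divided_difference(G: Func, x: Sequence[float], y: Sequence[float]) -> Matrix:
    """First-order divided difference G(x, y) as an m-by-p matrix.

    Column j is (G(x_1..x_j, y_{j+1}..y_p) - G(x_1..x_{j-1}, y_j..y_p)) / (x_j - y_j),
    which gives G(x, y)(x - y) = G(x) - G(y) when all x_j differ from y_j.
    """
    # Assumption: nudge coinciding coordinates so that no denominator is zero.
    xs = [a if a != b else b + 1e-8 for a, b in zip(x, y)]
    cols = []
    for j in range(len(xs)):
        upper = G(xs[: j + 1] + list(y[j + 1:]))
        lower = G(xs[:j] + list(y[j:]))
        cols.append([(u - l) / (xs[j] - y[j]) for u, l in zip(upper, lower)])
    return [list(row) for row in zip(*cols)]  # turn columns into rows


def solve_normal_equations(A: Matrix, r: Vector) -> Vector:
    """Solve (A^T A) d = A^T r by Gaussian elimination with partial pivoting."""
    m, p = len(A), len(A[0])
    M = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(p)]
         + [sum(A[k][i] * r[k] for k in range(m))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(c + 1, p):
            f = M[i][c] / M[c][c]
            for j in range(c, p + 1):
                M[i][j] -= f * M[c][j]
    d = [0.0] * p
    for i in reversed(range(p)):
        d[i] = (M[i][p] - sum(M[i][j] * d[j] for j in range(i + 1, p))) / M[i][i]
    return d


def gauss_newton_kurchatov(F: Func, G: Func, jac_F: Callable[[Sequence[float]], Matrix],
                           x0: Sequence[float], x_prev: Sequence[float],
                           tol: float = 1e-10, max_iter: int = 50) -> Tuple[Vector, int]:
    """Iterate until the largest step component drops below tol (assumed stopping rule)."""
    x, xm = list(x0), list(x_prev)
    for n in range(1, max_iter + 1):
        z = [2 * a - b for a, b in zip(x, xm)]            # 2 x_n - x_{n-1}
        J, D = jac_F(x), divided_difference(G, z, xm)
        A = [[jij + dij for jij, dij in zip(jr, dr)] for jr, dr in zip(J, D)]
        r = [f + g for f, g in zip(F(x), G(x))]           # residual F(x_n) + G(x_n)
        d = solve_normal_equations(A, r)
        xm, x = x, [a - b for a, b in zip(x, d)]
        if max(abs(v) for v in d) < tol:
            return x, n
    return x, max_iter


# Hypothetical test system with a smooth part F and a non-differentiable part G.
def F_ex(x: Sequence[float]) -> Vector:
    u, v = x
    return [3 * u * u * v + v * v - 1, u ** 4 + u * v ** 3 - 1]


def jac_F_ex(x: Sequence[float]) -> Matrix:
    u, v = x
    return [[6 * u * v, 3 * u * u + 2 * v], [4 * u ** 3 + v ** 3, 3 * u * v * v]]


def G_ex(x: Sequence[float]) -> Vector:
    u, v = x
    return [abs(u - 1), abs(v)]


if __name__ == "__main__":
    sol, iters = gauss_newton_kurchatov(F_ex, G_ex, jac_F_ex, [0.9, 0.3], [0.9001, 0.3001])
    print(sol, iters)
```

Forming the normal equations explicitly keeps the sketch free of external libraries. For ill-conditioned $A_n$, a QR-based least squares solve would be the more robust choice.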
We need the following Lipschitz conditions.
Definition 2.1.
We say that the Fréchet derivative satisfies the center Lipschitz conditions on , if there exist such that for each
[TABLE]
where solves problem (2).
Definition 2.2.
We say that divided differences and satisfy the special Lipschitz conditions on and , if there exist and such that for each
[TABLE]
and
[TABLE]
Let and . Define function on by
[TABLE]
Suppose that equation has at least one positive solution. Denote by the smallest such solution. Set .
Definition 2.3.
We say that the Fréchet derivative satisfies the restricted special Lipschitz conditions on , if there exist such that for each
[TABLE]
Definition 2.4.
We say that divided differences and satisfy the special Lipschitz conditions on and , respectively, if there exist and such that for each
[TABLE]
and
[TABLE]
The following condition, together with (7) and (8), has been used instead of the preceding ones in the study of such iterative methods (Shakhno 2017).
Definition 2.5.
We say that the Fréchet derivative satisfies the Lipschitz conditions on , if there exist such that for each
[TABLE]
Let
3 Convergence analysis of the iterative process (3)
Next, we improve Theorem 1 (Shakhno et al. 2017).
Theorem 3.1.
Let function be continuous on the open subset , continuously differentiable in this domain, and let be a continuous function. Assume that the problem (1) has a solution in the domain and there exist the inverse operator and
[TABLE]
Estimates (6), (7), (8), (10), (11), (12) hold and given by (9) exists,
[TABLE]
[TABLE]
[TABLE]
where is unique positive zero of the function , given by
[TABLE]
Then for the iterative process (3) is well defined, the sequence , , generated by it, remains in the open subset , and converges to the solution . Moreover, the following error estimates hold for
[TABLE]
where
[TABLE]
[TABLE]
[TABLE]
[TABLE]
Proof. According to the intermediate value theorem on the function for a sufficiently large and by (15) has a positive zero denoted by . But for So, this root is the only one on .
By assumption . Then we have
[TABLE]
[TABLE]
So, .
Let’s denote . Let and we will get this estimate:
[TABLE]
Using (8), we get
[TABLE]
and
[TABLE]
We use inequalities (7), (20), (21):
[TABLE]
Then
[TABLE]
Then we obtain from the inequality (19) and the definition (16)
[TABLE]
By Banach’s theorem on the inverse operator (Ortega et al. 1970) there exists and we have from (24)
[TABLE]
[TABLE]
[TABLE]
Consequently, iterate is well defined.
Then let’s show that . Using equality
[TABLE]
we will get an estimate
[TABLE]
[TABLE]
[TABLE]
[TABLE]
Hence, taking into account (21), (23) and inequalities
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
we will get
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
where
[TABLE]
[TABLE]
Hence, and inequality (16) is true for .
Assume that for , and the estimate (17) for , where 1 is an integer, holds. Next we prove that , and the estimate (17) holds for .
Define
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
So, exists and
[TABLE]
[TABLE]
Therefore, the iteration is well defined, and we can get in turn
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
i.e. , and estimate (17) holds for
Consequently, the iterative process (3) is well defined, for all , and estimate (17) holds for all .
Next, we prove that for . Define functions and on by:
[TABLE]
where .
According to the choice , we have
[TABLE]
Using the estimate (17), the definition of constants , as well as the functions and , for , we obtain
[TABLE]
Similarly to (Ren et al. 2011), we prove that under the conditions (25), (26) the sequence for converges to .
First of all, for a real number and initial points there exists a real number such that , . Then all the above estimates for the sequence are valid, if replaced by . In particular, from (27) for , we get
[TABLE]
where , .
Clearly, we also have
[TABLE]
Define sequences , :
[TABLE]
We divide the two parts of inequality (28) into and obtain .
By definition of the sequence , we have
[TABLE]
For the sequence known explicit formulas
[TABLE]
where
[TABLE]
and
[TABLE]
Note that
[TABLE]
Taking into account (30) and (31), we conclude that as . Therefore, we conclude that as . $\square$
Remark 3.2.
If , and , our results specialize to the corresponding ones (Shakhno 2017). Otherwise they constitute an improvement. As an example let used in (Shakhno 2017) denote the functions and parameters, where are replaced by , respectively. Then, since , , , and since , we have , , , , , , so , and the new error bounds are tighter than the corresponding ones (23) (Shakhno 2017) .
Moreover, we have
[TABLE]
but not vice versa, unless if and .
Hence, the new sufficient convergence criteria for method (3) are weaker. These advantages are obtained under the same computational cost as (Shakhno 2017), since in practice the new constants are special cases of the previous ones.
Corollary 3.3.
In the case of zero residual, the convergence order of the iterative process (3) is quadratic.
If , we have a nonlinear least squares problem with zero residual in the solution. Then the constants and and (17) reduces to
[TABLE]
It follows from the inequality (32) that the order of convergence (3) is not higher than quadratic. Consequently, there exist a constant and a positive integer such that for all
[TABLE]
By
[TABLE]
we have
[TABLE]
and from (32) we have
[TABLE]
[TABLE]
Consequently, the convergence order of the iterative process (3) is quadratic.
As we see from the estimates (17) and (18), the convergence of the iterative process (3) essentially depends on the terms containing the values , , , and .
For problems with zero residual in the solution (), the quadratic convergence of the iterative process (3) is established.
For problems with a small residual in the solution ( – ”small”) and with weak nonlinearity (, , , and – ”small”), the convergence of the iterative process is linear. In the case of large residual ( – ”large”) or for strongly nonlinear problems (, , , and – ”large”), the iterative process (3) may not converge at all.
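The difference between quadratic and linear convergence described above can be checked numerically from a computed error sequence. The sketch below uses the standard computational-order estimate $\rho_n = \log(e_{n+1}/e_n) / \log(e_n/e_{n-1})$ with $e_n = \|x_n - x^*\|$. This estimate is a common diagnostic and is not part of the paper's analysis. The function name and the two error sequences are synthetic illustrations, not results from the paper.

```python
import math
from typing import List, Sequence


def estimated_orders(errors: Sequence[float]) -> List[float]:
    """Estimate the convergence order from consecutive errors e_n = ||x_n - x*||.

    Uses rho_n = log(e_{n+1} / e_n) / log(e_n / e_{n-1}); values near 2 suggest
    quadratic convergence, values near 1 suggest linear convergence.
    """
    orders = []
    for e_prev, e_curr, e_next in zip(errors, errors[1:], errors[2:]):
        if min(e_prev, e_curr, e_next) <= 0.0 or e_curr == e_prev:
            continue  # skip triples for which the logarithms are undefined
        orders.append(math.log(e_next / e_curr) / math.log(e_curr / e_prev))
    return orders


# Synthetic error sequences, made up for illustration only.
quadratic_errors = [1e-1, 1e-2, 1e-4, 1e-8]      # e_{n+1} = e_n^2
linear_errors = [1e-1, 5e-2, 2.5e-2, 1.25e-2]    # e_{n+1} = 0.5 * e_n

if __name__ == "__main__":
    print(estimated_orders(quadratic_errors))
    print(estimated_orders(linear_errors))
```

In practice the exact solution is unknown, so the errors are often replaced by the step lengths $\|x_{n+1} - x_n\|$, which gives only a rough indication of the order.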
4 Results of numerical experiment
On several test cases, we compare the convergence rates of the Gauss-Newton-Kurchatov method (3), the Gauss-Newton-Secant method (5), and the Secant-type difference method (Ren et al. 2010; Shakhno et al. 2005)
[TABLE]
and the Kurchatov-type difference method (Ren et al. 2011; Shakhno et al. 2005)
[TABLE]
We tested the methods on nonlinear systems with a non-differentiable operator, with zero and non-zero residual. The classical Gauss-Newton method and the Newton method cannot be applied to these problems.
Solution results are with accurate . The additional approximation was chosen as follows: . The calculations were carried out until the conditions were fulfilled
[TABLE]
with .
Example 1 (Shakhno et al. 2014; Argyros 2008; Cătinaş 1994):
[TABLE]
[TABLE]
Example 2:
[TABLE]
[TABLE]
Table 1 shows the results of a numerical experiment. In particular, the investigated methods are compared by the number of iterations performed to find a solution with a given accuracy.
Table 1. Number of iterations for solving the test problems
[TABLE]
5 Conclusions
It follows from the theoretical results, the practical calculations, and the comparison of the results obtained that the combined differential-difference methods (3) and (5) converge faster than the Kurchatov type method (34) and the Secant type method (33). As proved, in the case of zero residual, method (3) has quadratic order of convergence and does not require derivatives of the non-differentiable part of the operator. Thus, methods (3) and (5) are effective for solving nonlinear least squares problems with a non-differentiable operator.
References
- [1] Argyros, I. K.: Convergence and applications of Newton-type iterations. Springer-Verlag, New York (2008)
- [2] Argyros, I. K., Hilout, S.: On an improved convergence analysis of Newton’s method. J. Applied Math. Comp. 225, 372–386 (2013)
- [3] Argyros, I. K., Magrenan, A. A.: A contemporary study of iterative methods: Convergence, Dynamics and applications. Acad. Press, Elsevier, London (2018)
- [4] Cătinaş, E.: On some iterative methods for solving nonlinear equations. Revue d’Analyse Numér. Théor. de l’Appr. 23, 47–53 (1994)
- [5] Dennis, J. E. (Jr.), Schnabel, R. B.: Numerical methods for unconstrained optimization and nonlinear equations. SIAM, Philadelphia (1996)
- [6] Deuflhard, P.: Newton methods for nonlinear problems. Affine invariance and adaptive algorithms. Springer-Verlag, Berlin (2004)
- [7] Hernández-Verón, M. A., Rubio, M. J.: On the local convergence of Newton–Kurchatov-type method for non-differentiable operators. Appl. Math. Comp. 304, 1–9 (2017) https://doi.org/10.1016/j.amc.2017.01.010
- [8] Iakymchuk, R., Shakhno, S., Yarmola, H.: Combined Newton-Kurchatov method for solving nonlinear operator equations. Proc. Appl. Math. Mech. 16, 719–720 (2016) https://doi.org/10.1002/pamm.201610348
