A new alternating direction trust region method based on conic model for solving unconstrained optimization
Honglan Zhu, Qin Ni, Chuangyin Dang

TL;DR
This paper introduces a novel alternating direction trust region method based on a conic model for unconstrained optimization, improving solvability and efficiency, especially for large-scale problems.
Contribution
It proposes a new conic model trust region subproblem solved via an alternating direction approach, overcoming previous difficulties and establishing global convergence.
Findings
Outperforms the dogleg method in numerical experiments
Effective for large-scale unconstrained optimization problems
Demonstrates better solvability of the subproblem
Abstract
In this paper, a new alternating direction trust region method based on a conic model is used to solve unconstrained optimization problems. Using the alternating direction method, the new conic model trust region subproblem is solved in two steps along two orthogonal directions. This idea overcomes the main shortcoming of the conic model subproblem, namely that it is difficult to solve. The global convergence of the method is then established under some reasonable conditions. Numerical experiments show that this method may be better than the dogleg method for solving the subproblem, especially for large-scale problems.

Table 1: Test problems.

| No. | Problem | No. | Problem |
|---|---|---|---|
| 1 | Cube | 2 | Penalty-I |
| 3 | Beale | 4 | Conic |
| 5 | Extended Powell | 6 | Variably Dimensioned |
| 7 | Rosenbrock | 8 | Extended Trigonometric |
| 9 | Tridiagonal Exponential | 10 | Brent |
| 11 | Troesch | 12 | Cragg and Levy |
| 13 | Broyden Tridiagonal | 14 | Brown |
| 15 | Discrete Boundary Value | 16 | Extended Trigonometric |
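
Table 2: Numerical results of ADCTR ($n$: dimension; $n_f/n_g$: function/gradient evaluations; $f$: final objective value; $\lVert g\rVert$: final gradient norm).

| No. | $n$ | Iter | $n_f/n_g$ | $f$ | $\lVert g\rVert$ | CPU (s) |
|---|---|---|---|---|---|---|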
| 1 | 2 | 52 | 53/43 | 1.0377e-15 | 1.9477e-06 | 0.064681 |
| 2 | 2 | 10 | 11/11 | 9.0831e-06 | 8.9419e-06 | 0.048493 |
| 3 | 2 | 18 | 19/18 | 9.0379e-15 | 9.3925e-07 | 0.053525 |
| 4 | 2 | 16 | 17/13 | 1.1407e-12 | 2.1360e-06 | 0.050445 |
| 5 | 4 | 41 | 42/34 | 4.8648e-09 | 4.5887e-06 | 0.062011 |
| 6 | 4 | 32 | 33/29 | 2.3856e-14 | 3.0965e-07 | 0.066287 |
| 7 | 2 | 50 | 51/49 | 1.5486e-14 | 5.4101e-06 | 0.064227 |
| 8 | 4 | 47 | 48/34 | 7.9158e-15 | 4.1153e-07 | 0.076865 |
| 9 | 4 | 7 | 8/8 | 8.1577e-12 | 4.5905e-06 | 0.058505 |
| 10 | 4 | 81 | 82/58 | 5.8024e-18 | 4.6604e-07 | 0.089702 |
| 11 | 4 | 59 | 60/51 | 1.0955e-13 | 2.7230e-06 | 0.077290 |
| 12 | 4 | 48 | 49/43 | 1.1247e-08 | 5.2578e-06 | 0.068215 |
| 13 | 4 | 35 | 36/19 | 1.4498e-11 | 5.0442e-06 | 0.063276 |
| 14 | 2 | 91 | 92/52 | 0.1998e-06 | 2.5916e-07 | 0.089294 |
| 15 | 4 | 23 | 24/15 | 2.0042e-12 | 8.2898e-06 | 0.061544 |
| 16 | 4 | 14 | 15/15 | 3.0282e-04 | 4.9068e-06 | 0.048488 |

Table 3: Numerical comparison of DCTR and ADCTR ($n$: dimension; $n_f/n_g$: function/gradient evaluations; $\lVert g\rVert$: final gradient norm; *: no convergence within 5000 iterations).

| No. | $n$ | DCTR Iter | DCTR $n_f/n_g$ | DCTR $\lVert g\rVert$ | DCTR CPU (s) | ADCTR Iter | ADCTR $n_f/n_g$ | ADCTR $\lVert g\rVert$ | ADCTR CPU (s) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 20 | 746 | 747/517 | 9.2593e-07 | 0.116584 | 100 | 101/92 | 2.1475e-06 | 0.079679 |
| | 200 | 2387 | 2388/2023 | 2.6984e-08 | 2.377265 | 82 | 83/59 | 4.6090e-06 | 0.308822 |
| | 1000 | * | */* | * | * | 74 | 75/56 | 1.4596e-06 | 19.36706 |
| 2 | 200 | 76 | 77/53 | 3.1715e-06 | 0.138719 | 79 | 80/53 | 4.1422e-06 | 0.137110 |
| | 500 | 96 | 97/62 | 6.5292e-06 | 1.118043 | 78 | 79/54 | 4.6576e-06 | 1.270450 |
| | 1000 | 82 | 83/57 | 5.5181e-06 | 6.454441 | 86 | 87/57 | 8.8824e-06 | 8.026574 |
| 3 | 2 | 20 | 21/19 | 4.3711e-07 | 0.042443 | 18 | 19/18 | 9.3925e-07 | 0.053525 |
| | 20 | 24 | 25/19 | 2.4476e-07 | 0.042636 | 24 | 25/25 | 4.9198e-06 | 0.056501 |
| | 200 | 27 | 28/24 | 3.6411e-08 | 0.077384 | 26 | 27/27 | 2.0468e-07 | 0.145411 |
| | 2000 | 29 | 30/25 | 6.1805e-06 | 15.70090 | 35 | 36/28 | 8.8657e-06 | 54.47877 |
| 4 | 20 | 15 | 16/14 | 5.7802e-07 | 0.039310 | 16 | 17/13 | 4.8378e-09 | 0.055520 |
| | 200 | 16 | 17/12 | 3.7354e-06 | 0.051864 | 19 | 20/18 | 3.4444e-07 | 0.094543 |
| | 2000 | 18 | 19/19 | 3.9029e-07 | 13.50780 | 19 | 20/17 | 2.5377e-06 | 17.65469 |
| 5 | 40 | 121 | 122/104 | 8.7449e-06 | 0.064345 | 48 | 49/43 | 6.0181e-06 | 0.076145 |
| | 1000 | 121 | 122/116 | 2.3550e-06 | 14.01567 | 92 | 93/77 | 8.1189e-06 | 18.098231 |
| | 2000 | 121 | 122/116 | 2.6445e-06 | 106.0106 | 69 | 70/60 | 6.2005e-06 | 97.24934 |
| 6 | 40 | 120 | 121/75 | 6.0183e-06 | 0.125199 | 145 | 146/116 | 7.2779e-06 | 0.115281 |
| | 400 | * | */* | * | * | 1124 | 1125/774 | 3.2877e-06 | 11.33673 |
| 7 | 20 | 90 | 91/69 | 1.3138e-06 | 0.106181 | 83 | 84/54 | 1.0093e-06 | 0.082832 |
| | 200 | 517 | 518/392 | 4.5464e-06 | 0.926231 | 61 | 62/52 | 2.0797e-06 | 0.242663 |
| | 2000 | 326 | 327/294 | 2.2237e-06 | 218.2702 | 71 | 72/54 | 7.8519e-07 | 112.6556 |
| 8 | 4 | 46 | 47/38 | 9.2635e-06 | 0.054172 | 47 | 48/34 | 4.1153e-07 | 0.076865 |
| | 40 | * | */* | * | * | 354 | 355/265 | 9.1963e-06 | 0.147167 |
| 9 | 40 | 6 | 7/7 | 1.1958e-06 | 0.054117 | 6 | 7/7 | 1.8900e-06 | 0.060258 |
| | 400 | 6 | 7/7 | 1.6354e-07 | 0.111561 | 6 | 7/7 | 2.3374e-07 | 0.133484 |
| | 4000 | 11 | 12/12 | 8.4467e-07 | 40.15594 | 11 | 12/12 | 8.9545e-07 | 47.93941 |
| 10 | 4 | 377 | 378/298 | 8.2689e-06 | 0.175454 | 81 | 82/58 | 4.6604e-07 | 0.089702 |
| | 40 | * | */* | * | * | 1260 | 1261/910 | 5.7484e-06 | 0.391677 |
| 11 | 4 | 70 | 71/37 | 2.9831e-06 | 0.073789 | 59 | 60/51 | 2.7230e-06 | 0.077290 |
| | 40 | 192 | 193/133 | 4.2981e-06 | 0.108448 | 132 | 133/122 | 3.1390e-06 | 0.116485 |
| | 500 | * | */* | * | * | 1119 | 1120/1023 | 9.3082e-06 | 21.02191 |
| 12 | 4 | 43 | 44/41 | 4.4263e-06 | 0.062761 | 48 | 49/43 | 5.2578e-06 | 0.068215 |
| | 40 | 1977 | 1978/1315 | 8.1245e-06 | 0.369235 | 190 | 191/146 | 9.5513e-06 | 0.129097 |
| | 400 | * | */* | * | * | 351 | 352/252 | 8.4470e-06 | 4.848008 |
| 13 | 4 | 35 | 36/16 | 8.9785e-06 | 0.053999 | 35 | 36/19 | 5.0442e-06 | 0.063276 |
| | 40 | 359 | 360/263 | 9.3216e-06 | 0.140429 | 47 | 48/29 | 7.5584e-06 | 0.084719 |
| | 400 | 1996 | 1997/1400 | 9.7095e-06 | 14.38582 | 55 | 56/34 | 9.2260e-06 | 0.746511 |
| | 1000 | * | */* | * | * | 52 | 53/36 | 9.5547e-06 | 10.46032 |
| 14 | 2 | 98 | 99/59 | 5.6830e-06 | 0.058219 | 91 | 92/52 | 2.5916e-07 | 0.089294 |
| | 20 | 164 | 165/87 | 6.2362e-06 | 0.076377 | 125 | 126/98 | 9.9306e-06 | 0.094535 |
| | 200 | * | */* | * | * | 209 | 210/161 | 9.3905e-06 | 0.656179 |
| 15 | 4 | 27 | 28/16 | 4.2390e-07 | 0.063120 | 23 | 24/15 | 8.2898e-06 | 0.061544 |
| | 400 | 33 | 34/11 | 8.4956e-06 | 0.162408 | 35 | 36/15 | 7.7218e-06 | 0.202917 |
| | 1000 | 21 | 22/2 | 9.0840e-06 | 0.101637 | 21 | 22/2 | 9.0840e-06 | 0.160027 |
| | 4000 | 25 | 26/2 | 5.6751e-07 | 0.970886 | 25 | 26/2 | 5.6751e-07 | 2.331821 |
| 16 | 4 | 19 | 20/16 | 1.4241e-06 | 0.051494 | 14 | 15/15 | 4.9068e-06 | 0.048488 |
| | 40 | 518 | 519/329 | 7.5993e-06 | 0.117250 | 63 | 64/42 | 6.2298e-06 | 0.057708 |
| | 400 | * | */* | * | * | 60 | 61/48 | 4.6296e-06 | 0.571713 |
Honglan Zhu
Qin Ni
Chuangyin Dang
Department of Mathematics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, People’s Republic of China.
Business School, Huaiyin Institute of Technology, Huaian 223003, People’s Republic of China.
Department of Systems Engineering & Engineering Management, City University of Hong Kong, Kowloon, Hong Kong SAR.
Keywords: unconstrained optimization, conic model, trust region method, alternating direction method, global convergence
1 Introduction
In this paper, we consider the unconstrained optimization problem

$$\min_{x\in\mathbb{R}^n} f(x), \tag{1.1}$$

where $f:\mathbb{R}^n\to\mathbb{R}$ is continuously differentiable. Problem (1.1) has been studied by many researchers, including Han [1], Powell [2], Yuan and Sun [3], and Powell and Yuan [4]. Many methods exist for solving problem (1.1), and the trust region method is a very effective one (see [4, 5, 6, 7, 8, 9]). In addition, the book of Conn, Gould and Toint [10] gives an excellent and comprehensive treatment of trust region methods. Most optimization theory is based on the quadratic model, which is used to approximate $f$. That is, at the $k$th iteration, the subproblem

$$\min_{d\in\mathbb{R}^n}\ q_k(d)=g_k^T d+\tfrac{1}{2}d^T B_k d, \tag{1.2}$$

$$\text{s.t.}\quad \|d\|\le\Delta_k, \tag{1.3}$$

is solved to obtain a search direction $d_k$, where $x_k$ is the current iterate, $g_k=\nabla f(x_k)$, $B_k$ is symmetric and an approximation to the Hessian of $f$, $\|\cdot\|$ refers to the Euclidean norm, and $\Delta_k$ is the trust region radius at the $k$th iteration.
Many methods can be used to solve the subproblem (1.2)-(1.3). Simple, low-cost and effective choices are the dogleg methods, such as Powell's single dogleg method [11] and Dennis and Mei's double dogleg method [12]. Other scholars have also studied the dogleg method [13, 14, 15]. We now recall the simple dogleg algorithm for solving the trust region subproblem with the quadratic model.
Algorithm 1.1
Step 0. Input the data of the th iteration i.e., and .
Step 1. Compute . If , then , and stop.
Step 2. Compute . If , then , and stop. Otherwise, go to Step 3.
Step 3. Compute
[TABLE]
then , where .
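To illustrate the dogleg idea recalled above, here is a minimal Python sketch of the standard textbook dogleg step for a quadratic trust region subproblem, assuming the matrix `B` is symmetric positive definite. The function names (`dogleg`, `solve`, and the small vector helpers) and the ordering of the tests are assumptions of this sketch and need not match the exact statement of Algorithm 1.1.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def matvec(B, v):
    return [dot(row, v) for row in B]

def solve(B, rhs):
    """Solve B x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(B[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def dogleg(g, B, delta):
    """Approximately minimize g^T d + 0.5 d^T B d subject to ||d|| <= delta."""
    # Newton step: accepted if it lies inside the trust region.
    d_newton = solve(B, [-gi for gi in g])
    if norm(d_newton) <= delta:
        return d_newton
    # Cauchy step: minimizer of the model along the steepest descent direction.
    t = dot(g, g) / dot(g, matvec(B, g))
    d_cauchy = [-t * gi for gi in g]
    if norm(d_cauchy) >= delta:
        return [-delta * gi / norm(g) for gi in g]
    # Otherwise move from the Cauchy point toward the Newton point until the
    # boundary is hit: solve ||d_cauchy + lam * diff|| = delta for lam in (0, 1).
    diff = [dn - dc for dn, dc in zip(d_newton, d_cauchy)]
    a = dot(diff, diff)
    b = 2.0 * dot(d_cauchy, diff)
    c = dot(d_cauchy, d_cauchy) - delta ** 2
    lam = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return [dc + lam * df for dc, df in zip(d_cauchy, diff)]
```

With a large radius the sketch returns the Newton step, with a small radius it returns a scaled steepest descent step, and in between it returns a point on the dogleg path whose norm equals the radius.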
We note that the solution obtained by dogleg methods is only an approximate solution of (1.2)-(1.3). Moreover, practical experience shows that the quadratic model is not always effective. If the objective function is highly nonlinear and the iterate is far from the minimizer, the quadratic model may not approximate the original problem well, which can cause the iteration to proceed slowly.
In 1980, Davidon [16] proposed the conic model for unconstrained optimization as an alternative to the quadratic model. It has attracted wide attention from many authors in various areas [17, 18, 19, 20, 21, 22, 23]. A typical trust region subproblem with a conic model was first proposed by Di and Sun in [24] as follows.
[TABLE]
where $a_k\in\mathbb{R}^n$ is the horizon vector and $B_k$ is symmetric and positive semidefinite. In [25], Ni proposed a new trust region subproblem and gave the optimality conditions for trust region subproblems with a conic model. That is, at the $k$th iteration, the trial step is computed by solving the following conic model trust region subproblem
[TABLE]
where the constant in the constraints is a sufficiently small positive number. The subproblem (1.7)-(1.8) is more comprehensive than (1.5)-(1.6) and will not miss the solution of the original problem (1.1).
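For orientation, the following Python sketch evaluates a conic model in the standard form going back to Davidon, $\phi(d)=\frac{g^T d}{1-a^T d}+\frac{d^T B d}{2(1-a^T d)^2}$. This form, the function names, and the guard `horizon_ok` with its threshold `eps0` are assumptions of the sketch; the exact statements of (1.5)-(1.8) are not reproduced in this text. Setting the horizon vector `a` to zero recovers the quadratic model.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def conic_model(d, g, B, a):
    """Evaluate g^T d / (1 - a^T d) + 0.5 * d^T B d / (1 - a^T d)^2."""
    gamma = 1.0 - dot(a, d)
    if gamma == 0.0:
        raise ZeroDivisionError("step lies on the horizon 1 - a^T d = 0")
    Bd = [dot(row, d) for row in B]
    return dot(g, d) / gamma + 0.5 * dot(d, Bd) / gamma ** 2

def horizon_ok(d, a, eps0):
    """Hypothetical guard: keep the step away from the horizon of the model."""
    return abs(1.0 - dot(a, d)) >= eps0

g = [1.0, -2.0]
B = [[2.0, 0.0], [0.0, 4.0]]
d = [0.5, 0.25]
quadratic_value = conic_model(d, g, B, [0.0, 0.0])  # a = 0: quadratic model
conic_value = conic_model(d, g, B, [1.0, 0.0])      # nonzero horizon vector
```

In this small example the gradient term vanishes, so the nonzero horizon vector only rescales the curvature term by $1/(1-a^T d)^2$.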
Research has shown that the conic model is superior to the quadratic model to some extent, in particular for objective functions that oscillate strongly; in addition, the conic model provides enough freedom to make the best use of both gradient and function value information at the iterates. In view of these good properties of the conic model, we continue to study it.
It is noteworthy that the simple dogleg algorithm for solving trust region subproblem based on the conic model (DCTR) is similar to the above Algorithm 1.1, where
[TABLE]
However, the computation in DCTR is much more complicated (see [26, 27, 28]).
To find a simpler method that is better suited to the special structure of the conic model, we consider using the alternating direction method to solve the conic model subproblem. The alternating direction method (ADM) dates back to [29] and has been well studied for linearly constrained convex programming problems. Because of its efficiency and ease of implementation, ADM has attracted wide attention from many authors in various areas; see [30, 31, 32, 33, 34, 35].
In this paper, we combine the subproblem (1.7)-(1.8) with alternating direction search method to propose a new method for solving the conic trust region subproblem. The rest of this paper is organized as follows. In the next section, the motivation and description of the simple alternating direction search algorithm are presented. In Section 3, we give the quasi-Newton method based on the conic model for solving unconstrained optimization problems and prove its global convergence properties. The numerical results in Section 4 indicate that the algorithm is efficient and robust.
2 A simple alternating direction search method
The conic model in the subproblem (1.7)-(1.8) has one more parameter vector than the quadratic model, so it has more freedom and can take into account information about the function value at the previous iterate, which is useful for algorithms. Furthermore, the conic model carries richer interpolation information and can satisfy four interpolation conditions on the function values and gradient values at the current and previous points. Using this rich interpolation information may improve the performance of the algorithms. Generally, the choice of the parameters is a descent direction, such as , or (see [16, 17, 18, 27, 26]).
In view of the special importance of the parameter vector $a_k$ (the horizon vector), we consider the following alternating direction search method for solving the subproblem (1.7)-(1.8). The new method is divided into two steps: first we search along the direction parallel to $a_k$, and then along the direction perpendicular to $a_k$. For convenience, we omit the iteration index $k$ in this section.
In this paper, we assume that and is positive (abbreviated as ).
Let
[TABLE]
where and . Then, the solving process of subproblem (1.7)-(1.8) is divided into the following two stages.
In the first stage, we set and then . Substituting it into (1.7)-(1.8), we have
[TABLE]
where .
For the purpose of clarity, we denote
[TABLE]
Then,
[TABLE]
In the following, we consider three different cases of (2.2)-(2.3):
(1) If , then and (2.2)-(2.3) becomes
[TABLE]
(2) If , then and (2.2)-(2.3) becomes
[TABLE]
(3) If , then and (2.2)-(2.3) becomes
[TABLE]
Now, we discuss the stationary points of . By the direct computation, we have that the derivative of is
[TABLE]
where
[TABLE]
From (2.4), we know that and then from (2.5) . Therefore, if then has only one stationary point
[TABLE]
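The one-dimensional first stage can be checked numerically. The sketch below assumes the standard conic form and the parameterization $d=\tau a$, and uses its own symbols $\alpha=a^Ta$, $\beta=g^Ta$, $\gamma=a^TBa$; the paper's quantities in (2.4) and (2.16)-(2.17) are not recoverable here and may be scaled differently. Under these assumptions $\phi(\tau)=\frac{\beta\tau}{1-\alpha\tau}+\frac{\gamma\tau^2}{2(1-\alpha\tau)^2}$, its derivative is $\frac{\beta+(\gamma-\alpha\beta)\tau}{(1-\alpha\tau)^3}$, and the unique stationary point is $\tau^*=-\beta/(\gamma-\alpha\beta)$ whenever $\gamma\neq\alpha\beta$.

```python
def restricted_conic(tau, alpha, beta, gam):
    """Conic model restricted to the line d = tau * a (sketch notation)."""
    h = 1.0 - alpha * tau
    return beta * tau / h + 0.5 * gam * tau ** 2 / h ** 2

def restricted_conic_derivative(tau, alpha, beta, gam):
    """Closed-form derivative of restricted_conic with respect to tau."""
    h = 1.0 - alpha * tau
    return (beta + (gam - alpha * beta) * tau) / h ** 3

def stationary_point(alpha, beta, gam):
    """Unique zero of the derivative, or None if gam == alpha * beta."""
    denom = gam - alpha * beta
    if denom == 0.0:
        return None
    return -beta / denom

alpha, beta, gam = 1.0, -1.0, 2.0
tau_star = stationary_point(alpha, beta, gam)
value_at_tau_star = restricted_conic(tau_star, alpha, beta, gam)
```

At the stationary point the model value simplifies to $-\beta^2/(2\gamma)$, which the example reproduces. Whether this point is the constrained minimizer still depends on the sign conditions and the trust region bounds treated in the lemma and theorems that follow.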
**Lemma 2.1.**
(1) If then and is monotonically decreasing in the trust region ; is monotonically increasing for and .
(2) If , then and is monotonically increasing for ; is monotonically decreasing for .
(3) If , then and is monotonically increasing in the trust region ; is monotonically decreasing for and .
Proof.
From (2.4) and (2.17), we know that if then
[TABLE]
Then, since , combining with (2.15) we can obtain that the lemma obviously holds. ∎
**Theorem 2.1.**
If then the optimal solution of the subproblem (P1), (P2) and (P3) is
[TABLE]
Proof.
If then from (2.2) we have
[TABLE]
Hence, the theorem holds. ∎
**Theorem 2.2.**
If , then the optimal solution of the subproblem (P1) is
[TABLE]
Proof.
For the subproblem (P1), we know that where .
(1) If , then from Lemma 2.1 (1)(2) we can easily obtain .
(2) If , then . From Lemma 2.1 (3), we can obtain that if then ; If , then . Therefore, .
(3) If , then . From Lemma 2.1 (3), we can obtain that if then ; If , then . Therefore, . ∎
**Theorem 2.3.**
If , then the optimal solution of the subproblem (P2) is
[TABLE]
Proof.
The proof is similar to that of Theorem 2.2 above, so we omit it. ∎
**Theorem 2.4.**
If , then , and the optimal solution of the subproblem (P3) is
[TABLE]
where
[TABLE]
Proof.
For the subproblem (P3), we know that
[TABLE]
where . If , then . And from Lemma 2.1 (1) we can easily obtain that and
[TABLE]
(1) If , then from (2.2) we have
[TABLE]
where
[TABLE]
Because , then from (2.4) and (2.17) we have
[TABLE]
And then
[TABLE]
Combining with (2.27), then
[TABLE]
Hence, .
(2) If , then from (2.2) we have
[TABLE]
Because , then
[TABLE]
Therefore, . The theorem is proved. ∎
**Theorem 2.5.**
If and , then the optimal solution of the subproblem (P3) is
[TABLE]
Proof.
(1) If then . Combining (2.25) and Lemma 2.1 (2), we know that
[TABLE]
However, by calculation we have
[TABLE]
For and , then
[TABLE]
Hence, and (2.36) holds.
(2) If then . Combining (2.25) and Lemma 2.1 (3), we know that the optimal solution of the subproblem (P3) is
[TABLE]
For , then from (2.39) we note that
[TABLE]
If , then from Lemma 2.1 (3) we know that
[TABLE]
Thus,
[TABLE]
Then, (2.36) holds.
(3) If , , then from (2.17) and (2.18) we can get . Combining (2.25) and Lemma 2.1 (3), we know that the optimal solution of the subproblem (P3) is
[TABLE]
For the subproblem (P3), we note that . Because of , then
[TABLE]
However, from and Lemma 2.1 (3) we can obtain that if then ; If then holds too. Therefore, it follows that
[TABLE]
Then, (2.36) holds too and the theorem is proved. ∎
If , then from (2.4) we know that . Therefore, for this case we set and exit the calculation of subproblem. Otherwise, we know that is inside the trust region. Then, we should carry out the calculation of the second stage below.
We set and substitute it into . And then the subproblem (1.7)-(1.8) becomes
[TABLE]
where
[TABLE]
In order to remove the equality constraint in (2.47), we use the null space technique. That is, for then there exist mutually orthogonal unit vectors orthogonal to the parameter vector . Set and , where . Then (2.46)-(2.47) can be simplified as following subproblem
[TABLE]
where
[TABLE]
Set , and . By Algorithm 1.1, we can obtain the solution of the subproblem (2.49)-(2.50). Then and . Thus, the subproblem (1.7)-(1.8) is solved approximately.
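The null space technique needs an orthonormal basis of the subspace orthogonal to the parameter vector. One standard way to obtain it is a Householder reflection, sketched below in Python. The function name `null_space_basis` and the choice of a Householder construction are assumptions of this sketch; the source does not say how the basis vectors are computed, and the reduced quantities defined in (2.51) are not reproduced here.

```python
import math

def null_space_basis(a):
    """Return n-1 orthonormal vectors, each orthogonal to the nonzero vector a.

    Uses the Householder matrix H = I - 2 v v^T / (v^T v) with
    v = a + sign(a[0]) * ||a|| * e_1. H maps a to a multiple of e_1, so the
    columns of H other than the first are orthogonal to a.
    """
    n = len(a)
    norm_a = math.sqrt(sum(x * x for x in a))
    if norm_a == 0.0:
        raise ValueError("a must be nonzero")
    sign = 1.0 if a[0] >= 0.0 else -1.0
    v = list(a)
    v[0] += sign * norm_a
    vv = sum(x * x for x in v)
    basis = []
    for j in range(1, n):
        # j-th column of H (H is symmetric, so rows and columns coincide).
        col = [(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vv for i in range(n)]
        basis.append(col)
    return basis

a = [3.0, 4.0, 0.0]
Q = null_space_basis(a)  # two orthonormal vectors orthogonal to a
```

Writing the second-stage step as a combination of these basis vectors removes the orthogonality constraint, which is what allows the reduced problem to be handled by a dogleg solver in one dimension less.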
We can now state the alternating direction search method for solving the conic trust region subproblem (1.7)-(1.8) as follows.
Algorithm 2.1
Given and .
Step 1. If , then . Set and use Algorithm 1.1 to get , stop.
Step 2. Compute and by (2.4), (2.16) and (2.17).
Step 3. Compute .
Step 4. Solve the subproblem (2.2)-(2.3).
Step 4.1. If , then calculate by (2.21); If , then calculate by (2.22);
Otherwise, go to Step 4.2.
Step 4.2. If then calculate by (2.23); if then calculate by (2.36).
Step 5. If , then , and stop. Otherwise, compute , , and by (2.48) and (2.51).
Step 6. Set , and . Then solve the subproblem (2.49)-(2.50) by Algorithm 1.1 to get .
Step 7. Set and , and stop.
In order to discuss the lower bound of predicted reduction in each iteration, we define the following predicted reduction.
[TABLE]
Now we should prove the following theorem to guarantee the global convergence of the algorithm proposed in the next section.
**Theorem 2.6.**
Under the same conditions as in Lemma 2.1, if are obtained by Step 5 in Algorithm 2.1, then there exists a positive constant such that
[TABLE]
Proof.
(1) If , then we know that . By computation, we have
[TABLE]
where is generated in two cases as defined in (2.21) and (2.23). In both cases, we can find and
[TABLE]
Then
[TABLE]
(1a) For , then from (2.21) we know that and . Combining with (2.55) and (2.56) , we have
[TABLE]
where
[TABLE]
(1b) For , then from (2.23) we know that and . Because of and , then from (2.56) we also have (2.57) holds.
(2) If , then
[TABLE]
where is generated in the following three cases as defined in (2.21)-(2.23) and (2.36).
(2a) For , then .
From (2.21), we know that if then . Thus,
[TABLE]
And then, from (2) we know
[TABLE]
On the other hand, if then . Then from (2.4) and (2.17) we have (2.60) holds too. It follows that (2) holds.
(2b) For , then .
Combining with (2.22), we can prove that (2.60) holds by the same way and
[TABLE]
(2c) For , then .
From (2.23), we know that if , then
[TABLE]
By the definition of in the (2.52), we get
[TABLE]
Combining with the proof of the above case (1a) in this theorem, we have
[TABLE]
Therefore, the theorem follows from (2.57) and (2)-(2) with
[TABLE]
∎
**Theorem 2.7.**
Under the same conditions as in Lemma 2.1, if is obtained from Algorithm 2.1 above, then there exists a positive constant such that
[TABLE]
Proof.
(1) If is obtained by Algorithm 1.1, then from Nocedal and Wright [36] we have
[TABLE]
where .
(2) If , then (2.54) holds.
(3) , where . Combining with (2.52) and (2.53), we have
[TABLE]
Because of is obtained by Algorithm 1.1, then from [36] we have
[TABLE]
where , , and as defined by (2.48) and (2.51). Thus,
[TABLE]
where can be or .
(3a) If , then from (2.68) we have
[TABLE]
where the second equality is from (2.17) and the last equality is from (2.58).
(3b) If , then
[TABLE]
From (2.21)-(2.23) and (2.36), we know that and . For , then we have
[TABLE]
and
[TABLE]
where .
(3c) If , then
[TABLE]
From (2.21)-(2.23) and (2.36), we know that , and . For , then we have
[TABLE]
and
[TABLE]
Therefore, the theorem follows from (2.54), (2.66) and (2)-(2) with
[TABLE]
∎
3 The algorithm and its convergence
In this section, we propose a quasi-Newton method with a conic model for unconstrained minimization and prove its convergence under some reasonable conditions. In order to solve the problem (1.1), we approximate with a conic model of the form
[TABLE]
where , and are parameter vectors.
The choice of the parameters and in (3.1) can refer to [16, 17, 18, 26, 27] and [37, 38] respectively. We set
[TABLE]
[TABLE]
If , then
[TABLE]
otherwise, . In the updating process, we compute
[TABLE]
[TABLE]
where
[TABLE]
[TABLE]
and .
Let be the solution of the subproblem (1.7)-(1.8) by Algorithm 2.1. Then either is accepted as a new iteration point or the trust region radius is reduced according to a comparison between the actual reduction of the objective function
[TABLE]
and the reduction predicted by the conic model
[TABLE]
That is, if the reduction in the objective function is satisfactory, then we finish the current iteration by taking
[TABLE]
and adjusting the trust-region radius; otherwise the iteration is repeated at point with a reduced trust-region radius.
Now we give the alternating direction trust-region algorithm based on conic model (3.1).
**Algorithm 3.1** (ADCTR).
Step 0. Choose parameters , , and ; give a starting point , , and an initial trust region radius ; set .
Step 1. Compute and . If , then stop with as the approximate optimal solution; otherwise go to Step 2.
Step 2. Set , , and . Then solve the subproblem (1.7)-(1.8) by Algorithm 2.1 to get an approximate solution .
Step 3. Compute , and
[TABLE]
If , then set , and go to Step 2. If , then set and
[TABLE]
Step 4. Generate and ; set , and go to Step 1.
In this algorithm, the procedure "Step 2-Step 3-Step 2" is called the inner cycle. The following theorem guarantees that the ADCTR algorithm does not cycle infinitely in the inner cycle.
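The acceptance test and radius update in Step 3 follow the usual trust region pattern, which can be sketched in Python as below. The thresholds `eta1` and `eta2`, the factors `shrink` and `grow`, and the cap `delta_max` are hypothetical values chosen for illustration; the parameter values and the exact update formula used by the authors are not recoverable from this text.

```python
def trust_region_update(ared, pred, delta,
                        eta1=0.25, eta2=0.75,
                        shrink=0.5, grow=2.0, delta_max=10.0):
    """Return (accepted, new_delta) from actual and predicted reductions.

    ared: actual reduction f(x_k) - f(x_k + d_k)
    pred: reduction predicted by the model, assumed positive
    """
    if pred <= 0.0:
        raise ValueError("predicted reduction must be positive")
    rho = ared / pred
    if rho < eta1:
        # Poor agreement: reject the step and retry with a smaller radius
        # (this is one pass through the inner cycle).
        return False, shrink * delta
    if rho >= eta2:
        # Very good agreement: accept the step and enlarge the radius.
        return True, min(grow * delta, delta_max)
    # Acceptable agreement: accept the step and keep the radius.
    return True, delta
```

Each rejected step shrinks the radius by a fixed factor, which is the mechanism behind the finite termination of the inner cycle established next.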
**Assumption 3.1.** The level set
[TABLE]
and the sequence , and are all uniformly bounded, is symmetric and positive definite and is twice continuously differentiable in .
From (3.10) and Theorem 2.7, we have
[TABLE]
where as defined by (2.72).
**Theorem 3.1.**
Suppose that Assumption 3.1 holds. is the solution of conic trust-region subproblem (1.7)-(1.8). If the process does not terminate at , then we must have after a finite number of inner iterations.
Proof.
We assume that the algorithm does not terminate at , then there is such that
[TABLE]
From Assumption 3.1 we have
[TABLE]
For simplicity, we suppose that the superscript denotes the iterative step of inner iteration at , then
[TABLE]
Assume is a solution of subproblem (1.7)-(1.8) with trust-region radius , then it is easy to know that
[TABLE]
From (3.14), (3.15) and (3.17), we can obtain that there exist an integer and a constant such that
[TABLE]
It follows from (3.16) that
[TABLE]
On the other hand, from (3.17) and (3.15) we can get
[TABLE]
And then, from (3.15)-(3.20) we have
[TABLE]
where and . Combining with (3.18) and (3.22), we can get that
[TABLE]
holds for all . By (3.17) and (3.23),
[TABLE]
holds for all sufficiently large , which contradicts (3.19). This completes the proof. ∎
In the following we give the global convergence property of Algorithm 3.1.
**Theorem 3.2.**
Suppose that Assumption 3.1 holds. Then for any , Algorithm 3.1 terminates in a finite number of iterations, that is,
[TABLE]
Proof.
We give the proof by contradiction. Suppose that there is such that
[TABLE]
Combining with (3.13), (3.15) and (3.25), we have
[TABLE]
where the first inequality of (3.26) follows from
[TABLE]
and the second inequality is from and
[TABLE]
From Step 3 of Algorithm 3.1 and (3.26), we obtain that for all
[TABLE]
Since is bounded from below and , we have
[TABLE]
Combining with Theorem 3.1, we know that
[TABLE]
which implies that
[TABLE]
On the other hand, similar to the proof of (3.20)-(3.24) we can obtain
[TABLE]
where is sufficiently large. From Step 3 of Algorithm 3.1, it follows that
[TABLE]
which is a contradiction to (3.30). The theorem is proved. ∎
4 Numerical Tests
In this section, algorithm ADCTR is tested on some standard test problems from [26, 40]. Since the purpose of this paper is to propose a new method, the alternating direction method, for solving the conic trust region subproblem, we ran ADCTR on a limited number of test problems. The names of the 16 test problems are listed in Table 1.
All computations are carried out in Matlab R2015b on a microcomputer in double precision arithmetic. These tests use the same stopping criterion . The columns in the tables have the following meanings: No. denotes the number of the test problem; $n$ is the dimension of the test problem; Iter is the number of iterations; $n_f$ is the number of function evaluations performed; $n_g$ is the number of gradient evaluations; $f$ is the final objective function value; $\lVert g\rVert$ is the Euclidean norm of the final gradient; CPU (s) denotes the total iteration time of the algorithm in seconds. The sign * means that the algorithm failed to stop within 5000 iterations. The parameters in these algorithms are
[TABLE]
The numerical results of algorithm ADCTR for the 16 unconstrained optimization problems are listed in Table 2. We note that the optimal value of these test problems is . From Table 2, we can see that the algorithm reaches the minimum value of each function after a finite number of iterations, and the corresponding minimum point is a stationary point, which is also the optimal solution. Therefore, ADCTR is feasible and effective.
To analyze the effectiveness of the new algorithm, we compare ADCTR with the conic quasi-Newton trust region algorithm in which the subproblems are solved by the dogleg method (DCTR); see Zhu [26] and Lu [27]. With the dimension of each test problem ranging from 2 to 4000, we carried out 48 numerical comparison experiments, whose results are listed in Table 3. From these results we draw the following conclusions: of the 16 problems, ADCTR is better than DCTR on 12, somewhat worse on 2, and equally efficient on the other 2; the algorithm in which the subproblems are solved by the alternating direction method is competitive with the DCTR algorithm in [26]. For large-scale problems in particular, the new algorithm shows strong numerical stability.
5 Conclusions
In this paper, we propose an alternating direction trust region method based on the conic model for unconstrained optimization and investigate its convergence. Conic models approximate objective functions more flexibly and have stronger modeling properties than quadratic models. The alternating direction method (ADM) has been well studied in the context of linearly constrained convex programming problems. Because of its efficiency and ease of implementation, we apply ADM to the trust region subproblem based on the conic model. Initial numerical results show that the new method is competitive and that it is also effective and robust for large-scale problems. The numerical and theoretical results lead us to believe that the method is worthy of further study.
The main purpose of this paper is to explore a new method for solving the conic model subproblem, so many aspects remain open to improvement and further research. For example, one can consider the weaker convergence assumption that the Hessian approximations are only symmetric and positive semidefinite. The rate of convergence has also not been studied.
Acknowledgements
We are grateful to the editors and referees for their suggestions and comments. This work was supported by National Natural Science Foundation of China (11771210) and the Natural Science Foundation of Jiangsu Province (BK20141409).
References
- [1]
S. P. Han. A globally convergent method for nonlinear programming, Journal of Optimization Theory and Applications, 1977, 22(3):297–309.
- [2]
M. J. D. Powell. Variable Metric Methods for Constrained Optimization, Springer Berlin Heidelberg, 1983.
- [3]
Y. X. Yuan, W. Y. Sun, Conic Methods for Unconstrained Minimization and Tensor Methods for Nonlinear Equations, Science Press, Beijing, China, 1997.
- [4]
M. J. D. Powell, Y. X. Yuan, A trust region algorithm for equality constrained optimization, Mathematical Programming, 1990, 49(1):189–211.
- [5]
A. Vardi. A trust region algorithm for equality constrained minimization: Convergence properties and implementation, Siam Journal on Numerical Analysis, 1981, 22(3):575–591.
- [6]
P. T. Boggs, R. H. Byrd, R. B. Schnabel. A stable and efficient algorithm for nonlinear orthogonal distance regression, SIAM Journal on Scientific and Statistical Computing, 1987, 8(6):1052–1078.
- [7]
P. L. TOINT. Global convergence of a class of trust region methods for nonconvex minimization in hilbert space, IMA Journal of Numerical Analysis, 1988, 8(2):231–252.
- [8]
J. Z. Zhang, D. T. Zhu. Projected quasi-newton algorithm with trust region for constrained optimization, Journal of Optimization Theory and Applications, 1990, 67(2):369–393.
- [9]
M. El-Alem. A robust trust-region algorithm with a nonmonotonic penalty parameter scheme for constrained optimization, Siam Journal on Optimization, 1995, 5(2):348–378.
- [10]
A. R. Conn, N. I. M. Gould, P. L. Toint. Trust-region methods, Society for Industrial and Applied Mathematics, 2000.
- [11]
M. J. D. Powell, A hybrid method for nonlinear equations,In :Ph. D. Rabonowitz,Gordon and Breach, eds., Numerical Methods for Nonlinear Algebraic Equations, 1970, 87–114.
- [12]
J. E.,Dennis, H. H. W. Mei. Two new unconstrained optimization algorithms which use function and gradient values, Journal of Optimization Theory and Applications, 1979, 28(4):453–482.
- [13]
L. Zhang, Z. Q. Tang. The Hybrid Dogleg Method to Solve Subproblems of Trust Region. Journal of Nanjing Normal University, 2001, 24(1):28–32.
- [14]
J. Z. Zhang, X. J. Xu, D. T. Zhu. A nonmonotonic dogleg method for unconstrained optimization, SIAM Journal on Scientific and Statistical Computing, 1987, 8(6):1052–1078.
- [15]
Y. L. Zhao, C. X. X, R. B. Schnabel. A new trust region dogleg method for unconstrained optimization, Appl. Math. J. Chinese Univ. Ser. B, 2000, 15(1):83-92.
- [16]
W. C. Davidon, Conic approximations and collinear scalings for optimizers, Siam Journal on Numerical Analysis, 1980, 17(2):268–281.
- [17] R. B. Schnabel. Conic methods for unconstrained minimization and tensor methods for nonlinear equations, in: A. Bachem, M. Grötschel, B. Korte, eds., Mathematical Programming: The State of the Art, Springer-Verlag, Heidelberg, 1982, 417–438.
- [18] D. C. Sorensen. Newton’s method with a model trust region modification, SIAM Journal on Numerical Analysis, 1982, 19(2):409–426.
- [19] C. X. Xu, X. Y. Yang. Convergence of conic quasi-Newton trust region methods for unconstrained minimization, Math. Appl., 1998, 11(2):71–76.
- [20] Y. X. Yuan. A review of trust region algorithms for optimization, ICIAM, 2000, 99(1):271–282.
- [21] D. M. Gay. Computing optimal locally constrained steps, SIAM Journal on Scientific and Statistical Computing, 1981, 2(2):186–197.
- [22] J. M. Peng, Y. X. Yuan. Optimality conditions for the minimization of a quadratic with two quadratic constraints, SIAM Journal on Optimization, 1997, 7(3):579–594.
- [23] W. Y. Sun, Y. X. Yuan. A conic trust-region method for nonlinearly constrained optimization, Annals of Operations Research, 2001, 103(1):175–191.
- [24] S. Di, W. Y. Sun. A trust region method for conic model to solve unconstrained optimizations, Optimization Methods and Software, 1996, 6(4):237–263.
- [25] Q. Ni. Optimality conditions for trust-region subproblems involving a conic model, SIAM Journal on Optimization, 2005, 15(3):826–837.
- [26] M. Zhu, Y. Xue, Z. F. Sheng. A quasi-Newton type trust region method based on the conic model, Numerical Mathematics: A Journal of Chinese Universities, 1995, 17(1):36–47.
- [27] X. P. Lu, Q. Ni. A quasi-Newton trust region method with a new conic model for the unconstrained optimization, Applied Mathematics and Computation, 2008, 204(1):373–384.
- [28] L. J. Zhao, W. Y. Sun. A conic affine scaling method for nonlinear optimization with bound constraints, 2013, 30(3):1–30.
- [29] D. Gabay, B. Mercier. A dual algorithm for the solution of nonlinear variational problems via finite-element approximations, Computers and Mathematics with Applications, 1976, 2(1):17–40.
- [30] G. Chen, M. Teboulle. A proximal-based decomposition method for convex minimization problems, Mathematical Programming, 1994, 64(1-3):81–101.
- [31] J. Eckstein, M. Fukushima. Some reformulations and applications of the alternating direction method of multipliers, in: W. W. Hager et al., eds., Large Scale Optimization: State of the Art, Kluwer Academic Publishers, 1994, 115–134.
- [32] B. S. He, L. Z. Liao, D. Han, H. Yang. A new inexact alternating directions method for monotone variational inequalities, Mathematical Programming, 2002, 92(1):103–118.
- [33] S. Kontogiorgis, R. R. Meyer. A variable-penalty alternating directions method for convex optimization, Mathematical Programming, 1998, 83(1):29–53.
- [34] K. Zhang, J. S. Li, Y. C. Song, X. S. Wang. An alternating direction method of multipliers for elliptic equation constrained optimization problem, Science China Mathematics, 2017, 60(2):361–378.
- [35] M. H. Xu. Proximal alternating directions method for structured variational inequalities, Journal of Optimization Theory and Applications, 2007, 134(1):107–117.
- [36] J. Nocedal, S. J. Wright. Numerical optimization, Science Press, Beijing, China, 2006.
- [37] M. J. D. Powell. Algorithms for nonlinear constraints that use Lagrangian functions, Mathematical Programming, 1978, 14(1):224–248.
- [38] M. Al-Baali. Damped techniques for enforcing convergence of quasi-Newton methods, Optimization Methods and Software, 2014, 29(5):919–936.
- [39] H. Zhu, Q. Ni, M. L. Zeng. A quasi-Newton trust region method based on a new fractional model, Numerical Algebra, Control and Optimization, 2015, 5(3):237–249.
- [40] J. J. Moré, B. S. Garbow, K. E. Hillstrom. Testing unconstrained optimization software, ACM Transactions on Mathematical Software, 1981, 7(1):17–41.
