On inexact relative-error hybrid proximal extragradient, forward-backward and Tseng's modified forward-backward methods with inertial effects
M. Marques Alves, Raul T. Marcavillaca

TL;DR
This paper introduces an inertial under-relaxed relative-error hybrid proximal extragradient method with convergence guarantees, extending to inertial forward-backward and Tseng's methods for structured monotone inclusions, under flexible assumptions.
Contribution
It develops a novel inertial under-relaxed HPE method with convergence analysis and applies it to advanced forward-backward algorithms for monotone problems.
Findings
Proves asymptotic convergence of the proposed method.
Establishes nonasymptotic iteration-complexity bounds.
Demonstrates effectiveness on structured monotone inclusion problems.
M. Marques Alves
Departamento de Matemática, Universidade Federal de Santa Catarina, Florianópolis, Brazil, 88040-900 ([email protected]). The work of this author was partially supported by CNPq grants no. 405214/2016-2 and 304692/2017-4.
Raul T. Marcavillaca
Departamento de Matemática, Universidade Federal de Santa Catarina, Florianópolis, Brazil, 88040-900 ([email protected]). The work of this author was partially supported by CAPES.
Abstract
In this paper, we propose and study the asymptotic convergence and nonasymptotic global convergence rates (iteration-complexity) of an inertial under-relaxed version of the relative-error hybrid proximal extragradient (HPE) method for solving monotone inclusion problems. We analyze the proposed method under more flexible assumptions than existing ones on the extrapolation and relative-error parameters. As applications, we propose and/or study inertial under-relaxed forward-backward and Tseng’s modified forward-backward type methods for solving structured monotone inclusions.
2000 Mathematics Subject Classification: 90C25, 90C30, 47H05.
Key words: inertial, relaxed, proximal point method, HPE method, pointwise, ergodic, iteration-complexity, forward-backward algorithm, Tseng’s modified forward-backward algorithm.
Introduction
Inertial proximal point-type algorithms for monotone inclusions have recently attracted considerable research attention (see, e.g., [3, 4] and the references therein). The first method of this type – the inertial proximal point (PP) method – for solving generalized equations with monotone operators was proposed and studied by Alvarez and Attouch in [2]. The intense research activity on the subject in recent years is due in part to its connections with fast first-order algorithms for convex programming (see, e.g., [3, 4, 5, 6, 7, 23]).
Since the inertial PP method of Alvarez and Attouch has been used as the hidden engine for the design and analysis of various first-order proximal algorithms with inertial effects, including inertial versions of ADMM, forward-backward and Douglas-Rachford algorithms (see, e.g., [3, 4, 10, 15, 16]), it is natural to attempt to design inexact versions of it. In [9], Bot and Csetnek proposed and studied the asymptotic convergence of an inertial version of the hybrid proximal extragradient (HPE) method of Solodov and Svaiter [30, 35]. The HPE method is an inexact PP algorithm in which, at each iteration, the corresponding proximal subproblems are (inexactly) solved within a relative error criterion (this contrasts with the summable error criterion proposed by Rockafellar [34]).
In this paper, we propose and study the asymptotic convergence and nonasymptotic global convergence rates (iteration-complexity) of an inertial under-relaxed HPE method for solving monotone inclusions. The proposed method (Algorithm 1) differs from the existing inertial HPE-type method of Bot and Csetnek in the sense that it is based on a different mechanism of iteration. Moreover, we prove its convergence and iteration-complexity under more flexible assumptions than those proposed in [9] on the extrapolation and relative-error parameters. As applications, we study inertial (under-relaxed) versions of Tseng’s modified forward-backward and forward-backward algorithms (see Algorithms 3 and 4) for solving structured monotone inclusion problems.
The main contributions of this paper will be further discussed in Section 1.
This paper is organized as follows. In Section 1, we present some preliminaries and basic results, review some existing algorithms and discuss in detail the main contributions of this paper. The inertial under-relaxed HPE method (Algorithm 1) is presented in Section 2; the main results are Theorems 2.5 (asymptotic convergence), and 2.7 and 2.8 (iteration-complexity). Section 3 is devoted to presenting and studying the inertial versions of Tseng’s modified forward-backward and forward-backward algorithms; the main results are Theorems 3.2 and 3.4. We finish the paper in Section 4 with some concluding remarks.
The inertial proximal point (PP) method is a modification of Rockafellar’s PP method in which, at each iteration, past information is used to extrapolate the current iterate by an extrapolation factor. The method was proposed and studied by Alvarez and Attouch. The convergence of the inertial PP method was proved under the assumption that the extrapolation factor stays within a prescribed range, which has become a standard assumption in the analysis of different variants and special instances of the Alvarez–Attouch method.
The hybrid proximal extragradient (HPE) method of Solodov and Svaiter is an inexact PP algorithm for which, at each iteration, the proximal subproblems are supposed to be (inexactly) solved within a relative error criterion determined by a tolerance . When it reduces to the exact Rockafellar’s PP method and, on the other hand, has been successfully used in many applications. Recently, an inertial version of the HPE method was proposed and studied by Bot and Csetnek . In this case, the extrapolation factor depends on and it is close to zero for large values of .
In this paper, we propose an inertial under-relaxed HPE method, which generalizes the under-relaxed HPE method of Svaiter. In contrast to the inertial HPE method of Bot and Csetnek, we obtain convergence and iteration-complexity of the proposed algorithm for the extrapolation factor within the standard range at the price of performing an under-relaxed step with factor with (uniformly on ). Simple numerical experiments indicate that this strategy is promising even in the exact case, i.e., when . Beyond that, we explicitly compute the corresponding under-relaxation factor to implement the method with extrapolation factor within the range ( when and ). As an application, we propose and study an (under-relaxed) inertial version of the forward-backward-forward method of Tseng. We show that our proposed version of the latter algorithm deals differently and better with parameters than existing ones.
1 Preliminaries, basic results and general notation
1.1 Problem statement
Let be a real Hilbert space and consider the general monotone inclusion problem (MIP) of finding such that
[TABLE]
as well as the structured MIP
[TABLE]
where and are (set-valued) maximal monotone operators on and is a (point-to-point) monotone operator which is either Lipschitz continuous or cocoercive (see Subsections 3.1 and 3.2 for the precise statement). Problems (1) and (2) appear in different fields of applied mathematics and optimization including convex optimization, signal processing, PDEs, inverse problems, among others (see, e.g.,[8, 19]). We mention that under mild conditions on the operators and , problem (2) becomes a special instance of (1) with .
In this paper, we propose and study the asymptotic convergence and the iteration-complexity of inertial under-relaxed versions of the hybrid proximal extragradient (HPE) method (Algorithm 1), and Tseng’s modified forward-backward (Algorithm 3) and forward-backward (Algorithm 4) methods for solving (1), and (2), respectively.
The main contributions of this paper, as well as the works most closely related to it, are discussed in the following subsections; the main contributions are further summarized in Subsection 1.5.
1.2 The Alvarez–Attouch’s inertial proximal point method
The proximal point (PP) method is an iterative scheme for seeking approximate solutions of (1). It was first proposed by Martinet [25] for solving monotone variational inequalities (with point-to-point operators) and further studied and developed by Rockafellar in his pioneering work [34]. In its exact formulation, an iteration of the PP method can be described by
[TABLE]
where is a stepsize parameter and is the current iterate.
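As a rough illustration of the exact proximal point iteration just described, the following Python sketch repeatedly applies the resolvent of a linear monotone operator. The operator T(z) = Az with a skew-symmetric matrix `A`, the constant stepsize `lam`, and the function name `proximal_point` are assumptions made only for this example; none of them comes from the paper.

```python
import numpy as np

def proximal_point(A, z0, lam=1.0, iters=50):
    """Exact proximal point sketch for T(z) = A z: z_new = (lam*T + I)^{-1} z."""
    n = A.shape[0]
    J = np.linalg.inv(np.eye(n) + lam * A)   # resolvent of the linear operator
    z = np.asarray(z0, dtype=float)
    history = [z.copy()]
    for _ in range(iters):
        z = J @ z                            # one exact proximal step
        history.append(z.copy())
    return np.array(history)

# Illustrative operator (an assumption): a rotation generator, monotone since <Az, z> = 0.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
traj = proximal_point(A, [1.0, 1.0])
norms = np.linalg.norm(traj, axis=1)         # distance to the unique zero z* = 0
print(norms[0], norms[-1])
```

For this particular operator the resolvent is a strict contraction, so the distances to the solution decrease at every step; a general maximal monotone operator only guarantees weak convergence.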
The inertial PP method is a modification of (3) proposed and studied by Alvarez and Attouch in [2] as follows: for all ,
[TABLE]
where is a sequence of extrapolation parameters; note that if , then it follows that (6) reduces to Rockafellar’s PP method (3). Inertial PP-type methods receive a lot of attention in current research owing to the possibility of extending this methodology to different practical algorithms and, in part, as we mentioned earlier, owing to their connections with fast first-order methods in convex programming. Asymptotic (weak) convergence of generated in (6) to a solution of (1) was first obtained in [2] under the assumptions that and
[TABLE]
The above upper bound on has become standard in the analysis of inertial-like proximal algorithms (see, e.g., [15, 16, 23, 32]). It seems that (7) was first improved by Alvarez in [1, Proposition 2.5] in the setting of projective-proximal point-type methods and, more recently, by Attouch and Cabot in [4] with relaxation playing a central role. One of the main goals of this contribution is the analysis of an inertial under-relaxed HPE-type method under the assumption (actually more general than) (7) on ; see Assumption (A).
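The inertial scheme can be sketched in a few lines: extrapolate from the last two iterates, then apply the resolvent at the extrapolated point. The sketch below assumes the usual form w = z + alpha*(z - z_prev) for the extrapolation, a constant factor `alpha` (0.3 is an arbitrary value below the classical 1/3 bound of the Alvarez–Attouch analysis), and the same illustrative linear operator T(z) = Az; all names are hypothetical.

```python
import numpy as np

def inertial_proximal_point(A, z0, lam=1.0, alpha=0.3, iters=200):
    """Inertial PP sketch for T(z) = A z: extrapolate, then apply the resolvent."""
    n = A.shape[0]
    J = np.linalg.inv(np.eye(n) + lam * A)   # resolvent (lam*T + I)^{-1}
    z_prev = np.asarray(z0, dtype=float)     # the two starting points coincide
    z = z_prev.copy()
    for _ in range(iters):
        w = z + alpha * (z - z_prev)         # extrapolation using past information
        z_prev, z = z, J @ w                 # proximal step from the extrapolated point
    return z

A = np.array([[0.0, 1.0], [-1.0, 0.0]])      # skew-symmetric, hence monotone (assumed example)
z_inertial = inertial_proximal_point(A, [1.0, 1.0], alpha=0.3)
z_plain = inertial_proximal_point(A, [1.0, 1.0], alpha=0.0)   # alpha = 0 gives the plain PP method
print(np.linalg.norm(z_inertial), np.linalg.norm(z_plain))
```

Setting `alpha=0.0` removes the extrapolation, which matches the remark that the inertial method reduces to the exact PP method in that case.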
1.3 The hybrid proximal extragradient method of Solodov and Svaiter
It is of course important to design and study inexact versions of known (exact) numerical algorithms, and this also applies to (3). In [34], Rockafellar proved that if, at each iteration , is computed satisfying
[TABLE]
and is bounded away from zero, then converges (weakly) to a solution of (1). Many modern inexact versions of the PP method (3), as opposed to the summable error criterion (8), use relative error tolerances for solving the associated subproblems. The first methods of this type were proposed by Solodov and Svaiter in [35, 36] and subsequently studied in [29, 30, 31, 37, 38]. The key idea consists of observing that (3) can be decoupled as
[TABLE]
and then relaxing (9) within relative error tolerance criteria. Among these new methods, the HPE method [35] has been shown to be very effective as a framework for the design and analysis of many concrete algorithms (see, e.g., [9, 14, 17, 20, 21, 24, 26, 27, 28, 31, 35, 37, 38]). It can be described as follows: for all ,
[TABLE]
where . (see Subsection 1.6 for the general notation on -enlargements .) Note that if , then it follows that (12) reduces to the exact PP method (3). As we mentioned before, recently Bot and Csetnek [9] proposed and studied an inertial proximal-like algorithm which combines ideas from (6) and (12). They have proved asymptotic convergence of their method under the assumption on and , where is as in (12) and for all (cf. (7)). This condition enforces whenever . This would, in particular, degenerate the desired inertial effect in many important applications of HPE-type methods for which is known (experimentally) to be the best choice among all possible (see, e.g., [17, 18, 26, 27]).
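To make the relative-error idea concrete, here is a minimal sketch of a test for the HPE criterion in its usual form from the literature, namely ||lam*v + z_tilde - z_prev||^2 + 2*lam*eps <= sigma^2 * ||z_tilde - z_prev||^2, followed by the extragradient update z_next = z_prev - lam*v. The exact display (12) is not reproduced in this copy, so this form, the linear operator T(z) = Az, and every identifier below are assumptions.

```python
import numpy as np

def hpe_criterion_holds(z_tilde, v, eps, z_prev, lam, sigma):
    """Relative-error test in the assumed standard HPE form."""
    residual = lam * v + z_tilde - z_prev
    lhs = residual @ residual + 2.0 * lam * eps
    step = z_tilde - z_prev
    return bool(lhs <= sigma**2 * (step @ step))

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # T(z) = A z, Lipschitz with constant 1 (assumed example)
z_prev = np.array([1.0, 1.0])
lam = 0.4

# Crude approximate proximal solution: a single forward step, with v taken in T(z_tilde).
z_fwd = z_prev - lam * (A @ z_prev)
v_fwd = A @ z_fwd                          # eps = 0 because v is an exact element of T(z_tilde)

loose_ok = hpe_criterion_holds(z_fwd, v_fwd, 0.0, z_prev, lam, sigma=0.5)
tight_ok = hpe_criterion_holds(z_fwd, v_fwd, 0.0, z_prev, lam, sigma=0.3)
z_next = z_prev - lam * v_fwd              # extragradient update from the accepted pair
print(loose_ok, tight_ok)
```

For this operator the forward-step candidate is accepted exactly when `lam` does not exceed `sigma`, which shows how a larger tolerance permits cheaper subproblem solves.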
In this paper, we propose an inertial under-relaxed HPE-type method (Algorithm 1) with guarantee of asymptotic convergence and iteration-complexity (both pointwise and ergodic) under the assumption (actually more general than) (7) on ; see Assumption (A). The price to pay is to perform, in addition to inertial, under-relaxed steps. On the other hand, the under-relaxed parameter is explicitly computed and, in the case of (7), , the latter lower bound being uniform on (see the third remark following Assumption (A)). We also emphasize that our algorithm is different from the corresponding one in [9], in the sense that it is based on a different mechanism of iteration.
The main convergence results on Algorithm 1 are Theorems 2.5, 2.7 and 2.8. This appears to be the first time in the literature that global (ergodic) convergence rates are obtained for inertial-like proximal algorithms (see Theorem 2.8).
1.4 Forward-backward and Tseng’s modified forward-backward methods
With its roots in the projected gradient algorithm for convex optimization, the forward-backward method (see, e.g., [22, 33]) is one of the most popular numerical algorithms for solving the structured monotone inclusion problem (2), having numerous applications in modern applied mathematics (see, e.g., [8]). It can be described as follows: for all ,
[TABLE]
where is a stepsize parameter and is the current iterate. Under the assumption that is cocoercive and is within a certain range, it follows that the sequence generated in (13) is weakly convergent to a solution of (2) (see, e.g., [8]). In the seminal paper [41], Tseng proposed and studied the following modification of (13) – known as the Tseng’s modified forward-backward method: for all ,
[TABLE]
We clearly see that in contrast to (13), (16) performs an additional forward step to define the next iterate . This is crucial to obtain convergence under the (weaker than cocoercivity) assumption of Lipschitz continuity on (see, e.g., [8, 41]). Since both forward-backward and Tseng’s modified forward-backward methods are known to be special instances of the HPE method (12) for solving (1) with (see, e.g., [30, 35, 39]), we have managed to propose and/or study inertial under-relaxed versions of (13) and (16) – namely, Algorithms 4 and 3, respectively – as special instances of the proposed inertial under-relaxed HPE method (Algorithm 1). As a by-product of the results obtained for Algorithm 1, we prove their asymptotic convergence as well as their global pointwise and ergodic convergence rates/iteration-complexity (see Theorems 3.2 and 3.4). We discuss some existing inertial/relaxed variants of (13) and (16) as well as how they are related to Algorithms 3 and 4 in the remarks following them. We also emphasize that, since Algorithms 3 and 4 will be analyzed within the framework of Algorithm 1, they will automatically inherit all the possible benefits from the proposed policy of choosing the upper bound on the sequence of inertial parameters and the relaxation parameter (see Assumption (A), the remarks following it, and the remarks following Algorithms 3 and 4).
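The role of the extra forward step can be seen on a small example. The sketch below assumes the standard forms of the two iterations (a forward step followed by the resolvent of B, plus Tseng's correction by the difference of the two evaluations of F) and uses a deliberately simple instance: F(z) = Az with A skew-symmetric, which is Lipschitz but not cocoercive, and B = 0, whose resolvent is the identity. The function names, the stepsize, and the instance are illustrative assumptions.

```python
import numpy as np

def forward_backward(F, resolvent_B, z0, lam, iters):
    """Forward step on F followed by a backward (resolvent) step on B."""
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        z = resolvent_B(z - lam * F(z), lam)
    return z

def tseng_forward_backward(F, resolvent_B, z0, lam, iters):
    """Tseng's modification: same trial point, then an additional forward correction."""
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        Fz = F(z)
        z_tilde = resolvent_B(z - lam * Fz, lam)
        z = z_tilde - lam * (F(z_tilde) - Fz)    # the additional forward step
    return z

A = np.array([[0.0, 1.0], [-1.0, 0.0]])          # Lipschitz (constant 1), not cocoercive
F = lambda z: A @ z
resolvent_B = lambda y, lam: y                   # B = 0 (assumed for simplicity)

z_fb = forward_backward(F, resolvent_B, [1.0, 1.0], lam=0.4, iters=300)
z_tseng = tseng_forward_backward(F, resolvent_B, [1.0, 1.0], lam=0.4, iters=300)
print(np.linalg.norm(z_fb), np.linalg.norm(z_tseng))
```

On this instance the plain forward-backward iterates spiral away from the solution z = 0, while Tseng's iterates approach it, consistent with the remark that the additional forward step compensates for the lack of cocoercivity.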
1.5 The main contributions of this work
We summarize the main contributions of this work as follows:
- (i)
Asymptotic convergence and nonasymptotic global pointwise and ergodic convergence rates (iteration-complexity) of an inertial under-relaxed HPE method (Algorithm 1) for solving (1) under assumptions on the choice of inertial and relative-error parameters that are more flexible than existing ones (see Assumption (A) and the remarks following it). We show, in particular, that it is possible to assume the upper bound on the sequence of inertial parameters , which became standard in the analysis of inertial-type proximal algorithms, at the price of performing under-relaxed iterations with explicitly computed parameter , where the latter lower bound is uniform on the relative-error parameter . We also emphasize that, to the best of the authors’ knowledge, this is the first time in the literature that an iteration-complexity analysis is performed for inertial HPE-type methods (see Theorems 2.7 and 2.8), and it also appears to be the first time that ergodic iteration-complexity results are established for inertial proximal-type algorithms.
- (ii)
Asymptotic convergence and pointwise and ergodic iteration-complexity of inertial under-relaxed versions of Tseng’s modified forward-backward method (Algorithm 3) and of the forward-backward method (Algorithm 4) for solving (2) under the assumption that is monotone and either Lipschitz continuous or cocoercive. Analogously to (i), in this case, the proposed methods also benefit from assumptions on the choice of inertial parameters that are more flexible than the standard ones (see Subsections 3.1 and 3.2 for a discussion).
1.6 General notation and basics on monotone operators and –enlargements
Let be a real Hilbert space with inner product and induced norm . The weak limit of a sequence in (whenever it exists) will be denoted by . A set-valued map is said to be a monotone operator if for all and . On the other hand, is maximal monotone if is monotone and its graph is not properly contained in the graph of any other monotone operator on . The inverse of is , defined at any by if and only if . The resolvent of a maximal monotone operator is and if and only if . The operator , where , is defined by .
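The characterization of the resolvent can be checked numerically on a one-dimensional example. The sketch below takes T to be the subdifferential of the absolute value on the real line (an illustrative choice, not from the paper), whose resolvent with stepsize `lam` is the soft-thresholding map, and verifies that z equals the resolvent of x precisely in the sense that (x - z)/lam belongs to T(z). The helper names are hypothetical.

```python
import numpy as np

def soft_threshold(x, lam):
    """Resolvent (lam*T + I)^{-1} for T = subdifferential of |.| on the real line."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def in_subdifferential_abs(z, v, tol=1e-12):
    """Membership test v in T(z): T(z) = {sign(z)} if z != 0, and [-1, 1] if z = 0."""
    if z > 0:
        return abs(v - 1.0) <= tol
    if z < 0:
        return abs(v + 1.0) <= tol
    return -1.0 - tol <= v <= 1.0 + tol

lam = 0.5
xs = np.linspace(-2.0, 2.0, 17)
checks = []
for x in xs:
    z = soft_threshold(x, lam)
    v = (x - z) / lam                      # the element of T(z) certified by the resolvent
    checks.append(in_subdifferential_abs(float(z), float(v)))
print(all(checks))
```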
For maximal monotone and , the -enlargement [11] of is the operator defined by
[TABLE]
Note that for all .
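A small numerical experiment may help build intuition for enlargements. It assumes the standard definition from the literature: v belongs to the eps-enlargement of T at z when <z - z', v - v'> >= -eps for every pair (z', v') in the graph of T. The sketch samples the graph of T = subdifferential of |.| on the real line; a sampled violation certifies non-membership, whereas the absence of violations on a finite sample is only evidence, not a proof. All names and numerical values are illustrative.

```python
import numpy as np

def violates_enlargement(z, v, eps, graph_samples):
    """True if some sampled (z', v') in the graph of T has (z - z')*(v - v') < -eps."""
    return any((z - zp) * (v - vp) < -eps for zp, vp in graph_samples)

# Sampled graph of T = subdifferential of |.|: value -1 left of 0, +1 right of 0, [-1, 1] at 0.
grid = np.linspace(-2.0, 2.0, 81)
graph = [(float(x), float(np.sign(x))) for x in grid if x != 0.0]
graph += [(0.0, float(s)) for s in np.linspace(-1.0, 1.0, 21)]

eps = 0.5
inside = not violates_enlargement(1.0, 0.75, eps, graph)   # no sampled violation found
outside = violates_enlargement(1.0, 0.4, eps, graph)       # violation: 0.4 is not in the enlargement at z = 1
exact_pair_ok = not violates_enlargement(1.0, 1.0, 0.0, graph)  # v in T(z) passes with eps = 0
print(inside, outside, exact_pair_ok)
```

Note that 0.75 does not belong to T(1) = {1}, yet it passes the test with eps = 0.5, which illustrates how the enlargement relaxes the inclusion.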
The following summarizes some useful properties of (see, e.g., [13, Lemma 3.1 and Proposition 3.4(b)]).
Proposition 1.1.
Let be set-valued maps. Then,
- (a)
If , then for every .
- (b)
for every and .
- (c)
is monotone if and only if .
- (d)
is maximal monotone if and only if .
- (e)
If is maximal monotone, is such that , for all , , and , then .
Proposition 1.2.
(see, e.g., [13, Lemma 3.1 and Proposition 3.4(b)])
Assume is a sequence in such that for all . If, , and , then .
Next we present the transportation formula for -enlargements.
Theorem 1.3.
(see, e.g., [12, Theorem 2.3])
Suppose is maximal monotone and let , , for , be such that
[TABLE]
and define
[TABLE]
Then, and
[TABLE]
The following well-known property will be also useful in this paper. For any and , we have
[TABLE]
2 An inertial under-relaxed hybrid proximal extragradient method
Consider the monotone inclusion problem (1), i.e., the problem of finding such that
[TABLE]
where is a maximal monotone operator on for which .
In this section, we propose and study the asymptotic convergence and nonasymptotic global convergence rates (iteration-complexity) of an inertial under-relaxed hybrid proximal extragradient (HPE) method (Algorithm 1) for solving (19). Regarding the iteration-complexity analysis, we consider the following notion of approximate solution for (19): given tolerances , find and such that
[TABLE]
Note that in (20) gives , i.e., in this case is a solution of (19) (for a more detailed discussion on (20), see, e.g., [30]).
The main results in this section are Theorems 2.5, 2.7 and 2.8. We refer the reader to the remarks and comments following each of the above mentioned theorems for a discussion regarding the contribution of each of them in the light of related results available in the current literature.
Algorithm 1.
An inertial under-relaxed HPE method for solving (19)
Input: and and .
1:
for , do
2:
Choose and define
(21)
3:
Find and such that
(22)
4:
Define
(23)
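The four steps of Algorithm 1 can be sketched as a short loop. Because the displays (21)–(23) are not reproduced in this copy, the code assumes the following forms: an extrapolation w = z + alpha*(z - z_prev), the relative-error test ||lam*v + z_tilde - w||^2 + 2*lam*eps <= sigma^2 * ||z_tilde - w||^2, and the under-relaxed update z_new = w - tau*lam*v. The operator T(z) = Az, the forward-step subproblem solver, and the parameter values are illustrative and have not been checked against Assumption (A).

```python
import numpy as np

def inertial_relaxed_hpe(A, z0, lam=0.4, sigma=0.5, alpha=0.1, tau=0.8, iters=300):
    """Sketch of an inertial under-relaxed HPE loop for T(z) = A z (eps_k = 0 throughout)."""
    z_prev = np.asarray(z0, dtype=float)             # the two starting points coincide
    z = z_prev.copy()
    for _ in range(iters):
        w = z + alpha * (z - z_prev)                 # step 2: inertial extrapolation (assumed form)
        z_tilde = w - lam * (A @ w)                  # step 3: cheap inexact proximal solve
        v = A @ z_tilde                              # v in T(z_tilde), hence eps = 0
        residual = lam * v + z_tilde - w
        step = z_tilde - w
        if residual @ residual > sigma**2 * (step @ step):
            raise ValueError("relative-error criterion violated; reduce lam")
        z_prev, z = z, w - tau * lam * v             # step 4: under-relaxed update (assumed form)
    return z

A = np.array([[0.0, 1.0], [-1.0, 0.0]])              # monotone, Lipschitz constant 1 (assumed example)
z_final = inertial_relaxed_hpe(A, [1.0, 1.0])
print(np.linalg.norm(z_final))
```

In this instance the forward-step candidate satisfies the test whenever `lam` is at most `sigma`; calling the function with `lam=0.6` raises the error, which mimics a subproblem solver that has to work harder when the tolerance is tight.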
Remarks.
- (i)
Algorithm 1 clearly combines the inertial proximal point (PP) and the HPE methods (6) and (12), respectively. It reduces to (6) when and . Indeed, in this case, using (22), (23) and Proposition 1.1(d), we find for all (cf. iteration – in [2]).
- (ii)
A similar inertial relaxed relative-error PP algorithm was proposed and analyzed by Alvarez in [1]. We emphasize that in contrast to Algorithm 1, the algorithm proposed by Alvarez is a projective-type algorithm (see, e.g., [36]) and it is based on a different mechanism of iteration.
- (iii)
Algorithm 1 generalizes the HPE method of Solodov and Svaiter [30] and (a special instance of) the under-relaxed HPE method of Svaiter [40]. Indeed, the HPE method is obtained by letting and , in which case , while the under-relaxed HPE method (with , in the notation of the latter reference) appears whenever in Algorithm 1.
- (iv)
As we mentioned in Subsection 1.3, an inertial HPE-type method was recently proposed and studied by Bot and Csetnek in [9]. We refer the reader to Subsection 1.3 for a discussion of the contributions of this paper in the light of the latter reference, regarding the HPE-type methods.
- (v)
We emphasize that, in contrast to the analysis presented in this work – see Theorems 2.7 and 2.8 –, in all cases of inertial-type algorithms which were mentioned in remarks (i)–(iv) no iteration-complexity analysis has been obtained.
- (vi)
Step 3 of Algorithm 1 does not specify how to compute and the triple satisfying (22), their computation depending on the instance of the method under consideration. In this regard, Proposition 3.3 shows, in particular, how the evaluation of a cocoercive (monotone) point-to-point operator naturally produces such triples.
The next three results, especially Proposition 2.3, will be important for proving the main results on convergence and iteration-complexity of Algorithm 1.
Proposition 2.1.
Let , and be generated by Algorithm 1 and define, for all ,
[TABLE]
where
[TABLE]
Then, for any ,
[TABLE]
Proof.
Using (22), (23) and Lemma A.2(b) we obtain
[TABLE]
Note now that from (23) and (22) we have
[TABLE]
which, in turn, gives
[TABLE]
On the other hand, (23) yields
[TABLE]
To finish the proof, note that (26) is a direct consequence of (24), (27)–(29) and (25). ∎
Lemma 2.2.
Let , and be generated by Algorithm 1 and let . Then, for all ,
[TABLE]
Proof.
From (21) we have and , which combined with the property (18) yield the desired identity. ∎
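For readers following the argument without the displays, the computation behind this kind of identity can be written out as follows. This is a sketch under two assumptions, since neither display survives in this copy: that (18) is the usual identity for the squared norm of an affine combination, and that (21) defines the extrapolated point in the usual way. The symbols are chosen for illustration.

```latex
% Assumed form of (18): for x, y in H and t in R,
\|t x + (1-t) y\|^2 = t\|x\|^2 + (1-t)\|y\|^2 - t(1-t)\|x-y\|^2 .

% Assumed form of (21): w^{k-1} = z^{k-1} + \alpha_{k-1}(z^{k-1} - z^{k-2}), so that for any z in H
w^{k-1} - z = (1+\alpha_{k-1})\,(z^{k-1} - z) - \alpha_{k-1}\,(z^{k-2} - z) .

% Taking t = 1+\alpha_{k-1}, x = z^{k-1} - z and y = z^{k-2} - z in the first identity gives
\|w^{k-1} - z\|^2 = (1+\alpha_{k-1})\|z^{k-1} - z\|^2 - \alpha_{k-1}\|z^{k-2} - z\|^2
  + \alpha_{k-1}(1+\alpha_{k-1})\|z^{k-1} - z^{k-2}\|^2 .
```

The last term is the price of extrapolation: it is the quantity that summability conditions on the inertial parameters are designed to control.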
Proposition 2.3.
Let , and be generated by Algorithm 1 and let be as in (24). Let also and define
[TABLE]
Then, and
[TABLE]
i.e., the sequences , , and satisfy the assumptions of Lemma A.5.
Proof.
Using Lemma 2.2 with and (30) we obtain, for all ,
[TABLE]
which combined with Proposition 2.1 and the definition of in (30) yields (31). The identity follows from the fact that and the first definition in (30). ∎
Next we present the first result on the asymptotic convergence of Algorithm 1.
Theorem 2.4 (first result on the weak convergence of Algorithm 1).
Let , and be generated by Algorithm 1. If the following holds
[TABLE]
and, additionally, , for all , then the sequence converges weakly to a solution of the monotone inclusion problem (19).
Proof.
Using Proposition 2.3, (32), the fact that for all and Lemma A.5, one concludes that (i) exist for every , and , which gives (ii) , where is as in (24). In particular, is bounded. Using (ii), (22)–(24) and the assumption for all , we find
[TABLE]
Now let be a weak cluster point of (recall that it is bounded). Note that it follows from (33) that is also a (weak) cluster point of and let be such that . Using (33) and the inclusion in (22) we obtain
[TABLE]
which, in turn, combined with Proposition 1.1(e) yields , and so the desired result follows from (i) and Lemma A.4. ∎
Remark. Condition (32) appeared for the first time in [2], and since then it has become a standard assumption in the asymptotic convergence analysis of different inertial PP-type algorithms. Next, we present a sufficient condition on the input parameters in Algorithm 1 to ensure (32) holds (see Theorems 2.5, 2.7 and 2.8).
Assumption (A): and satisfy the following (for some ):
[TABLE]
and
[TABLE]
where
[TABLE]
Remarks.
- (i)
Conditions (35)–(37) will be crucial to prove convergence and iteration-complexity of the algorithms presented and studied in this paper; see, e.g., Theorems 2.5, 2.7 and 2.8, and Section 3.
- (ii)
Note that by letting , which by the first remark following Algorithm 1 means that it reduces to an under-relaxed version of the (exact) Alvarez–Attouch’s inertial PP method, we obtain that (35)–(37) are now simply given by: , for all , and
[TABLE]
In particular, in this case, we have whenever in (35), which corresponds to the standard upper bound on which has been used in different works in the current literature (see Subsection 1.2 for a discussion). Hence, even in the setting of exact inertial PP methods, conditions (35)–(37) generalize the usual assumption (7). See Figure 1.
- (iii)
As we mentioned earlier, an inertial HPE-type method was proposed and studied by Bot and Csetnek in [9], where asymptotic convergence is proved under the assumption on . Note that, in this case, whenever . This contrasts to the conditions (35)–(37), which, in particular yield (uniformly on ) when in (35). This may become especially useful in numerical implementations of Algorithm 1, since has been usually employed in the recent literature on HPE-type methods (see, e.g., [17, 18, 26, 27]). Further, (35)–(37) allow the upper bound on to be chosen arbitrarily close to 1, at the price of performing under-relaxed steps with the explicitly computed as in (36). See Figure 1.
Theorem 2.5 (second result on the weak convergence of Algorithm 1).
Under the Assumption on Algorithm 1, let be as in (25) and define the quadratic real function:
[TABLE]
Then, and, for every ,
[TABLE]
As a consequence, it follows that under the assumption the sequence generated by Algorithm 1 converges weakly to a solution of the monotone inclusion problem (19) whenever for all .
Proof.
Using (21), the Cauchy-Schwarz inequality and the Young inequality with and we find
[TABLE]
which combined with (31) and (24), and after some algebraic manipulations, yields
[TABLE]
where
[TABLE]
Define,
[TABLE]
where is as in (30). Using (39), the assumption that is nondecreasing (see (35)) and (41)–(43) we obtain, for all ,
[TABLE]
Note now that from (36) and Lemma A.3 we have
[TABLE]
which, in turn, combined with the definition of in (25), and after some algebraic calculations, gives
[TABLE]
The latter identity implies, in particular, that is either the smallest or the largest root of the quadratic function . Hence, from (35) and the fact that (see (37)) we obtain
[TABLE]
The above inequalities combined with (2) yield
[TABLE]
which, in turn, combined with (35) and the definition of in (43), gives
[TABLE]
Note now that (45), (35) and (43) also yield
[TABLE]
and so,
[TABLE]
Hence, (40) follows directly from (2), (47), the definition of in (43) and the definition of in (30). On the other hand, the second statement of the theorem follows from (40) and Theorem 2.4 (recall that for all ). ∎
Remark. A quadratic function similar to , as defined in (39), was also considered by Alvarez in [1]. As we mentioned in the second remark following Algorithm 1, the algorithm studied in the latter reference is different from the corresponding algorithm presented in this work, namely Algorithm 1. Moreover, note that if , then (cf. [2]).
Corollary 2.6.
Under the Assumption on Algorithm 1, let and be as in (25) and (39), respectively, and let . Then, for all ,
[TABLE]
Proof.
Using Proposition 2.3 and Lemma A.5(a) we conclude that (85) holds with , and as in (24) and (30), which gives that the desired result follows from (85) and (40). ∎
Next we present the first result on nonasymptotic global convergence rates/iteration-complexity of Algorithm 1.
Theorem 2.7 (global pointwise convergence rate of Algorithm 1).
Under the Assumption on Algorithm 1, let and be as in (25) and (39), respectively, and let denote the distance of to . Assume that for all . Then, for every , there exists such that
[TABLE]
Proof.
Let be such that . It follows from Corollary 2.6 that, for every , there exists such that
[TABLE]
which combined with the assumption and (22), and after some simple algebraic manipulations, yields the desired result. ∎
Remarks.
- (i)
Theorem 2.7 provides a global pointwise convergence rate and ensures, in particular, that for given tolerances , Algorithm 1 finds a triple satisfying (20) after performing at most
[TABLE]
iterations.
- (ii)
If and , in which case Algorithm 1 reduces to the HPE method of Solodov and Svaiter, then it follows that Theorem 2.7 reduces to [30, Theorem 4.4(a)].
- (iii)
Analogous global pointwise convergence rates were also obtained in [15, 16] for inertial-type algorithms for variational inequality and convex optimization problems.
In order to study the ergodic iteration-complexity of Algorithm 1, we need to define the following.
The aggregate stepsize sequence and the ergodic sequences , , associated to and , , and are, respectively, for ,
[TABLE]
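The ergodic construction can be sketched as a stepsize-weighted average combined with the transportation formula of Theorem 1.3. Since display (51) is missing here, the code assumes the form commonly used for HPE-type methods: weights lam_j / Lambda_k, averaged points and residuals, and an averaged error that adds the cross terms <z_tilde_j - z_avg, v_j - v_avg>. The paper's exact definition may differ (for example through the relaxation factor), and all names and test data are hypothetical.

```python
import numpy as np

def ergodic_triple(lams, z_tildes, vs, epss):
    """Stepsize-weighted ergodic means with the transportation-formula error (assumed form)."""
    lams = np.asarray(lams, dtype=float)
    Z = np.asarray(z_tildes, dtype=float)
    V = np.asarray(vs, dtype=float)
    E = np.asarray(epss, dtype=float)
    weights = lams / lams.sum()                        # lam_j divided by the aggregate stepsize
    z_avg = weights @ Z
    v_avg = weights @ V
    cross = np.einsum("ij,ij->i", Z - z_avg, V - v_avg)  # <z_j - z_avg, v_j - v_avg> for each j
    eps_avg = weights @ (E + cross)
    return z_avg, v_avg, eps_avg

rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0], [-1.0, 1.0]])                # T(z) = A z, identity plus a skew part, monotone
Z = rng.normal(size=(5, 2))                            # illustrative points z_tilde_j
V = Z @ A.T                                            # v_j = A z_tilde_j, so eps_j = 0
lams = np.array([0.5, 1.0, 1.5, 1.0, 2.0])
z_avg, v_avg, eps_avg = ergodic_triple(lams, Z, V, np.zeros(5))
print(eps_avg)
```

Even though every individual pair is exact (eps_j = 0), the averaged error is in general positive: averaging trades exact inclusions for an inclusion in an enlargement, and monotonicity keeps the averaged error nonnegative.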
Next we study the ergodic iteration-complexity of Algorithm 1 under the assumption that in (21).
Theorem 2.8 (global ergodic convergence rate of Algorithm 1).
Under the Assumption on Algorithm 1 and, additionally, the assumption that , let , and be as in (51) and let denote the distance of to . Let also and be as in (25) and (39), respectively, and assume that for all .
Then, for all ,
[TABLE]
Proof.
Let be such that . Using Algorithm 1’s definition and Lemma A.2(a) with we find, for all ,
[TABLE]
On the other hand, Lemma 2.2 yields
[TABLE]
which, in turn, combined with (55) gives, for all ,
[TABLE]
where the sequence is as in (30). Summing the latter inequality over all and using (51) as well as the assumption , we obtain
[TABLE]
which combined with the definition of and (40) yields (recall that )
[TABLE]
where we have also used the inequality for all . Now, define
[TABLE]
From Corollary 2.6, the first definition in (57), (51), (23) and the convexity of we find
[TABLE]
[TABLE]
and
[TABLE]
From (58), (59), the convexity of and the inequality (for all ), we find
[TABLE]
Using the above inequality and (2) we obtain, for all ,
[TABLE]
Hence, (2), (58) with and , and (2) with yield
[TABLE]
which, combined with the assumption for all , clearly finishes the proof of (54).
Now note that using (23), (21) and the assumption we find
[TABLE]
Summing the above identity over and using (51) and (58) with and we find (recall that )
[TABLE]
which, combined with the assumption for all , yields (53). To finish the proof of the theorem, note that (52) is a direct consequence of the inclusion in (22) and Theorem 1.3(a). ∎
Remark. We mention that, to the best of the authors’ knowledge, this is the first time in the literature that global convergence rates are established for inertial PP-type algorithms.
2.1 On the under-relaxed inertial proximal point method
In this subsection, we analyze the convergence and iteration-complexity of the under-relaxed inertial proximal point (PP) method (see, e.g., [1, 4]) with constant under-relaxation (Algorithm 2) for solving (19). The analysis is performed by viewing Algorithm 2 within the framework of Algorithm 1, for which asymptotic convergence and iteration-complexity were obtained in Theorems 2.5, 2.7 and 2.8.
Algorithm 2.
Under-relaxed inertial proximal point method for solving (19)
Input: and and .
1:
for , do
2:
Choose and define
(62)
3:
Compute
(63)
4:
Define
(64)
Proposition 2.9.
Algorithm 2 is a special instance of Algorithm 1 with in the Input, in which case and for all .
Proof.
The proof follows from the well-known fact that if and only if and Algorithms 2 and 1’s definitions. ∎
Theorem 2.10 (convergence and iteration-complexity of Algorithm 2).
Under the Assumption (A) with on Algorithm 2, let , , and be generated by Algorithm 2 and let the ergodic sequences , and be as in (51). Let also be as in (39) and let denote the distance of to . Assume that for all . Then, the following statements hold:
- (a)
The sequence converges weakly to a solution of the monotone inclusion problem (19).
- (b)
For all , there exists such that
[TABLE]
- (c)
If, additionally, , then, for all ,
[TABLE]
Proof.
The results in (a), (b) and (c) follow directly from Proposition 2.11 and Theorems 2.5, 2.7 and 2.8. ∎
Proposition 2.11.
The following statements hold:
- (a)
Algorithm 2 is a special instance of Algorithm 1 with in the Input, in which case , and .
- (b)
The assertions of Theorems 2.5, 2.7 (Eqs. (48) and (49)) and 2.8 are valid, with , for Algorithm 2.
Proof.
(a) This follows from the well-known fact that if and only if and Algorithms 2 and 1’s definitions.
(b) This follows trivially from Item (a). Note that, since , it follows that in this case (50) is irrelevant. ∎
Proposition 2.12.
Let , and be generated by Algorithm 2 and define, for all ,
[TABLE]
The triple and satisfy condition (22) with and condition (23). As a consequence, it follows that Algorithm 2 is a special instance of Algorithm 1 with input .
Theorem 2.13 (weak convergence of Algorithm 2).
Let , and be generated by Algorithm 2 with input such that
[TABLE]
where
[TABLE]
Assume that is nondecreasing and for all . Then,
[TABLE]
As a consequence, under the above assumptions the sequence converges weakly to a solution of the monotone inclusion problem (19).
3 Inertial under-relaxed forward-backward and Tseng’s modified forward-backward methods
Consider the structured monotone inclusion problem (2), i.e., the problem of finding such that
[TABLE]
where is point-to-point monotone and is a (set-valued) maximal monotone operator for which (precise assumptions on and will be stated later).
In this section, we study the convergence and iteration-complexity of inertial (under-relaxed) versions of the forward-backward and Tseng’s modified forward-backward methods (13) and (16), respectively, for solving (73), by viewing them within the framework of Algorithm 1, for which asymptotic convergence and iteration-complexity were studied in Section 2.
3.1 An inertial under-relaxed Tseng’s modified forward-backward method
In this subsection, we consider the monotone inclusion problem (73) under the following assumptions:
- (C1)
is monotone and -Lipschitz continuous on a (nonempty) closed convex set such that , i.e., is monotone on and there exists such that
[TABLE]
- (C2)
is a (set-valued) maximal monotone operator on .
- (C3)
The solution set of (73) is nonempty.
We mention that it was proved in [29, Proposition A.1] that under assumptions – the operator defined in (73) is maximal monotone, which guarantees that (73) is a special instance of (19). In particular, it follows that Algorithm 1 can be used to solve the structured monotone inclusion (73).
As we mentioned earlier, in this subsection, we shall study the convergence and iteration-complexity of the following inertial under-relaxed version of the Tseng’s modified forward-backward method for solving (73).
Algorithm 3.
An inertial under-relaxed Tseng’s modified forward-backward method for solving (73)
Input: , , and .
1: for , do
2: Choose and define
3: Choose , let and compute
4: Define
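A Python sketch of an iteration of this type is given next. It assumes the steps take the usual form for Tseng's method with inertia and under-relaxation: w = z + alpha (z - z_prev), w' = projection of w onto the set in (C1), y_tilde = (lam B + I)^{-1}(w - lam F(w')), y_hat = y_tilde - lam (F(y_tilde) - F(w')), and z_next = (1 - tau) w + tau y_hat. The constant parameters, the function names, the choice of the whole space for the set in (C1) (so the projection is the identity), and the toy problem (a skew, 1-Lipschitz affine F and a box constraint) are all hypothetical and serve only as an illustration.

```python
from typing import Callable, List, Optional

Vector = List[float]


def inertial_relaxed_tseng(
    F: Callable[[Vector], Vector],
    resolvent_B: Callable[[Vector, float], Vector],
    z0: Vector,
    alpha: float = 0.2,   # assumed constant extrapolation parameter
    tau: float = 0.7,     # assumed constant under-relaxation parameter
    lam: float = 0.5,     # assumed constant stepsize, below 1/L with L = 1
    iters: int = 500,
    project_omega: Optional[Callable[[Vector], Vector]] = None,
) -> Vector:
    """Sketch of an inertial under-relaxed Tseng forward-backward iteration."""
    proj = project_omega if project_omega is not None else (lambda v: list(v))
    z_prev, z = list(z0), list(z0)  # start with z^{-1} = z^0
    for _ in range(iters):
        # Step 2 (assumed form): inertial extrapolation.
        w = [zi + alpha * (zi - zpi) for zi, zpi in zip(z, z_prev)]
        # Step 3 (assumed form): forward-backward step, then Tseng correction.
        Fw = F(proj(w))
        y_tilde = resolvent_B([wi - lam * fi for wi, fi in zip(w, Fw)], lam)
        Fy = F(y_tilde)
        y_hat = [yi - lam * (fy - fw) for yi, fy, fw in zip(y_tilde, Fy, Fw)]
        # Step 4 (assumed form): under-relaxed update.
        z_next = [(1.0 - tau) * wi + tau * yi for wi, yi in zip(w, y_hat)]
        z_prev, z = z, z_next
    return z


CENTER = (1.0, 2.0)  # hypothetical zero of F


def skew_F(z: Vector) -> Vector:
    # F(z) = M (z - CENTER) with M = [[0, 1], [-1, 0]]: monotone, 1-Lipschitz.
    return [z[1] - CENTER[1], -(z[0] - CENTER[0])]


def box_resolvent(v: Vector, lam: float) -> Vector:
    # Resolvent of the normal cone of the box [-10, 10]^2 (a projection).
    return [min(max(vi, -10.0), 10.0) for vi in v]


z_tseng = inertial_relaxed_tseng(skew_F, box_resolvent, [4.0, 0.5])
```

In this toy problem `CENTER` lies in the interior of the box, so it solves the inclusion, and the iterates spiral towards it. A skew operator is used because it is Lipschitz but not cocoercive, which is the situation where the Tseng correction step matters.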
Remarks.
- (i)
Algorithm 3 reduces to the Tseng’s modified forward-backward method [41] for solving (73) if and , in which case and .
- (ii)
An inertial Tseng’s modified forward-backward-type method (based on a different mechanism of iteration) was proposed and studied in [9]. The proposed Tseng’s modified forward-backward type method in the latter reference tends to suffer from similar limitations as the inertial HPE-type method proposed in [9], as we discussed in the third remark following Assumption . Moreover, in contrast to this paper which performs the iteration-complexity analysis of Algorithm 3 (see Theorem 3.2), [9] has focused on asymptotic convergence.
Since the proof of the next proposition follows the same outline as [30, Proposition 6.1], we omit it here.
Proposition 3.1.
Let , , , , and be generated by Algorithm 3 and define
[TABLE]
Then, the sequences , , , , , and satisfy the conditions (21)–(23) in Algorithm 1. As a consequence, it follows that Algorithm 3 is a special instance of Algorithm 1 for solving (73).
Next we present the convergence and iteration-complexity of Algorithm 3 under the Assumption on the Input and on the sequence . We also mention that the observations regarding the parameter in the third remark following Assumption obviously apply to Algorithm 3.
Theorem 3.2 (convergence and iteration-complexity of Algorithm 3).
Under the Assumption on and , let , and be generated by Algorithm 3, let and be as in (74) and let the ergodic sequences , and be as in (51). Let also and be as in (25) and (39), respectively, let denote the distance of to and assume that for all . Then, the following statements hold:
- (a)
The sequence converges weakly to a solution of the monotone inclusion problem (73).
- (b)
For all , there exists such that
[TABLE]
- (c)
If, additionally, , then, for all ,
[TABLE]
Proof.
The proof follows directly from Proposition 3.1 and Theorems 2.5, 2.7 and 2.8. ∎
Remarks.
- (i)
Items (b) and (c) ensure, respectively, global pointwise and ergodic convergence rates for Algorithm 3. On the other hand, note that the inclusion in (76) is potentially weaker than the corresponding one in (75).
- (ii)
If in Step 3 of Algorithm 3, in which case , then in (75) and (76) can be replaced by . In this case, Item (b) gives that for a given tolerance , Algorithm 3 finds a pair such that (cf. (20))
[TABLE]
in at most
[TABLE]
iterations, an analogous remark also holding for Item (c).
3.2 On the inertial under-relaxed forward-backward method
Similarly to Subsection 3.1, in this subsection, we consider the monotone inclusion problem (73) but now assume the following: (C2) and (C3) as in Subsection 3.1 and instead of (C1):
- (C1′)
is cocoercive, i.e., there exists such that
[TABLE]
We observe that it follows from (77) that is, in particular, –Lipschitz continuous.
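The step from cocoercivity to Lipschitz continuity is a one-line Cauchy-Schwarz argument. The derivation below assumes that (77) has the standard form with cocoercivity constant $1/L$ (the symbols $F$, $L$, $z$ and $z'$ are our notation for this sketch):

```latex
% Assumed form of (77): F is (1/L)-cocoercive, i.e.
%   <F(z) - F(z'), z - z'>  >=  (1/L) ||F(z) - F(z')||^2   for all z, z'.
\begin{align*}
\frac{1}{L}\,\|F(z)-F(z')\|^{2}
  &\le \langle F(z)-F(z'),\, z-z' \rangle
     && \text{(cocoercivity)}\\
  &\le \|F(z)-F(z')\|\,\|z-z'\|
     && \text{(Cauchy--Schwarz)}.
\end{align*}
% If F(z) = F(z') there is nothing to prove; otherwise divide by
% ||F(z) - F(z')|| / L to obtain
\[
\|F(z)-F(z')\| \le L\,\|z-z'\| .
\]
```

The converse fails in general: a skew linear operator is Lipschitz and monotone but not cocoercive, which is why Subsection 3.1 treats the Lipschitz case separately.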
Algorithm 4.
Inertial under-relaxed forward-backward method for solving (73)
Input: , , and .
1: for , do
2: Choose and define
3: Choose and compute
4: Define
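A Python sketch of an iteration of this type follows. It assumes the steps take the usual form w = z + alpha (z - z_prev), y = (lam B + I)^{-1}(w - lam F(w)) and z_next = (1 - tau) w + tau y, with constant parameters; the stepsize `lam` is assumed to respect whatever bound Step 3 imposes. The function names and the toy problem (F the gradient of a quadratic, B the subdifferential of the l1 norm) are hypothetical and used only for illustration.

```python
from typing import Callable, List

Vector = List[float]


def inertial_relaxed_forward_backward(
    F: Callable[[Vector], Vector],
    resolvent_B: Callable[[Vector, float], Vector],
    z0: Vector,
    alpha: float = 0.2,   # assumed constant extrapolation parameter
    tau: float = 0.7,     # assumed constant under-relaxation parameter
    lam: float = 0.5,     # assumed constant stepsize
    iters: int = 300,
) -> Vector:
    """Sketch of an inertial under-relaxed forward-backward iteration."""
    z_prev, z = list(z0), list(z0)  # start with z^{-1} = z^0
    for _ in range(iters):
        # Step 2 (assumed form): inertial extrapolation.
        w = [zi + alpha * (zi - zpi) for zi, zpi in zip(z, z_prev)]
        # Step 3 (assumed form): forward step on F, backward step on B.
        Fw = F(w)
        y = resolvent_B([wi - lam * fi for wi, fi in zip(w, Fw)], lam)
        # Step 4 (assumed form): under-relaxed update.
        z_next = [(1.0 - tau) * wi + tau * yi for wi, yi in zip(w, y)]
        z_prev, z = z, z_next
    return z


TARGET = [3.0, -0.5]  # hypothetical data for the toy problem


def grad_quadratic(z: Vector) -> Vector:
    # F(z) = z - TARGET, the gradient of 0.5*||z - TARGET||^2 (1-cocoercive).
    return [zi - bi for zi, bi in zip(z, TARGET)]


def l1_resolvent(v: Vector, lam: float) -> Vector:
    # Resolvent of lam * (subdifferential of the l1 norm): soft-thresholding.
    return [(1.0 if vi > 0 else -1.0) * max(abs(vi) - lam, 0.0) for vi in v]


# The solution of 0 in (z - TARGET) + subdifferential of ||z||_1 is the
# soft-thresholding of TARGET at level 1, namely [2.0, 0.0].
z_fb = inertial_relaxed_forward_backward(grad_quadratic, l1_resolvent, [0.0, 1.0])
```

Compared with the Tseng-type sketch, each iteration evaluates F only once, which is the practical benefit of the stronger cocoercivity assumption.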
Remarks.
- (i)
If and , then Algorithm 4 reduces to the forward-backward method [22, 33] for solving (73).
- (ii)
Inertial versions of the forward-backward method were previously proposed and studied in [32], [23] and [3]. Asymptotic convergence of the forward-backward method proposed in [23] was proved in the latter reference, in particular, under the assumption: , for all , and
[TABLE]
for some , where and ( in the notation of the present paper). The apparent limitation of this approach is that if , i.e., the inertial effect degenerates for large values of the stepsize (see Fig. 1 in [23]). This contrasts with the approach proposed in this paper, where the under-relaxation parameter is crucial to allowing sufficiently close to 1, even for large stepsize values, i.e., when (see Assumption (A) and part of the discussion in the third remark following it).
- (iii)
Algorithm 4 is a special instance (with constant relaxation) of the RIFB algorithm in [3]. We refer the reader to [3] (see, e.g., Theorems 3.8 and 3.15, and Remark 3.13) for a comprehensive discussion of the interplay and benefits of inertia and relaxation.
The next proposition shows that Algorithm 4 is also a special instance of Algorithm 1 for solving (73). Since the proof follows the same outline as [39, Proposition 5.3], we omit it here too.
Proposition 3.3.
Let , , and be generated by Algorithm 4, let be as in (73) and define, for all ,
[TABLE]
Then, the following hold for all :
[TABLE]
As a consequence of (79) and Algorithm 4’s definition, it follows that Algorithm 4 is a special instance of Algorithm 1 for solving (73).
We finish this section by presenting the convergence and iteration-complexity of Algorithm 4, which are a direct consequence of Proposition 3.3 and Theorems 2.5, 2.7 and 2.8. We mention that analogous remarks to those made in the Remarks following Theorem 3.2 also apply here.
Theorem 3.4 (convergence and iteration-complexity of Algorithm 4).
Under the Assumption on and , let , and be generated by Algorithm 4, let and be as in (78) and let the ergodic sequences , and be as in (51). Let also and be as in (25) and (39), respectively, let denote the distance of to and assume that for all . Then, the following statements hold:
- (a)
The sequence converges weakly to a solution of the monotone inclusion problem (73).
- (b)
For all , there exists such that
[TABLE]
- (c)
If, additionally, , then, for all ,
[TABLE]
4 Concluding remarks
In this paper, we proposed and studied the asymptotic convergence and iteration-complexity of an inertial under-relaxed HPE-type method. As applications, we proposed and/or studied inertial (under-relaxed) versions of Tseng’s modified forward-backward and forward-backward methods for solving structured monotone inclusion problems with either Lipschitz continuous or cocoercive operators. All the proposed and/or studied algorithms, namely Algorithms 1, 2, 3 and 4, potentially benefit from a specific policy for choosing the upper bound on the sequence of extrapolation parameters, in which case (under) relaxation plays a central role (see Assumption and Theorems 2.7, 2.8, 3.2 and 3.4); see also the recent work [3] of Attouch and Cabot. We also emphasize that, to the best of the authors' knowledge, this is the first time in the literature that nonasymptotic global convergence rates (iteration-complexity) are provided for inertial HPE-type methods, in particular for the proposed inertial Tseng’s modified forward-backward method.
Appendix A Auxiliary results
Lemma A.1.
([39, Lemma 2.2])
Let be –cocoercive, for some , and let . Then,
[TABLE]
The next lemma was proved in [40, Lemma 2.1]. Here, we present a short and direct proof for the convenience of the reader.
Lemma A.2 (Svaiter).
Let and , and be such that
[TABLE]
Let and define . Then, the following hold:
- (a)
For any ,
[TABLE]
- (b)
For any ,
[TABLE]
Proof.
(a) Using the inequality in (82) and some algebraic manipulations we find, for any ,
[TABLE]
The fact that and (18) yield
[TABLE]
Multiplying (83) by and using the latter identity we obtain the desired inequality in (a).
(b) This is a direct consequence of Item (a), (17), the inclusion in (82) and the fact that . ∎
Lemma A.3.
For any , the inverse function of the scalar map
[TABLE]
is given by
[TABLE]
Lemma A.4 (Opial).
Let and be a sequence in such that exist for every . If every (sequential) weak cluster point of belongs to , then converges weakly to a point in .
The following lemma was essentially proved by Alvarez and Attouch in [2, Theorem 2.1].
Lemma A.5.
Let the sequences , , and in and be such that , and
[TABLE]
The following hold:
- (a)
For all ,
[TABLE]
- (b)
If , then exists, i.e., the sequence converges to some element in .
Proof.
It was proved in [2, Theorem 2.1] that , where Using this, the assumptions , and (84), and some algebraic manipulations we find
[TABLE]
which proves (a). To finish the proof of the lemma, note that (b) was proved inside the proof of [2, Theorem 2.1]. ∎
- [1] F. Alvarez. Weak convergence of a relaxed and inertial hybrid projection-proximal point algorithm for maximal monotone operators in Hilbert space. SIAM J. Optim., 14(3):773–782, 2003.
- [2] F. Alvarez and H. Attouch. An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal., 9(1-2):3–11, 2001. Wellposedness in optimization and related topics (Gargnano, 1999).
- [3] H. Attouch and A. Cabot. Convergence of a relaxed inertial forward-backward algorithm for structured monotone inclusions. Preprint hal-01708216, 2018.
- [4] H. Attouch and A. Cabot. Convergence of a relaxed inertial proximal algorithm for maximally monotone operators. Preprint hal-01708905, 2018.
- [5] H. Attouch, Z. Chbani, J. Peypouquet, and P. Redont. Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program., 168(1-2, Ser. B):123–175, 2018.
- [6] H. Attouch and J. Peypouquet. The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than $1/k^{2}$. SIAM J. Optim., 26(3):1824–1834, 2016.
- [7] H. Attouch, J. Peypouquet, and P. Redont. Fast convex optimization via inertial dynamics with Hessian driven damping. J. Differential Equations, 261(10):5734–5783, 2016.
- [8] H. H. Bauschke and P. L. Combettes. Convex analysis and monotone operator theory in Hilbert spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC. Springer, New York, 2011. With a foreword by Hédy Attouch.
