Pointwise and ergodic convergence rates of a variable metric proximal ADMM
Max L.N. Goncalves, Jefferson G. Melo, M. Marques Alves

TL;DR
This paper establishes global pointwise and ergodic convergence rates for a variable metric proximal ADMM (VM-PADMM) applied to linearly constrained convex optimization problems; according to the authors, the pointwise rates are the first obtained for this method.
Contribution
It introduces a novel convergence analysis for VM-PADMM, including nonasymptotic rates, by linking it to a new VM-HPE framework for monotone inclusions.
Findings
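Achieves an $\mathcal{O}(1/\sqrt{k})$ pointwise convergence rate.
Achieves an $\mathcal{O}(1/k)$ ergodic convergence rate.
To the authors' knowledge, these are the first global pointwise rates for the VM-PADMM and the first pointwise and ergodic rates for the VM-HPE framework.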
Abstract
In this paper, we obtain global pointwise and ergodic convergence rates for a variable metric proximal alternating direction method of multipliers (VM-PADMM) for solving linearly constrained convex optimization problems. The VM-PADMM can be seen as a class of ADMM variants, allowing the use of degenerate metrics (defined by noninvertible linear operators). We first propose and study nonasymptotic convergence rates of a variable metric hybrid proximal extragradient (VM-HPE) framework for solving monotone inclusions. Then, the above-mentioned convergence rates for the VM-PADMM are obtained essentially by showing that it falls within the latter framework. To the best of our knowledge, this is the first time that global pointwise (resp. pointwise and ergodic) convergence rates are obtained for the VM-PADMM (resp. VM-HPE framework).
M.L.N. Gonçalves
Instituto de Matemática e Estatística, Universidade Federal de Goiás, Campus II, Caixa Postal 131, CEP 74001-970, Goiânia-GO, Brazil. The work of M.L.N. Gonçalves and J.G. Melo was supported in part by CNPq Grants 406250/2013-8, 444134/2014-0 and 309370/2014-0.
M. Marques Alves
Departamento de Matemática, Universidade Federal de Santa Catarina, Florianópolis, Brazil, 88040-900. The work of this author was partially supported by CNPq grants no. 406250/2013-8, 306317/2014-1 and 405214/2016-2.
J.G. Melo
Instituto de Matemática e Estatística, Universidade Federal de Goiás (same affiliation as M.L.N. Gonçalves).
(May 4, 2017)
2000 Mathematics Subject Classification: 90C25, 90C60, 49M27, 47H05, 47J22, 65K10.
Key words: alternating direction method of multipliers, variable metric, pointwise and ergodic convergence rates, hybrid proximal extragradient method, convex program.
1 Introduction
We consider the linearly constrained convex optimization problem
[TABLE]
where and are extended-real-valued proper closed and convex functions, and are finite-dimensional real vector spaces, and and are linear operators. One of the most popular methods for solving (1) is the alternating direction method of multipliers (ADMM) [4, 14, 15], for which many variants have been proposed and studied in the literature; see, e.g., [1, 3, 7, 9, 10, 11, 12, 13, 17, 18, 19, 21, 25, 31].
In this paper, we obtain global ergodic and pointwise convergence rates for a variable metric proximal ADMM (VM-PADMM) which can be described as follows: given an initial point and a stepsize , compute a sequence , recursively, by
[TABLE]
where , and are selfadjoint linear operators such that is positive definite and and are positive semidefinite, and , etc. We start by reviewing some existing methods and works related to the above method.
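To make the recursion concrete, the following minimal sketch runs a scalar instance under stated assumptions. The displayed updates (2)–(4) are not reproduced in this text, so the code assumes the usual proximal ADMM form: an x-update and a y-update that each minimize the augmented Lagrangian plus a proximal term, followed by a multiplier update with a stepsize. The toy problem, the names `beta`, `r`, `s`, `theta`, and the schedule chosen for `beta` are illustrative and not taken from the paper.

```python
def vm_padmm_scalar(a, c, b, theta=1.0, r=0.1, s=0.1, iters=300):
    """Toy variable metric proximal ADMM for
        min 0.5*(x - a)**2 + 0.5*(y - c)**2   subject to   x + y = b.
    Assumed (standard) update form; every metric is a scalar here."""
    x = y = lam = 0.0
    for k in range(1, iters + 1):
        beta = 1.0 + 1.0 / k**2  # hypothetical penalty metric, slowly varying
        # x-update: minimize f(x) - lam*x + beta/2*(x + y - b)^2 + r/2*(x - x_prev)^2
        x = (a + lam - beta * (y - b) + r * x) / (1.0 + beta + r)
        # y-update: same structure, using the freshly computed x
        y = (c + lam - beta * (x - b) + s * y) / (1.0 + beta + s)
        # multiplier update with stepsize theta
        lam = lam - theta * beta * (x + y - b)
    return x, y, lam


x, y, lam = vm_padmm_scalar(a=1.0, c=2.0, b=5.0)
print(x, y, lam)
```

For this instance the optimality conditions (x - a = lam, y - c = lam, x + y = b) give x = 2, y = 3 and multiplier 1, which the iterates should approach. Setting `r = s = 0` and a constant `beta` recovers the standard ADMM on the same problem.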
VM-PADMM and some variants. The VM-PADMM (2)–(4) can be seen as a class of ADMM variants, depending on the choices of the linear operators , and . Namely,
- by taking with , , and , it reduces to the standard ADMM, whose ergodic convergence rate was established in [30];
- the ADMM in [21] (related to the Uzawa method [38]) consists of taking with , constant, and . Pointwise and ergodic convergence rates for this variant were obtained in [21, 22];
- the proximal ADMM consists of choosing with , and constant. This method has been studied by many authors; see, for instance, [8, 10, 16, 35], where convergence rates are analyzed;
- by choosing , and , it corresponds to a variable penalty parameter ADMM, for which asymptotic convergence analysis was considered in [20, 23, 36];
- the VM-PADMM (2)–(4) with and positive definite is closely related to the method studied in [19, 26] for solving (point-to-point) continuous monotone variational inequality problems (in the setting of problem (1), it requires and to be continuously differentiable). We mention that, contrary to our analysis, the latter references consider the stepsize and do not present nonasymptotic convergence rates;
- by letting , , and , the resulting method becomes similar to Algorithm 7 in [2], where a composite structure of is considered and ergodic convergence rates were obtained under the additional conditions that in (1) and the dual solution set of (1) be bounded.
Contributions of the paper. We obtain an $\mathcal{O}(1/k)$ global convergence rate for an ergodic sequence associated with the VM-PADMM (2)–(4) with , which provides, for given tolerances , triples , and scalars such that
[TABLE]
in at most iterations, where and denote dual seminorms associated to the linear operators and , and is a scalar measuring the quality of the initial point. Moreover, we establish an $\mathcal{O}(1/\sqrt{k})$ pointwise convergence rate in which the inclusions in (5) are strengthened, in the sense that , and the bound on the number of iterations becomes . Our study is done by first establishing global ergodic and pointwise convergence rates for a variable metric hybrid proximal extragradient (VM-HPE) framework for finding zeroes of maximal monotone operators, and then by showing that the VM-PADMM (2)–(4) can be seen as an instance of the latter framework. To the best of our knowledge, this is the first time that global pointwise (resp. pointwise and ergodic) convergence rates are obtained for the VM-PADMM (2)–(4) (resp. VM-HPE framework). Besides, our analysis allows degenerate metrics (induced by positive semidefinite linear operators), which makes the VM-PADMM (2)–(4) (and the VM-HPE framework) more suitable for applications. We next briefly review some works related to the VM-HPE framework.
VM-HPE type frameworks. The VM-HPE framework proposed in this work is a generalization of a special instance of the HPE framework [37] allowing variations in the metric (induced by positive semidefinite linear operators) along the iterations. The iteration complexity of the HPE framework was first analyzed in [28] and subsequently applied to the study of several methods; see, for example, [24, 27, 29, 30]. An inexact variable metric proximal point type method was proposed in [32] but, contrary to our VM-HPE framework, it demands the metrics to be nondegenerate (induced by invertible linear operators). Moreover, the convergence analysis presented in [32] does not include nonasymptotic convergence rates.
Outline of the paper. Subsection 1.1 presents our notation and basic results. Section 2 introduces the VM-HPE framework and presents its nonasymptotic pointwise and ergodic convergence rates, whose proofs are postponed to Appendix A. Section 3 contains two subsections. In Subsection 3.1, we formally state the VM-PADMM (2)–(4) and present its nonasymptotic pointwise and ergodic convergence rates. In Subsection 3.2, we obtain the convergence rates of the VM-PADMM by viewing it as an instance of the VM-HPE framework.
1.1 Basic results and notation
Let be a finite-dimensional real vector space with inner product and induced norm . Denote by (resp. ) the space of selfadjoint positive semidefinite (resp. definite) linear operators on . Each element induces a symmetric bilinear form on and a seminorm on . Since is symmetric and bilinear, the following hold, for all ,
[TABLE]
Moreover, each also induces an (extended) dual seminorm on defined by
[TABLE]
On the other hand, each induces an inner product and a norm on , etc.
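The difference between a seminorm induced by a positive semidefinite operator and a genuine norm can be seen in a small numerical sketch. The matrix `M`, the vectors, and the helper names `bilinear` and `seminorm` are hypothetical; since properties (6)–(7) are not reproduced in this text, the check below uses only the Cauchy-Schwarz inequality, which is a standard fact for any symmetric positive semidefinite bilinear form.

```python
import math


def bilinear(M, u, v):
    """<u, v>_M = <M u, v> for a symmetric matrix M given as nested lists."""
    Mu = [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]
    return sum(Mu[i] * v[i] for i in range(len(v)))


def seminorm(M, u):
    """||u||_M = sqrt(<M u, u>), well defined when M is positive semidefinite."""
    return math.sqrt(max(bilinear(M, u, u), 0.0))


M = [[1.0, 0.0], [0.0, 0.0]]  # positive semidefinite but not invertible
u, v = [3.0, 4.0], [1.0, -2.0]

# A nonzero vector in the kernel of M has zero seminorm.
print(seminorm(M, [0.0, 5.0]))

# Cauchy-Schwarz for the semi-inner product induced by M.
assert abs(bilinear(M, u, v)) <= seminorm(M, u) * seminorm(M, v) + 1e-12
```

Because `M` is degenerate, `seminorm` cannot separate points that differ only along the kernel of `M`, which is why the text distinguishes seminorms (semidefinite case) from norms (definite case).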
The next two propositions, whose proofs are omitted, will be useful in this paper.
Proposition 1.1.
For every , we have and , where denotes the range of .
Let the partial order on be defined by
[TABLE]
Proposition 1.2.
Let and . If , then
[TABLE]
A set-valued mapping is said to be monotone if
[TABLE]
Moreover, is maximal monotone if it is monotone and, additionally, if is a monotone operator such that for every then . The inverse operator of is given by . Given , the -enlargement of a set-valued mapping is defined as
[TABLE]
Recall that the -subdifferential of a convex function is defined by for every . When , then is denoted by and is called the subdifferential of at . The operator is trivially monotone if is proper. If is a proper closed and convex function, then is also maximal monotone [34].
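For reference, the standard definitions that the two preceding paragraphs appear to rely on can be written as follows. The displayed formulas are not reproduced in this text, so the symbols ($T$, $f$, $z$, $v$, $\varepsilon$) are assumed names and the forms below are the usual textbook ones, not a transcription of the paper's equations.

```latex
T^{\varepsilon}(z) := \{\, v : \langle v - v',\, z - z' \rangle \ge -\varepsilon
    \quad \forall\, z',\ \forall\, v' \in T(z') \,\},
\qquad
\partial_{\varepsilon} f(z) := \{\, v : f(z') \ge f(z) + \langle v,\, z' - z \rangle - \varepsilon
    \quad \forall\, z' \,\}.
```

With $\varepsilon = 0$ the second set is the ordinary subdifferential, and for a maximal monotone $T$ the first set reduces to $T(z)$.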
The following result is a particular case of the weak transportation formula in [6, Theorem 2.3] combined with [5, Proposition 2(i)].
Theorem 1.3.
Suppose is maximal monotone and let , for , be such that and define
[TABLE]
Then, the following hold:
- (a) and ;
- (b) if, in addition, for some proper closed and convex function , then .
2 A variable metric HPE framework
Consider the monotone inclusion problem
[TABLE]
where is a finite-dimensional inner product real vector space and is maximal monotone. Assume that the solution set of (9) is nonempty.
In this section, we propose a variable metric hybrid proximal extragradient (VM-HPE) framework for solving (9) and analyze its nonasymptotic convergence rates. The proposed framework finds its roots in the hybrid proximal extragradient (HPE) framework of [37], for which the iteration complexity was recently obtained in [28]. Our main results on pointwise and ergodic convergence rates for the VM-HPE framework are presented in Theorems 2.2 and 2.3, respectively. In Section 3, we will show how the VM-HPE framework can be used to analyze the nonasymptotic convergence of a VM-PADMM for solving linearly constrained convex optimization problems.
We begin by stating the VM-HPE framework.
A variable metric hybrid proximal extragradient (VM-HPE) framework
(0)
Let , and be given, and set .
(1)
Choose and find such that
(10)
(11)
(2)
Set and go to step 1.
end
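Conditions (10)–(11) are not reproduced in this text, so the sketch below covers only the exact special case that HPE-type frameworks are generally understood to contain: a proximal point step in a varying metric, where the inclusion holds with equality and the error terms vanish. The operator (a rotation by 90 degrees, which is monotone), the diagonal metrics, and the function name `vm_prox_step` are all hypothetical choices made for illustration.

```python
def vm_prox_step(z_prev, m1, m2):
    """Exact proximal step in the metric M = diag(m1, m2) for T(z) = J z,
    where J = [[0, 1], [-1, 0]] is skew-symmetric (hence monotone).
    Solves M (z_prev - z) = J z, i.e. the 2x2 system (M + J) z = M z_prev."""
    rhs = (m1 * z_prev[0], m2 * z_prev[1])
    det = m1 * m2 + 1.0  # determinant of M + J = [[m1, 1], [-1, m2]]
    z0 = (m2 * rhs[0] - rhs[1]) / det
    z1 = (rhs[0] + m1 * rhs[1]) / det
    return (z0, z1)


z = (1.0, 1.0)
for k in range(1, 201):
    m1 = 1.0 + 1.0 / k**2  # hypothetical metric sequence, slowly varying
    m2 = 2.0
    z = vm_prox_step(z, m1, m2)
print(z)
```

The only zero of this operator is the origin, and the iterates should approach it. An inexact framework would replace the exact linear solve by any step that meets a relative error criterion.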
Remarks. 1) Letting and in (10) and (11), respectively, we find that the sequences , and satisfy
[TABLE]
which is to say that in this case the VM-HPE framework reduces to a special case of the HPE framework (see p. 2763 in [28]) with (in the notation of [28]) or, in other words, the VM-HPE framework is a generalization of a special case of the HPE framework in which variations in the metric are allowed along the iterations. 2) If the sequence is taken to be constant, then the VM-HPE framework reduces to a special case of the NE-HPE framework studied in [16]. 3) We also mention that a variable metric inexact proximal point method with relative error tolerance was proposed in [32] but, contrary to our framework, the method of [32] requires every operator to be positive definite. Moreover, the convergence analysis presented in [32] does not include nonasymptotic convergence rates. The fact that the VM-HPE framework allows positive semidefinite operators will be crucial for viewing the VM-PADMM of Section 3 as a special instance of it.
From now on in this section, we assume the following condition to hold:
Assumption 2.1.
For the sequence generated by the VM-HPE framework, there exist , and, for each , such that and satisfy
[TABLE]
Remark. The above assumption (which is similar to condition (1.4) in [32]) is satisfied, for instance, if the sequence is taken to be constant and , in which case one can choose .
It is easy to check that Assumption 2.1 implies the existence of a constant such that and satisfy
[TABLE]
In the remaining part of this section, we present pointwise and ergodic convergence rates for the VM-HPE framework. These results will depend on the quantity:
[TABLE]
which measures the “quality” of the initial guess in the VM-HPE framework with respect to the solution set .
For technical reasons and for the convenience of the reader, the proofs of the next two theorems will be given in Appendix A.
Theorem 2.2.
(Pointwise convergence rate of the VM-HPE framework)
Let , and be generated by the VM-HPE framework. Let also and be as in (13) and (14), respectively. Then, for every , and there exists such that
[TABLE]
Remarks. 1) If in Assumption 2.1 (in which case ), then the upper bound in (15) with and reduces essentially to a special case of [16, Theorem 3.3(a)] (with and ). Additionally, if and , then the bound (15) becomes similar to the corresponding one in [28, Theorem 4.4(a)]. 2) For a given tolerance , Theorem 2.2 ensures that there exists an index
[TABLE]
such that
[TABLE]
In this case, can be interpreted as a -approximate solution of (9) with residual (see, e.g., [28] for the definition of a related concept). 3) Although may not be invertible, criterion (17) makes sense due to the fact that belongs to the image of (see (10)). Indeed, if , then (10) and Proposition 1.1 imply that , and hence it follows from (17) that is a solution of problem (9).
Before presenting the ergodic convergence of the VM-HPE framework, let us define the ergodic sequences , and associated to and as follows:
[TABLE]
Theorem 2.3.
(Ergodic convergence rate of the VM-HPE framework)
Let , and be given as in (18) and be generated by the VM-HPE framework. Let also , and be as in (12), (13) and (14), respectively. Then, for every , we have and
[TABLE]
where and .
Remarks.
- Similarly to the first remark after Theorem 2.2, Theorem 2.3 is also related to [16, Theorem 3.4] and [28, Theorem 4.7].
- For given tolerances , Theorem 2.3 ensures that in at most
[TABLE]
iterations there hold
[TABLE]
Note that (21), in terms of the dependence on , is better than the bound in (16) by a factor of but, on the other hand, since can be strictly positive, the inclusion in (22) is potentially weaker than the one in (17).
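The trade-off can be illustrated numerically. The sketch assumes that the pointwise iteration bound scales like C/ρ² and the ergodic bound like C/ρ for a tolerance ρ, which is what rates of order 1/√k and 1/k imply; the constant `C` and the function names are hypothetical stand-ins for the problem-dependent quantities in (16) and (21).

```python
import math


def pointwise_iters(rho, C=1.0):
    """Iterations suggested by a bound of the form C / rho**2."""
    return math.ceil(C / rho**2)


def ergodic_iters(rho, C=1.0):
    """Iterations suggested by a bound of the form C / rho."""
    return math.ceil(C / rho)


for rho in (1e-1, 1e-2, 1e-3):
    print(rho, pointwise_iters(rho), ergodic_iters(rho))
```

The gap between the two counts widens as the tolerance shrinks, while the ergodic guarantee comes with an enlarged inclusion, as noted above.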
3 A variable metric proximal alternating direction method of multipliers
This section contains two subsections. In Subsection 3.1, we formally state the VM-PADMM (2)–(4) and present its nonasymptotic convergence rates. The main results are Theorems 3.2 and 3.3 in which pointwise and ergodic convergence rates are obtained, respectively. The proofs of the latter theorems are discussed separately in Subsection 3.2 by viewing the method as an instance of the VM-HPE framework and by applying the results of Section 2.
3.1 VM-PADMM and its convergence rates
Let , and be finite-dimensional real inner product vector spaces. Consider the convex optimization problem (1), i.e.,
[TABLE]
where the following assumptions hold:
- (O1) and are proper closed and convex functions;
- (O2) and are linear operators and ;
- (O3) the solution set of (23) is nonempty.
Under the above assumptions and standard constraint qualifications (see, e.g., [33, Corollaries 28.2.2 and 28.3.1]), a vector is a solution of (23) if and only if there exists a (Lagrange multiplier) such that is a solution of
[TABLE]
Motivated by the above statement, we define
[TABLE]
which is assumed to be nonempty.
The convergence rates of the VM-PADMM (stated below) for solving (23) will be obtained by viewing the optimization problem (23) as the monotone inclusion (24), which is associated to a certain maximal monotone operator (see (48)) in , and by applying the results of the previous section.
Variable metric proximal alternating direction method of multipliers (VM-PADMM).
(0)
Let and be given, and set .
(1)
Choose , and and compute an optimal solution of the subproblem
(26)
and compute an optimal solution of the subproblem
(27)
(2)
Set
(28)
, and go to step (1).
end
Remarks. 1) As already mentioned in Section 1, the VM-PADMM can be regarded as a class of ADMM instances, allowing a unified study of different variants of ADMM. 2) A usual choice for the linear operator is , where plays the role of a penalty parameter. 3) The proximal terms in (26) and (27) defined by and , respectively, may play different roles. Namely, they can be used to regularize the subproblems in (26) and (27), making them strongly convex (when and are positive definite operators) and hence admitting unique solutions. Moreover, by a careful choice of these operators, subproblems (26) and (27) may become much easier to solve; for instance, if , then with and with eliminate the presence of quadratic forms associated to and in (26) and (27), respectively.
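The simplification mentioned in the last remark can be checked by a short expansion. The symbols below are assumptions, since the inline formulas are not reproduced in this text: the penalty operator is taken as $\beta I$ with $\beta > 0$, the proximal operator in the first subproblem as $R_k = \tau I - \beta A^*A$ with $\tau \ge \beta\|A\|^2$ (so that $R_k$ is positive semidefinite), and $x_{k-1}, y_{k-1}$ denote the previous iterates.

```latex
\begin{aligned}
&\frac{\beta}{2}\,\|Ax + By_{k-1} - b\|^2 + \frac{1}{2}\,\|x - x_{k-1}\|_{R_k}^2 \\
&\quad= \frac{\beta}{2}\,\|Ax + By_{k-1} - b\|^2 + \frac{\tau}{2}\,\|x - x_{k-1}\|^2
        - \frac{\beta}{2}\,\|A(x - x_{k-1})\|^2 \\
&\quad= \frac{\tau}{2}\,\|x - x_{k-1}\|^2
        + \beta\,\big\langle A^*(Ax_{k-1} + By_{k-1} - b),\, x \big\rangle + \text{const}.
\end{aligned}
```

Under these assumptions the term quadratic in $Ax$ cancels, so the first subproblem reduces to evaluating a proximal-type map of the first objective function with parameter $1/\tau$. The same computation applies to the second subproblem with $B$ in place of $A$.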
From now on in this section, the following conditions are assumed to hold:
Assumption 3.1.
For the sequences , and generated by the VM-PADMM, there exist , , , and, for each , such that , , and satisfy
[TABLE]
Analogously to condition (13), Assumption 3.1 implies the existence of such that satisfies
[TABLE]
We mention that Assumption 3.1 is similar to Condition C in [19] but, contrary to the latter reference, none of the operators and is assumed to be positive definite.
Similarly to the previous section, the following quantity will be needed:
[TABLE]
where and are given in Step (0) of the VM-PADMM, , and are given in Assumption 3.1, and is defined in (25).
Next we present the two main results of this paper, whose proofs are given in Subsection 3.2.
Theorem 3.2.
(Pointwise convergence rate of the VM-PADMM)
Let , , and be generated by the VM-PADMM and let
[TABLE]
Let also and be as in (30) and (31), respectively. Then, there exists a parameter such that, for all ,
[TABLE]
and, for some ,
[TABLE]
where .
Remark. For a given tolerance , Theorem 3.2 guarantees the existence of triples , and operators , and (generated by the VM-PADMM) such that
[TABLE]
in at most
[TABLE]
iterations, where and are as in (30) and (31), respectively. The triple in (35) can be seen as a -approximate solution of the KKT system (24) with residual .
Before proceeding to present the ergodic convergence of the VM-PADMM we need to introduce its associated ergodic sequences. Let be generated by the VM-PADMM, let and be defined as in (32) and (33), respectively, and let the ergodic sequences associated to them be defined by
[TABLE]
Theorem 3.3.
(Ergodic convergence rate of the VM-PADMM)
Let , and be generated by the VM-PADMM and let , , and be the ergodic sequences defined as in (37)–(39). Let also , , and be as in (29), (30) and (31), respectively. Then, there exists a parameter such that, for all , there hold ,
[TABLE]
and
[TABLE]
where and are as in Theorem 2.3 with and is as in Theorem 3.2.
Remark. Given tolerances , Theorem 3.3 guarantees that there exist scalars , triples , and operators , and (generated by the VM-PADMM) such that
[TABLE]
in at most
[TABLE]
iterations, where and are as in Assumption 3.1, (30) and (31), respectively. Note that while the dependence on the tolerance in (44) is better than the corresponding one in (36) by a factor of , the inclusions in (43) are potentially weaker than the corresponding ones in (35). The triple in (43) can be seen as a -approximate solution of the KKT system (24) with residual .
3.2 Proof of Theorems 3.2 and 3.3
The main goal of this subsection is to prove Theorems 3.2 and 3.3 by viewing the VM-PADMM as an instance of the VM-HPE framework of Section 2 for solving (9) with defined by
[TABLE]
where is endowed with the usual inner product of vectors :
[TABLE]
The desired results will then follow essentially from Theorems 2.2 and 2.3, and from the identity
[TABLE]
where and are the solution sets defined in (9) and (25), respectively. The following linear operators will be needed in our analysis:
[TABLE]
where , and are generated by the VM-PADMM and , , are given in Assumption 3.1.
We begin by presenting a preliminary technical result.
Proposition 3.4.
Let be generated by the VM-PADMM and let be defined as in (32). Let also be defined as in (54). Then,
[TABLE]
Proof.
From the first order optimality conditions for (26) and (27), we obtain, respectively,
[TABLE]
which, combined with (32), yields
[TABLE]
On the other hand, (28) (and the assumption ) gives
[TABLE]
Using (54), (56) and (57) we obtain (55). ∎
The next lemma will allow us to use the main results of Section 2 for analyzing the nonasymptotic convergence of the VM-PADMM.
Lemma 3.5.
The sequence defined in (54), the scalar and the sequence given in Assumption 3.1 satisfy condition (12) of Assumption 2.1.
Proof.
Note that the first condition in (29) is identical to the first one in (12). To finish the proof, note that the second condition in (29), which by Assumption 3.1 is assumed to hold for , and , combined with the (block) diagonal structure of gives the second condition in (12) for and . ∎
The following two technical results will be used to prove that the VM-PADMM is an instance of the VM-HPE framework.
Lemma 3.6.
Let , and be generated by the VM-PADMM and let be defined as in (32). Let also be defined as in (31). Then, the following hold:
(a) for any , we have
[TABLE]
(b) we have
[TABLE]
(c) for any and , we have
[TABLE]
Proof.
(a) This item follows trivially from (28) and (32).
(b) First note that
[TABLE]
which combined with the property (7) yields, for all ,
[TABLE]
Direct use of the above inequality and (54) yields
[TABLE]
where and . On the other hand, from Proposition 3.4 and (54) with , we have , where is given in (48). Using this fact, (50) and the monotonicity of , we obtain for all . Hence, from the latter inequality, Lemma A.1 with and , we have, for all ,
[TABLE]
Note now that letting , it follows from (54), item (a) and some direct calculations that
[TABLE]
Moreover, using (54) with and item (a), we find
[TABLE]
Combining the previous two estimates, we obtain
[TABLE]
If , then the last inequality implies that
[TABLE]
Now, if , we have
[TABLE]
where the second inequality is due to property (7), and the last inequality is due to (54) and the definitions of and . Hence, combining the last estimate with (59), we obtain
[TABLE]
Thus, it follows from (59), (62) and the last inequality that
[TABLE]
Since, (see Assumption 3.1 and Lemma 3.5), the desired inequality follows from (58) and (63), and definition of in (31).
(c) Using the first order optimality condition for (27), (32) and item (a), we find, for every ,
[TABLE]
For any , using the above inclusion with and , the monotonicity of and the property (6), we find
[TABLE]
where the last inequality is due to Proposition 1.2 and Assumption 3.1, and so the proof of the lemma follows. ∎
Lemma 3.7.
For every , there exists a parameter such that, for all , the matrix
[TABLE]
is symmetric positive definite, and
[TABLE]
Proof.
Since is symmetric, the proof is immediate by noting that for and for every , is positive definite and (64) trivially holds. ∎
Next we show that the VM-PADMM can be regarded as an instance of the VM-HPE framework.
Proposition 3.8.
Let be generated by the VM-PADMM and let and be defined as in (32) and (54), respectively. Let also , , and be as in (31), (48), Lemma 3.7, and Theorem 3.2, respectively. Define , and, for all ,
[TABLE]
Then, for all ,
[TABLE]
As a consequence, the VM-PADMM falls within the VM-HPE framework (with input , and ) for solving (9) with as in (48).
Proof.
First note that the inclusion in (67) follows from (48), (55) and the definitions of , and in (65). Now, using (49), (54), (65) and some direct calculations, we obtain
[TABLE]
Using the same reasoning and Lemma 3.6(a), we also find
[TABLE]
Hence, from Lemma 3.6(a) and some algebraic manipulations, we obtain
[TABLE]
which in turn, combined with (3.2) and (69), yields
[TABLE]
We will now consider two cases: and . In the first case, it follows from (3.2) with , Lemma 3.6(b), the first inequality in (64) with , and definitions of and that
[TABLE]
where the last inequality is due to . Hence, since , inequality (67) for now follows from the second inequality in (64) with . On the other hand, assuming , from inequality (3.2), Lemma 3.6(c) with , the first inequality in (64) with , and definition of in (66), we have
[TABLE]
Since (see Assumption 3.1), we obtain from (64) with that the term inside the brackets is nonnegative. Hence, inequality (67) for now follows from the first statement of Lemma 3.7.
The last statement of the proposition follows directly from (67) and the definition of the VM-HPE framework. ∎
We are now ready to prove Theorems 3.2 and 3.3.
Proof of Theorem 3.2: Using Proposition 3.8 and Theorem 2.2, we conclude that, for every , (33) holds and there exists such that
[TABLE]
where and are defined in (54) and (65), respectively. Hence, using Proposition 1.1, we obtain
[TABLE]
On the other hand, using Proposition 1.1 and the definition in (33), we find
[TABLE]
which, combined with (71) and (3.2), proves (34). ∎
Proof of Theorem 3.3: Combining Proposition 3.8 and Theorem 2.3, and taking into account that , we conclude that, for every ,
[TABLE]
[TABLE]
On the other hand, (33), (37) and (38) yield
[TABLE]
Additionally, (37), (38) and some algebraic manipulations give
[TABLE]
Hence, combining the identity in (74) with the last two displayed equations, we also find
[TABLE]
where the last equality is due to the definitions of and in (39). Therefore, the inequalities in (41) and (42) now follow from (73) and (74), respectively.
To finish the proof of the theorem, note that direct use of Theorem 1.3(b) (for and ), (33) and (37)–(39) gives and (40). ∎
Appendix A Proof of Theorems 2.2 and 2.3
We start by presenting the following two lemmas.
Lemma A.1.
For any and , we have
[TABLE]
Proof.
Direct calculations yield
[TABLE]
∎
Lemma A.2.
Let , , and be generated by the VM-HPE framework. For every and
- (a) we have
[TABLE]
- (b) we have
[TABLE]
where and are as in (13) and Assumption 2.1, respectively.
Proof.
(a) From Lemma A.1 with and , (10) and (11), we obtain
[TABLE]
Hence, (a) follows from the above inequality, the fact that and (see (10)), and the monotonicity of .
(b) Using (a), (8) and Assumption 2.1, we find
[TABLE]
Thus, the result follows by applying the above inequality recursively and by using (13). ∎
We are now ready to prove Theorem 2.2.
Proof of Theorem 2.2: First, note that the desired inclusion holds due to (10). Now, using (7) and (11), we obtain, respectively,
[TABLE]
Combining the above inequalities, we find
[TABLE]
which in turn, combined with Lemma A.2(b), yields
[TABLE]
for all . Hence, (15) follows from Proposition 1.1, (10), (14), (75) and the fact that . ∎
Before proceeding to the proof of the ergodic convergence of the VM-HPE framework, let us first present an auxiliary result.
Proposition A.3.
Let , and be generated by the VM-HPE framework and consider and as in (18). Then, for every ,
[TABLE]
where and are given in Assumption 2.1.
Proof.
Using Lemma A.1 with and , (10) and (11), we find, for every ,
[TABLE]
where the second inequality is due to the fact that . Hence, using Assumption 2.1 and simple calculations, we obtain
[TABLE]
Summing up the last inequality from to and using the definition of in (18), we have
[TABLE]
which clearly gives (76). ∎
Proof of Theorem 2.3: Note first that the desired inclusion and the first inequality in (20) follow from (10), (18) and Theorem 1.3(a). Take . Now, let us prove the second inequality in (20), which will follow by bounding the term in the right-hand side of (76). Note that, using the convexity of , inequality (7) and (18), we find
[TABLE]
From (13), we have for all . Hence, using Proposition 1.2, inequality (11), Lemma A.2(b) and (14), we find
[TABLE]
On the other hand, using (7), for all , Proposition 1.2, Lemma A.2(b) and (14), we obtain
[TABLE]
It follows from inequalities (77)–(A) and the fact that that
[TABLE]
which, combined with Proposition A.3 and the first condition in (12), yields
[TABLE]
Therefore, the second inequality in (20) now follows from the definition of and simple calculations.
To finish the proof of the theorem, it remains to prove (19). Assume first that . Using (18) and simple calculus, we have
[TABLE]
From (13), we obtain and . Hence, it follows from Propositions 1.1 and 1.2 that
[TABLE]
Direct use of Proposition 1.1 yields
[TABLE]
The next step is to estimate the general term in the summation in (80). To do this, first note that using Assumption 2.1, we find
[TABLE]
and so
[TABLE]
From (13) and the last inequality in (83), we obtain, respectively, and . Hence, using Propositions 1.1 and 1.2, we have
[TABLE]
Again, from (13), we obtain and , and consequently
[TABLE]
Hence, using (13) and (A)–(A), we find
[TABLE]
Finally, using the definition of in (14), (80)–(82), (A) and Lemma A.2(b), we conclude that
[TABLE]
which gives (19) for the case . Note now that by (13), we have and so using Propositions 1.1 and 1.2, Lemma A.2(b), (14) and the second identity in (18) with , we find
[TABLE]
which in turn, combined with the fact that , gives (19) for . ∎
- [1] H. Attouch and M. Soueycatt. Augmented Lagrangian and proximal alternating direction methods of multipliers in Hilbert spaces. Applications to games, PDE’s and control. Pac. J. Optim., 5(1):17–37, 2008.
- [2] S. Banert, R. I. Bot, and E. R. Csetnek. Fixing and extending some recent results on the ADMM algorithm. Available on http://www.arxiv.org.
- [3] B. He, H. Liu, Z. Wang, and X. Yuan. A strictly contractive Peaceman–Rachford splitting method for convex programming. SIAM J. Optim., 24(3):1011–1040, 2014.
- [4] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn., 3(1):1–122, 2011.
- [5] R. S. Burachik, A. N. Iusem, and B. F. Svaiter. Enlargement of monotone operators with applications to variational inequalities. Set-Valued Anal., 5(2):159–180, 1997.
- [6] R. S. Burachik, C. A. Sagastizábal, and B. F. Svaiter. $\epsilon$-enlargements of maximal monotone operators: theory and applications. In Reformulation: nonsmooth, piecewise smooth, semismooth and smoothing methods (Lausanne, 1997), volume 22 of Appl. Optim., pages 25–43. Kluwer Acad. Publ., Dordrecht, 1999.
- [7] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis., 40(1):120–145, 2011.
- [8] Y. Cui, X. Li, D. Sun, and K. C. Toh. On the convergence properties of a majorized ADMM for linearly constrained convex optimization problems with coupled objective functions. J. Optim. Theory Appl., 169(3):1013–1041, 2016.
