Bregman Forward-Backward Operator Splitting
Contact author: P. L. Combettes, [email protected],
phone: +1 (919) 515 2671.
This work was supported by the National
Science Foundation under grant DMS-1818946.
Minh N. Bùi and Patrick L. Combettes
North Carolina State University
Department of Mathematics
Raleigh
NC 27695-8205
USA
[email protected] and [email protected]
Dedicated to Terry Rockafellar on the occasion of
his 85th birthday
Abstract.
We establish the convergence of the forward-backward splitting
algorithm based on Bregman distances for the sum of two monotone
operators in reflexive Banach spaces. Even in Euclidean spaces,
the convergence of this algorithm has so far been proved only in the
case of minimization problems. The proposed framework features
Bregman distances that vary over the iterations and
a novel assumption on the single-valued
operator that captures various properties scattered in the
literature. In the minimization setting, we obtain
rates that are sharper than existing ones.
Keywords.
Banach space,
Bregman distance,
forward-backward splitting,
Legendre function,
monotone operator.
1 Introduction
Throughout, X is a reflexive real Banach space with topological
dual X∗. We are concerned with the following monotone
inclusion problem (see Section 2.1 for notation and
definitions).
Problem 1.1
Let A:X→2X∗ and B:X→2X∗
be maximally monotone, let f∈Γ0(X) be
essentially smooth, and let Df be the Bregman
distance associated with f. Set C=(intdomf)∩domA
and S=(intdomf)∩zer(A+B). Suppose that
C⊂intdomB, S≠∅,
B is single-valued on intdomB, and there exist
δ1∈[0,1[, δ2∈[0,1], and
κ∈[0,+∞[ such that
[TABLE]
The objective is to
\[\text{find}\;\;x\in\operatorname{int}\operatorname{dom}f\;\;\text{such that}\;\;0\in Ax+Bx. \tag{1.2}\]
The central problem (1.2) has extensive connections with
various areas of mathematics and its applications. In Hilbert
spaces, if B is cocoercive, a standard method
for solving (1.2) is the forward-backward algorithm,
which operates with the update
xn+1=(Id+γA)−1(xn−γBxn)
[17]. This iteration is not applicable beyond
Hilbert spaces since A maps to X∗≠X.
In addition, there has been a significant body of
work (see, e.g., [3, 6, 8, 12, 13, 16, 18, 19, 23])
showing the benefits of replacing standard distances by
Bregman distances, even in Euclidean spaces.
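As a minimal numerical sketch of the Hilbertian forward-backward update recalled above, one may take X=R^d, let A be the normal cone operator of the box [0,1]^d (so that (Id+γA)−1 is the projection onto the box), and let Bx=(qi(xi−bi))i with qi>0, which is cocoercive with constant β=1/max qi. The data, the function name, and the parameter values below are illustrative assumptions and are not taken from the paper.

```python
def forward_backward_box(q, b, gamma, x0, n_iter=200):
    """Forward-backward iteration x_{n+1} = (Id + gamma*A)^{-1}(x_n - gamma*B x_n).

    Illustrative assumptions: A is the normal cone operator of [0, 1]^d, whose
    resolvent is the componentwise projection onto [0, 1], and
    B x = (q_i * (x_i - b_i))_i with q_i > 0, which is (1/max q)-cocoercive.
    """
    x = list(x0)
    for _ in range(n_iter):
        # Forward (explicit) step on B.
        forward = [xi - gamma * qi * (xi - bi) for xi, qi, bi in zip(x, q, b)]
        # Backward (implicit) step on A: projection onto the box.
        x = [min(1.0, max(0.0, v)) for v in forward]
    return x


q = [1.0, 4.0]
b = [0.3, 1.7]
beta = 1.0 / max(q)   # cocoercivity constant of B
gamma = 0.4           # a step size in ]0, 2*beta[ = ]0, 0.5[
x = forward_backward_box(q, b, gamma, x0=[0.0, 0.0])
print(x)  # approaches the projection of b onto the box
```

In this separable example the zero of A+B is the componentwise projection of b onto [0,1], which gives a simple way to check the iterates.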
Given a sequence (γn)n∈N in ]0,+∞[ and
a suitable sequence of differentiable convex functions
(fn)n∈N, we propose to solve (1.2) via
the iterative scheme
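\[(\forall n\in\mathbb{N})\quad x_{n+1}=(\nabla f_n+\gamma_nA)^{-1}\big(\nabla f_n(x_n)-\gamma_nBx_n\big), \tag{1.3}\]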
which consists of first applying a forward (explicit) step involving
B and then a backward (implicit) step involving A.
Let us note that the convergence of such an iterative process has
not yet been established, even in finite-dimensional spaces with a
single function fn=f and constant parameters
γn=γ. Furthermore, the
novel scheme (1.3) will be shown to unify and extend
several iterative methods which have thus far not been brought
together:
The Bregman monotone proximal point algorithm
\[(\forall n\in\mathbb{N})\quad x_{n+1}=(\nabla f+\gamma_nA)^{-1}\big(\nabla f(x_n)\big) \tag{1.4}\]
of [6] for finding a zero of A in intdomf, where
f is a Legendre function.
The variable metric forward-backward splitting method
[TABLE]
of [15] for finding a zero of A+B in a Hilbert space,
where (Un)n∈N is a sequence of strongly positive
self-adjoint bounded linear operators.
The splitting method
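\[(\forall n\in\mathbb{N})\quad x_{n+1}=(\nabla f+\gamma_n\partial\varphi)^{-1}\big(\nabla f(x_n)-\gamma_n\nabla\psi(x_n)\big) \tag{1.6}\]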
of [18] for finding a minimizer of the sum of the convex
functions φ and ψ in intdomf.
The Renaud–Cohen algorithm
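\[(\forall n\in\mathbb{N})\quad x_{n+1}=(\nabla f+\gamma A)^{-1}\big(\nabla f(x_n)-\gamma Bx_n\big) \tag{1.7}\]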
of [20] for finding a zero of A+B in a Hilbert space,
where f is real-valued and strongly convex.
Problems which cannot be solved by algorithms
(1.4)–(1.7) will be presented in
Example 2.9 as well as in Sections 3.2 and
3.4.
New results on the minimization setting will be presented in
Section 3.3.
The goal of the present paper is to investigate the asymptotic
behavior of (1.3) under mild conditions on A, B, and
(fn)n∈N. Let us note that the convergence
proof techniques used in the above four frameworks do not extend
to (1.3). For instance, the tools of
[18] rely heavily on functional inequalities involving
φ and ψ. On the other hand, the approach of
[15] exploits specific properties of quadratic kernels
in Hilbert spaces, while [6] relies on Bregman
monotonicity properties of the iterates that will no longer hold
in the presence of B. Finally, the proofs of [20]
depend on the strong
convexity of f, the underlying Hilbertian structure, and the fact
that the updating equation is governed by a fixed operator.
Our analysis will not only capture these frameworks but also provide
new methods to solve problems beyond their reach. It hinges on the
theory of Legendre functions and the following new condition, which
will be seen to cover in particular various
properties such as the cocoercivity assumption used in the standard
forward-backward method in Hilbert spaces [7, 17], as
well as the seemingly unrelated assumptions used in
[6, 15, 18, 20] to study
(1.4)–(1.7).
The main result on the convergence of (1.3) is established
in Section 2 for the general scenario described in
Problem 1.1. Section 3 is dedicated to
special cases and applications. In the context of minimization
problems, worst-case convergence rates for the method
are obtained.
2 Main results
2.1 Notation and definitions
The norm of X is denoted by ∥⋅∥ and the canonical
pairing between X and X∗ by
⟨⋅,⋅⟩. If X
is Hilbertian, its scalar product is denoted by
⟨⋅∣⋅⟩.
The symbols ⇀ and → denote
respectively weak and strong convergence. The set of weak
sequential cluster points of a sequence
(xn)n∈N in X is denoted by W(xn)n∈N.
Let M:X→2X∗ be a set-valued operator. Then
graM={(x,x∗)∈X×X∗ | x∗∈Mx} is the graph
of M, domM={x∈X | Mx≠∅} the domain of M,
ranM={x∗∈X∗ | (∃x∈X) x∗∈Mx}
the range of M, and zerM={x∈X | 0∈Mx}
the set of zeros of M. Moreover, M is monotone if
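\[\big(\forall(x_1,x_1^*)\in\operatorname{gra}M\big)\big(\forall(x_2,x_2^*)\in\operatorname{gra}M\big)\quad\langle x_1-x_2,x_1^*-x_2^*\rangle\geqslant 0 \tag{2.1}\]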
and maximally monotone if, furthermore, there exists no monotone
operator from X to 2X∗ the graph of which properly
contains graM.
A function f:X→]−∞,+∞] is coercive if
lim∥x∥→+∞f(x)=+∞ and supercoercive if
lim∥x∥→+∞f(x)/∥x∥=+∞.
Γ0(X) is the class of lower semicontinuous convex
functions f:X→]−∞,+∞] such that
domf={x∈X | f(x)<+∞}≠∅.
Now let f∈Γ0(X). The conjugate of f
is the function f∗∈Γ0(X∗) defined by
f∗:X∗→]−∞,+∞]:x∗↦supx∈X(⟨x,x∗⟩−f(x)), and the subdifferential of f
is the maximally monotone operator
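\[\partial f\colon\mathcal{X}\to 2^{\mathcal{X}^*}\colon x\mapsto\big\{x^*\in\mathcal{X}^*\;\big|\;(\forall y\in\mathcal{X})\;\langle y-x,x^*\rangle+f(x)\leqslant f(y)\big\}. \tag{2.2}\]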
In addition, f is a Legendre function if it is
essentially smooth in the sense that ∂f is
both locally bounded and single-valued on its
domain, and essentially strictly convex in the sense
that ∂f∗ is locally bounded on its domain and
f is strictly convex on every convex subset of dom∂f
[5].
Suppose that f is Gâteaux differentiable on
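intdomf≠∅. The Bregman distance associated with f is
\[D_f\colon\mathcal{X}\times\mathcal{X}\to[0,+\infty]\colon(x,y)\mapsto\begin{cases}f(x)-f(y)-\langle x-y,\nabla f(y)\rangle,&\text{if}\;\;y\in\operatorname{int}\operatorname{dom}f;\\+\infty,&\text{otherwise.}\end{cases} \tag{2.3}\]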
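To make the definition of the Bregman distance concrete, the following sketch evaluates Df(x,y)=f(x)−f(y)−⟨x−y,∇f(y)⟩ in R^d for two standard kernels: the energy f=∥⋅∥2/2, for which Df(x,y)=∥x−y∥2/2, and the Boltzmann–Shannon entropy f(x)=∑(xi ln xi−xi), for which Df is the Kullback–Leibler divergence. The function names are hypothetical and the entropy is only evaluated at points with strictly positive coordinates.

```python
import math


def bregman(f, grad_f, x, y):
    """D_f(x, y) = f(x) - f(y) - <x - y, grad f(y)>, for y in int dom f."""
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, grad_f(y)))
    return f(x) - f(y) - inner


def energy(x):
    return 0.5 * sum(t * t for t in x)


def grad_energy(x):
    return list(x)


def entropy(x):
    # Boltzmann-Shannon entropy; requires strictly positive coordinates here.
    return sum(t * math.log(t) - t for t in x)


def grad_entropy(x):
    return [math.log(t) for t in x]


x, y = [1.0, 2.0], [1.0, 1.0]
print(bregman(energy, grad_energy, x, y))    # half the squared distance
print(bregman(entropy, grad_entropy, x, y))  # Kullback-Leibler divergence
print(bregman(entropy, grad_entropy, y, x))  # differs: D_f is not symmetric
```

The last two lines illustrate that a Bregman distance is in general not symmetric, which is why ratios of the form Df(x,y)/Df(y,x) appear in some of the boundedness conditions.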
Given α∈]0,+∞[, we define
[TABLE]
2.2 On condition (1.1)
The following proposition provides several key illustrations of
the pertinence of (1.1) in terms of capturing concrete
scenarios.
Proposition 2.1
Consider the setting of Problem 1.1. Then (1.1)
holds in each of the following cases:
(i)
δ1∈[0,1[, δ2=1, and
(∀x∈C)(∀y∈C)(∀z∈S)
⟨z−x,By−Bz⟩⩽κDf(x,y).
(ii)
δ1=0, δ2=1, and B=∂ψ,
where ψ∈Γ0(X) satisfies
[TABLE]
(iii)
δ1=0, δ2=1,
and there exists ψ∈Γ0(X) such that
B=∂ψ and (∀x∈C)(∀y∈C)
Dψ(x,y)⩽κDf(x,y).
(iv)
domB=X, there exists β∈]0,+∞[ such that
[TABLE]
f is Fréchet differentiable on X,
∇f is α-strongly monotone on domA
for some α∈]0,+∞[,
ε∈]0,2β[,
κ=1/(α(2β−ε)), and
δ1=δ2=(2β−ε)/(2β).
(v)
A+B is strongly monotone with constant
μ∈]0,+∞[, B is Lipschitzian on domB=X
with constant ν∈]0,+∞[,
f is Fréchet differentiable on X,
∇f is α-strongly monotone on domA
for some α∈]0,+∞[,
ε∈]0,2μ/ν2[,
κ=ν2/(α(2μ−εν2)),
and δ1=δ2=(2μ−εν2)/(2μ).
(vi)
domB=X, β∈]0,+∞[,
f is Fréchet differentiable on X,
∇f is α-strongly monotone on domA
for some α∈]0,+∞[,
ε∈]0,2β[,
κ=1/(α(2β−ε)), δ1=0,
δ2=(2β−ε)/(2β),
and one of the following is satisfied:
[a]
B is β-cocoercive, i.e.,
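\[(\forall x\in\mathcal{X})(\forall y\in\mathcal{X})\quad\langle x-y,Bx-By\rangle\geqslant\beta\|Bx-By\|^2. \tag{2.7}\]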
[b]
B is ν-Lipschitzian for some ν∈]0,+∞[,
and angle bounded with constant 1/(4βν), i.e.,
[TABLE]
[c]
B is (1/β)-Lipschitzian and
there exists ψ∈Γ0(X) such that B=∇ψ.
Proof. (i): Let x∈C, y∈C, and z∈S. Then
⟨y−x,By−Bz⟩=⟨z−x,By−Bz⟩+⟨y−z,By−Bz⟩⩽κDf(x,y)+⟨y−z,δ2(By−Bz)⟩.
In view of the monotonicity of A, we obtain (1.1).
(ii)⇒(i):
In the light of [9, Proposition 4.1.5 and
Corollary 4.2.5], ψ is Gâteaux differentiable
on intdomψ and B=∇ψ on
intdomψ=intdomB⊃C.
Hence, we derive from (2.5), (2.3),
and [6, Proposition 2.3(ii)] that
[TABLE]
(iii)⇒(ii): Clear.
(iv):
It results from [9, Theorem 4.2.10] that
∇f is continuous. Thus, using the strong monotonicity
of ∇f on domA, we obtain
[TABLE]
Given x and y in domA, define
ϕ:R→R:t↦f(y+t(x−y)), and observe that,
since domA is convex [24, Theorem 3.11.12],
[x,y]⊂domA and therefore (2.10) yields
[TABLE]
In turn, using (2.6) and (2.11), we deduce that
[TABLE]
(v)⇒(iv):
Set β=μ/ν2. Then
[TABLE]
(vi): We consider each case separately.
(vi)[a]:
By arguing as in (2.11), we obtain
(∀x∈domA)(∀y∈domA)
Df(x,y)⩾(α/2)∥x−y∥2.
It thus follows from (2.12) and (2.7) that
[TABLE]
(vi)[b]⇒(vi)[a]:
We derive from [1, Proposition 4] that
B is cocoercive with constant β.
(vi)[c]⇒(vi)[a]:
This follows from [1, Corollaire 10].
Remark 2.2
Condition (iv) in Proposition 2.1 first appeared in
[20] and does not seem to have received much attention in the
literature. The cocoercivity condition (vi)[a]
was first used in [17] to prove
the weak convergence of the classical
forward-backward method in Hilbert spaces.
Finally, in reflexive Banach space minimization
problems, (iii) appears in [18]; see also
[3] for the Euclidean case.
Remark 2.3
Condition (iii) is satisfied in particular
when X is a Hilbert space,
f=∥⋅∥2/2, domψ=X, and ∇ψ is Lipschitzian
[7, Theorem 18.15], in which case it is known as the
“descent lemma.”
Condition (ii) can be viewed as an extension
of this standard descent lemma involving triples (x,y,z) and an
arbitrary Bregman distance Df in reflexive Banach spaces.
Let us underline that (ii) is more general than (iii).
Indeed, consider the setting of Problem 1.1 with the
following additional assumptions: X is a Hilbert space,
0∈intdomf,
A is the normal cone operator of some self-dual cone K,
and there exists a Gâteaux differentiable convex function
ψ:X→R such that
[TABLE]
Then C=(intdomf)∩domA⊂K and S={0}.
Further, for every x∈C and every y∈C, (2.16)
yields Dψ(x,y)−Dψ(x,0)−Dψ(0,y)=⟨−x∣∇ψ(y)−∇ψ(0)⟩=⟨−x∣∇ψ(y)⟩⩽0⩽Df(x,y).
Therefore, (2.5) is satisfied. On the other hand,
(iii) does not hold in general. For instance, take
X=R, K=[0,+∞[, f=∣⋅∣2/2, and ψ=∣⋅∣3/2.
2.3 Forward-backward splitting for monotone inclusions
The formal setting of the proposed Bregman forward-backward
splitting method is as follows.
Algorithm 2.4
Consider the setting of Problem 1.1.
Let α∈]0,+∞[, let (γn)n∈N be in ]0,+∞[,
and let (fn)n∈N be in Cα(f).
Suppose that the following hold:
[a]
infn∈Nγn>0,
supn∈N(κγn)⩽α, and
supn∈N(δ1γn+1/γn)<1.
[b]
There exists a summable sequence (ηn)n∈N in [0,+∞[
such that (∀n∈N) Dfn+1⩽(1+ηn)Dfn.
[c]
For every n∈N, ∇fn is strictly monotone on C and
(∇fn−γnB)(C)⊂ran(∇fn+γnA).
Take x0∈C and set (∀n∈N)
xn+1=(∇fn+γnA)−1(∇fn(xn)−γnBxn).
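The update of Algorithm 2.4 can be sketched numerically in a case where it has a closed form. The choices below are hypothetical and only meant as an illustration: X=R^d, fn=f is the Boltzmann–Shannon entropy (so ∇f=ln componentwise and intdomf=]0,+∞[^d), A is the constant operator x↦λ(1,…,1), and B=∇ψ with ψ(x)=∑(xi ln(xi/bi)−xi+bi) for some b with positive entries. A direct computation gives Dψ=Df, so the comparison Dψ⩽κDf of Proposition 2.1(iii) holds with κ=1, and the zero of A+B is (bi e^{−λ})i.

```python
import math


def bregman_forward_backward(b, lam, gamma, x0, n_iter=100):
    """Sketch of x_{n+1} = (grad f + gamma*A)^{-1}(grad f(x_n) - gamma*B x_n).

    Illustrative assumptions: f is the Boltzmann-Shannon entropy (grad f = log),
    A is the constant operator x -> lam * (1, ..., 1), and
    B x = (log(x_i / b_i))_i. All coordinates must stay strictly positive.
    """
    x = list(x0)
    for _ in range(n_iter):
        # Forward step, carried out in the dual space.
        dual = [math.log(xi) - gamma * math.log(xi / bi) for xi, bi in zip(x, b)]
        # Backward step: solve log(u_i) + gamma*lam = dual_i for u_i > 0.
        x = [math.exp(d - gamma * lam) for d in dual]
    return x


b = [1.0, 2.0, 0.5]
lam = 0.3
gamma = 0.5  # satisfies kappa * gamma <= alpha with kappa = alpha = 1
x = bregman_forward_backward(b, lam, gamma, x0=[1.0, 1.0, 1.0])
print(x)
print([bi * math.exp(-lam) for bi in b])  # zero of A + B, for comparison
```

Note that the iterates remain in intdomf without any projection: the backward step returns strictly positive coordinates by construction, which reflects the fact that the viability domain of the algorithm is C.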
Let us establish basic asymptotic properties of
Algorithm 2.4, starting with the fact that its
viability domain is C.
Proposition 2.5
Let (xn)n∈N be a sequence generated by Algorithm 2.4
and let z∈S. Then (xn)n∈N is a
well-defined sequence in C and the following hold:
(i)
(Dfn(z,xn))n∈N converges.
(ii)
∑n∈N(1−κγn/α)Dfn(xn+1,xn)<+∞
and ∑n∈N(1−κγn/α)Df(xn+1,xn)<+∞.
(iii)
∑n∈N⟨xn+1−z,γn−1(∇fn(xn)−∇fn(xn+1))−Bxn+Bz⟩<+∞.
(iv)
∑n∈N(1−δ2)⟨xn−z,Bxn−Bz⟩<+∞.
(v)
Suppose that one of the following is satisfied:
[a]
C is bounded.
[b]
f is supercoercive.
[c]
f is uniformly convex.
[d]
f is essentially strictly convex with domf∗ open and
∇f∗ weakly sequentially continuous.
[e]
X is finite-dimensional and domf∗ is open.
[f]
f is essentially strictly convex and ρ=inf{Df(x,y)/Df(y,x) | x∈intdomf, y∈intdomf, x≠y}∈]0,+∞[.
Then (xn)n∈N is bounded.
Proof. Take n∈N, and suppose that
(y∗,y1) and (y∗,y2) belong to
gra(∇fn+γnA)−1.
Then y∗∈(∇fn+γnA)y1 and
y∗∈(∇fn+γnA)y2. However,
by virtue of condition [c] in Algorithm 2.4,
∇fn+γnA is strictly monotone. Therefore,
since ⟨y1−y2,y∗−y∗⟩=0, we infer that y1=y2.
Hence
[TABLE]
Moreover, it follows from [9, Proposition 4.2.2] and
(2.4) that
[TABLE]
Next, we observe that, since x0∈C⊂intdomB,
∇f0(x0)−γ0Bx0 is a singleton.
Furthermore, in view of condition [c] in Algorithm 2.4,
∇f0(x0)−γ0Bx0∈ran(∇f0+γ0A). We thus deduce from (2.17) that
x1=(∇f0+γ0A)−1(∇f0(x0)−γ0Bx0)
is uniquely defined. In addition, (2.18) yields
x1∈ran(∇f0+γ0A)−1=C. The conclusion
that (xn)n∈N is a well-defined sequence in C
follows by invoking these facts inductively.
(i)–(iv):
Condition [a] in Algorithm 2.4 entails that there
exists ε∈]0,1[ such that
[TABLE]
Now take x0∗∈Ax0 and set
[TABLE]
In view of (2.20),
[TABLE]
In turn, since (z,−Bz)∈graA and A is monotone,
[TABLE]
Hence, invoking condition [a] in Algorithm 2.4
and the monotonicity of B, we obtain θn⩾0.
Next, since z∈intdomf=intdomfn by (2.4),
we derive from (2.20) and
[6, Proposition 2.3(ii)] that
[TABLE]
Thus, since (z,−Bz)∈graA and fn∈Cα(f),
we infer from (2.19), (2.22), (2.21),
and (1.1) that
[TABLE]
Consequently, by condition [b] in Algorithm 2.4
and (2.22),
[TABLE]
Hence, [7, Lemma 5.31] asserts that
[TABLE]
In turn, we infer from (2.20) and condition [a]
in Algorithm 2.4 that
[TABLE]
Thus, since (fn)n∈N lies in Cα(f), we obtain
∑n∈N(1−κγn/α)Df(xn+1,xn)<+∞.
It results from (2.26) and
(2.20) that (Dfn(z,xn))n∈N
converges.
(v):
Recall that (xn)n∈N lies in C.
(v)[a]: Clear.
(v)[b]:
We derive from (i) that (Df(z,xn))n∈N is bounded.
In turn, [5, Lemma 7.3(viii)] asserts that
(xn)n∈N is bounded.
(v)[c]:
It results from [24, Theorem 3.5.10] that
there exists a function
ϕ:[0,+∞[→[0,+∞] that vanishes only at 0 such that
ϕ(t)/t→+∞ as t→+∞ and
[TABLE]
Hence, in the light of (i),
supn∈Nϕ(∥xn−z∥)⩽supn∈NDf(z,xn)⩽(1/α)supn∈NDfn(z,xn)<+∞
and (xn)n∈N is therefore bounded.
(v)[d]:
Suppose that there exists a subsequence
(xkn)n∈N of (xn)n∈N such that
∥xkn∥→+∞.
We deduce from [5, Lemma 7.3(vii)] and (i) that
[TABLE]
However, f∗ is a Legendre function by virtue of
[5, Corollary 5.5]
and ∇f(z)∈intdomf∗
by virtue of [5, Theorem 5.10].
Thus, [5, Lemma 7.3(v)]
guarantees that Df∗(⋅,∇f(z)) is coercive.
It therefore follows from (2.29) that
(∇f(xkn))n∈N is bounded,
and then from the reflexivity of X∗ that
W(∇f(xkn))n∈N≠∅.
In turn, there exist a subsequence (xlkn)n∈N of
(xkn)n∈N and x∗∈X∗ such that
∇f(xlkn)⇀x∗.
The weak lower semicontinuity of f∗
and (2.29) yield
Df∗(x∗,∇f(z))⩽lim inf Df∗(∇f(xlkn),∇f(z))<+∞. Therefore
[TABLE]
Moreover, [5, Theorem 5.10] asserts that
∇f∗(x∗)∈intdomf
and (∀n∈N) ∇f∗(∇f(xn))=xn.
Hence, (2.30)
and the weak sequential continuity of ∇f∗ imply that
xlkn=∇f∗(∇f(xlkn))⇀∇f∗(x∗). This yields
supn∈N∥xlkn∥<+∞ and we reach a contradiction.
(v)[e]:
A consequence of [5, Lemma 7.3(ix)] and (i).
(v)[f]:
It results from [5, Lemma 7.3(v)] that Df(⋅,z) is
coercive. In turn, since supn∈NDf(xn,z)⩽(1/ρ)supn∈NDf(z,xn)<+∞ by (i),
(xn)n∈N is bounded.
As seen in Proposition 2.5, by construction, an orbit of
Algorithm 2.4 lies in C and therefore in intdomf.
Next, we proceed to identify sufficient conditions that
guarantee that their weak sequential cluster points are also in
intdomf.
Proposition 2.6
Let (xn)n∈N be a sequence generated by Algorithm 2.4
and suppose that one of the following holds:
[a]
domf∩domA⊂intdomf.
[b]
f is essentially strictly convex with domf∗ open and
∇f∗ weakly sequentially continuous.
[c]
f is strictly convex on intdomf and ρ=inf{Df(x,y)/Df(y,x) | x∈intdomf, y∈intdomf, x≠y}∈]0,+∞[.
[d]
X is finite-dimensional.
Then W(xn)n∈N⊂intdomf.
Proof. Suppose that x∈W(xn)n∈N, say xkn⇀x,
and fix z∈S.
[a]:
Since domf is closed and convex, it is weakly
closed [10, Corollary II.6.3.3(i)]. Hence,
since Proposition 2.5 asserts that
(xn)n∈N lies in C⊂domf, we infer that
W(xn)n∈N⊂domf. Likewise,
since domA is a closed convex set
[24, Theorem 3.11.12] and
(xn)n∈N lies in C⊂domA, we obtain
W(xn)n∈N⊂domA. Altogether,
W(xn)n∈N⊂domf∩domA⊂intdomf.
[b]:
Using an argument similar to that of the proof of
Proposition 2.5(v)[d],
we infer that there exist
a strictly increasing sequence (lkn)n∈N
in N and x∗∈intdomf∗ such that
xlkn⇀∇f∗(x∗).
Thus, appealing to [5, Theorem 5.10], we conclude that
x=∇f∗(x∗)∈intdomf.
[c]:
Proposition 2.5(i) and the weak lower semicontinuity
of Df(⋅,z) yield
[TABLE]
Thus x∈domf. We show that domf is open.
Suppose that there exists y∈domf∖intdomf,
let (αn)n∈N be a sequence in ]0,1[ such that
αn→1, and set
(∀n∈N) yn=αny+(1−αn)z. Then
{yn}n∈N⊂]y,z[⊂(intdomf)∖{z}
[10, Proposition II.2.6.16].
Moreover, yn→y and, by convexity of f,
(∀n∈N)
Df(yn,z)⩽αn(f(y)−f(z)−⟨y−z,∇f(z)⟩).
Hence
[TABLE]
However, it results from the lower semicontinuity of f that
limDf(yn,z)=lim(f(yn)−f(z))−lim⟨yn−z,∇f(z)⟩⩾f(y)−f(z)−⟨y−z,∇f(z)⟩=Df(y,z).
Hence, (2.32) forces
[TABLE]
In addition, by convexity of f,
(∀n∈N)Df(z,yn)⩾αn(f(z)−f(y)−⟨z−y,∇f(yn)⟩).
However,
[5, Theorem 5.6]
and the essential smoothness of f entail that
[TABLE]
Thus,
[TABLE]
It results from (2.33) and (2.35) that
0<ρ⩽limDf(yn,z)/Df(z,yn)=0, so that we reach a
contradiction. Consequently, domf is open and hence
x∈domf=intdomf.
[d]:
Proposition 2.5(i) ensures that
(xkn)n∈N is a sequence in intdomf
such that (Df(z,xkn))n∈N is bounded.
Therefore, [4, Theorem 3.8(ii)] and the essential
smoothness of f yield x∈intdomf.
Definition 2.7
Algorithm 2.4 is focusing if, for every z∈S,
[TABLE]
Our main result establishes the weak convergence of the
orbits of Algorithm 2.4.
Theorem 2.8
Let (xn)n∈N be a sequence generated by
Algorithm 2.4 and suppose that the following hold:
[a]
(xn)n∈N is bounded.
[b]
W(xn)n∈N⊂intdomf.
[c]
Algorithm 2.4 is focusing.
[d]
One of the following is satisfied:
1/
S is a singleton.
2/
There exists a function g in Γ0(X) which is
Gâteaux differentiable on intdomg⊃C, with
∇g strictly monotone on C, and such that,
for every sequence (yn)n∈N in C and every
y∈W(yn)n∈N∩C,
ykn⇀y ⇒
∇fkn(ykn)⇀∇g(y).
Then (xn)n∈N converges weakly to a point in
S.
Proof. It results from [a] and the reflexivity of X that
[TABLE]
On the other hand, [c] and items
(i)–(iv) in
Proposition 2.5 yield W(xn)n∈N⊂zer(A+B).
In turn, it results from [b] that
[TABLE]
In view of [7, Lemma 1.35] applied in Xweak,
it remains to show that W(xn)n∈N is a singleton.
If [d]1/ holds, this follows from (2.38).
Now suppose that [d]2/ holds,
and take y1 and y2 in W(xn)n∈N, say
xkn⇀y1 and xln⇀y2. Then
y1∈S and y2∈S
by virtue of (2.38), and we therefore
deduce from Proposition 2.5(i) that
(Dfn(y1,xn))n∈N and
(Dfn(y2,xn))n∈N converge.
However, condition [b] in Algorithm 2.4
and [7, Lemma 5.31] assert that
(Dfn(y1,y2))n∈N converges.
Hence, appealing to [6, Proposition 2.3(ii)],
it follows that
(⟨y1−y2,∇fn(xn)−∇fn(y2)⟩)n∈N=(Dfn(y2,xn)+Dfn(y1,y2)−Dfn(y1,xn))n∈N
converges. Set ℓ=lim⟨y1−y2,∇fn(xn)−∇fn(y2)⟩. Since (xn)n∈N is a sequence in
C, we infer from (2.38) and [d]2/ that
ℓ←⟨y1−y2,∇fln(xln)−∇fln(y2)⟩→⟨y1−y2,∇g(y2)−∇g(y2)⟩=0, which yields ℓ=0.
However, invoking [d]2/, we obtain
ℓ←⟨y1−y2,∇fkn(xkn)−∇fkn(y2)⟩→⟨y1−y2,∇g(y1)−∇g(y2)⟩. It therefore follows that
⟨y1−y2,∇g(y1)−∇g(y2)⟩=0 and hence from
the strict monotonicity of ∇g on C that y1=y2.
Example 2.9
We provide an example with operating conditions that are not
captured by any of the methods described in
(1.4)–(1.7). Let p∈]1,+∞[, let
(χn)n∈N be a sequence in [1,+∞[ such
that χn→1, and let (ηn)n∈N be a summable
sequence in [0,+∞[ such that (∀n∈N)
χn+1⩽(1+ηn)χn. We denote by
z=(ζk)k∈N
a sequence in ℓp(N). Set X=ℓp(N)×R,
hence X∗=ℓp/(p−1)(N)×R, and define the
Legendre functions
[TABLE]
and
[TABLE]
Now let ψ:X→[0,+∞[:(z,ξ)↦∥z∥p/p, set
B=∇ψ, and let A:X→2X∗ be any maximally
monotone operator such that
[TABLE]
Let us check that this setting conforms to that of
Theorem 2.8.
First, Proposition 2.1(iii) implies that
(1.1) is satisfied with δ1=0 and
δ2=κ=1. Next, we note that
intdomf=ℓp(N)×]0,+∞[, that (fn)n∈N lies in
C1(f), and that condition [b] in Algorithm 2.4
holds. Furthermore, we derive from (2.39) that
[TABLE]
and we observe that
[TABLE]
It therefore follows from the Brézis–Haraux theorem
[11, Théorème 4] that
[TABLE]
and hence that condition [c] in Algorithm 2.4
holds. It remains to verify condition [d]2/ in
Theorem 2.8.
Set φ:ℓp(N)→[0,+∞[:z↦∥z∥p/p and
(∀n∈N) φn:ℓp(N)→[0,+∞[:z↦χn∥z∥p/p.
Take a sequence
(zn,ξn)n∈N in domA and a point
(z,ξ)∈domA such that (zn,ξn)⇀(z,ξ).
We have ξn→ξ and (∀k∈N)
ζn,k→ζk. Now let
(ek)k∈N be the canonical Schauder basis of ℓp(N).
Then
[TABLE]
and (∇φn(zn))n∈N is bounded. It therefore
follows from [2, Théorème VIII-2] that
∇φn(zn)⇀∇φ(z) and, in turn,
that ∇fn(zn,ξn)⇀∇g(z,ξ) by
(2.40) and (2.42).
Note that the above setting is not covered by the assumptions
underlying (1.4)–(1.7):
the fact that B≠0 excludes [6],
the fact that X is not a Hilbert space excludes
[15] and [20],
and [18] is excluded because
A is not a subdifferential.
3 Special cases and applications
We illustrate the general scope of Theorem 2.8 by
recovering apparently unrelated results and also by deriving
new ones. Sufficient conditions for
[a] and [b] in Theorem 2.8 to hold can be
found in Propositions 2.5(v) and 2.6,
respectively. As to checking the focusing condition [c], the
following fact will be useful.
Lemma 3.1
[13, Proposition 2.1(iii)]
Let M1:X→2X∗ and M2:X→2X∗ be
maximally monotone, let (an,an∗)n∈N be a sequence
in graM1, let (bn,bn∗)n∈N be a sequence in
graM2, let x∈X, and let y∗∈X∗.
Suppose that an⇀x, bn∗⇀y∗,
an∗+bn∗→0, and an−bn→0. Then
x∈zer(M1+M2).
3.1 Recovering existing frameworks for monotone inclusions
In this section, we show that the existing results of
[6, 15, 20] discussed in the Introduction can be
recovered from Theorem 2.8. As will be clear from the proofs,
more general versions of these results can also be derived at once
from Theorem 2.8.
First, we derive from Theorem 2.8 the convergence of the
Bregman-based proximal point algorithm (1.4) studied in
[6, Section 5.5].
Corollary 3.2
Let A:X→2X∗ be maximally monotone, let
f∈Γ0(X) be a supercoercive Legendre function such that
∅≠zerA⊂domA⊂intdomf and ∇f
is weakly sequentially continuous, and let
(γn)n∈N be a
sequence in ]0,+∞[ such that infn∈Nγn>0.
Suppose that, for every bounded sequence (yn)n∈N
in intdomf,
[TABLE]
Take x0∈C and set (∀n∈N)
xn+1=(∇f+γnA)−1(∇f(xn)).
Then (xn)n∈N converges weakly to a point in zerA.
Proof. We apply Theorem 2.8 with B=0, α=1,
κ=δ1=δ2=0, and (∀n∈N) fn=f.
First, (1.1) together with
conditions [a] and [b] in Algorithm 2.4
are trivially fulfilled. On the other hand,
since f is a Legendre function and domA⊂intdomf,
condition [c] in Algorithm 2.4
follows from [6, Theorem 3.13(iv)(d)].
Next, condition [a] in Theorem 2.8 follows from
Proposition 2.5(v)[b]. Furthermore,
in view of the weak sequential continuity of ∇f,
condition [d]2/ in Theorem 2.8
is satisfied with g=f. Next, to show that the algorithm
is focusing, suppose that
∑n∈NDf(xn+1,xn)<+∞
and take x∈W(xn)n∈N, say xkn⇀x.
Since (xn)n∈N is a bounded sequence in intdomf,
we derive from (3.1) that
∇f(xn+1)−∇f(xn)→0.
In turn, since infn∈Nγn>0,
it follows that γn−1(∇f(xn+1)−∇f(xn))→0. However, by construction,
(∀n∈N) γkn−1−1(∇f(xkn−1)−∇f(xkn))∈Axkn. Therefore, upon invoking Lemma 3.1 (with
M1=A and M2=0),
we obtain x∈zerA and the algorithm is therefore focusing.
This also shows that W(xn)n∈N⊂zerA⊂intdomf. Condition [b] in Theorem 2.8
is thus satisfied.
The next application of Theorem 2.8 is a variable metric
version of the Hilbertian forward-backward method
(1.5) established in [15, Theorem 4.1].
Corollary 3.3
Let X be a real Hilbert space, let A:X→2X
be maximally monotone, let α and β be in ]0,+∞[,
and let B:X→X satisfy
[TABLE]
Further, for every n∈N, let Un:X→X be
a bounded linear operator which is α-strongly monotone
and self-adjoint. Suppose that zer(A+B)≠∅ and that
there exists a summable sequence (ηn)n∈N in [0,+∞[
such that
[TABLE]
Let ε∈]0,2β[
and let (γn)n∈N be a sequence in ]0,+∞[
such that 0<infn∈Nγn⩽supn∈Nγn⩽(2β−ε)α. Define a sequence
(xn)n∈N via the recursion
\[(\forall n\in\mathbb{N})\quad x_{n+1}=(U_n+\gamma_nA)^{-1}(U_nx_n-\gamma_nBx_n). \tag{3.4}\]
Then (xn)n∈N converges weakly to a point in
zer(A+B).
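A small numerical sketch may help to visualize the variable metric iteration. It assumes the update xn+1=(Un+γnA)−1(Unxn−γnBxn), which is what Algorithm 2.4 gives for the quadratic kernels fn=⟨⋅∣Un⋅⟩/2, and it uses hypothetical data in X=R^d: A=∂(λ∥⋅∥1), Bx=x−b (which is 1-cocoercive), and diagonal operators Un with entries 1+ci/(n+1)^2. These entries decrease to 1, so the operators are 1-strongly monotone and the Bregman distances Dfn decrease with n, which is one way to satisfy condition [b] of Algorithm 2.4 with ηn=0.

```python
import math


def soft_threshold(v, t):
    """Return sign(v) * max(|v| - t, 0)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def variable_metric_fb(b, lam, gamma, c, x0, n_iter=50):
    """Sketch of x_{n+1} = (U_n + gamma*A)^{-1}(U_n x_n - gamma*B x_n).

    Illustrative assumptions: A is the subdifferential of lam * ||.||_1,
    B x = x - b, and U_n = diag(1 + c_i / (n + 1)**2) with c_i >= 0.
    """
    x = list(x0)
    for n in range(n_iter):
        # Diagonal metric for this iteration; its entries decrease to 1.
        u = [1.0 + ci / (n + 1) ** 2 for ci in c]
        # Forward step: U_n x_n - gamma * B x_n.
        v = [ui * xi - gamma * (xi - bi) for ui, xi, bi in zip(u, x, b)]
        # Backward step: solve u_i * y + gamma*lam*s = v_i with s in d|y|.
        x = [soft_threshold(vi, gamma * lam) / ui for vi, ui in zip(v, u)]
    return x


b = [3.0, -0.5, -4.0]
x = variable_metric_fb(b, lam=1.0, gamma=1.0, c=[1.0, 3.0, 0.5],
                       x0=[0.0, 0.0, 0.0])
print(x)  # compare with the soft-thresholding of b at level lam
```

Whatever the metrics, the limit is a zero of A+B, which for these data is the soft-thresholding of b at level λ; only the path taken by the iterates depends on (Un)n∈N.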
Proof. Set f=∥⋅∥2/2, C=domA, and
S=zer(A+B). In addition, for every n∈N,
define fn:X→R:x↦⟨x∣Unx⟩/2.
Let us apply Theorem 2.8 with
κ=1/(2β−ε), δ1=0, and
δ2=(2β−ε)/(2β)∈]0,1[.
First, f∈Γ0(X) is a
supercoercive Legendre function with
domf=X and, for every n∈N,
since ∇fn=Un is α-strongly monotone,
fn∈Cα(f).
Furthermore, it follows from
Proposition 2.1(vi)[a]
that (1.1) is fulfilled.
We also observe that condition [a]
in Algorithm 2.4 is satisfied.
Next, by (3.3) and the assumption that the operators
(Un)n∈N are self-adjoint,
[TABLE]
and condition [b] in Algorithm 2.4 therefore
holds. Now take n∈N. Since ∇fn=Un is
maximally monotone with dom∇fn=X and A is maximally
monotone, [7, Corollary 25.5(i)] entails that
∇fn+γnA is maximally monotone.
Thus, since ∇fn+γnA is α-strongly
monotone, [7, Proposition 22.11(ii)] implies that
ran(∇fn+γnA)=X and it follows that
condition [c] in Algorithm 2.4 is satisfied.
Next, in view of
Proposition 2.5(v)[b], (xn)n∈N
is bounded, while W(xn)n∈N⊂X=intdomf.
Now set μ=supn∈N∥Un∥.
For every n∈N, since it results from
(3.3) and [7, Fact 2.25(iii)] that
[TABLE]
we derive from [7, Fact 2.25(iii)] that
∥Un∥⩽∥U0∥∏k∈N(1+ηk). Hence μ<+∞
and therefore, appealing to [14, Lemma 2.3(i)], there
exists an α-strongly monotone self-adjoint bounded linear
operator U:X→X such that
(∀w∈X) Unw→Uw. Define
g:X→R:x↦⟨x∣Ux⟩/2. Then ∇g=U
is strongly monotone (and thus strictly monotone). Furthermore,
given (yn)n∈N in C and y∈W(yn)n∈N∩C,
say ykn⇀y, we have
[TABLE]
and thus ∇fkn(ykn)⇀∇g(y).
Therefore, condition [d]2/ in Theorem 2.8 is
satisfied. Let us now verify that (3.4) is
focusing. Towards this goal,
take z∈S and
suppose that ∑n∈N(1−δ2)⟨xn−z∣Bxn−Bz⟩<+∞ and ∑n∈N(1−κγn/α)Dfn(xn+1,xn)<+∞.
Since δ2<1 and supn∈N(κγn)<α,
we infer from (3.2) that
[TABLE]
and ∑n∈N∥xn+1−xn∥2=2∑n∈NDf(xn+1,xn)⩽(2/α)∑n∈NDfn(xn+1,xn)<+∞.
It follows that
[TABLE]
Now take x∈W(xn)n∈N, say xkn⇀x,
and set (∀n∈N)
xn+1∗=γn−1Un(xn−xn+1)−Bxn.
It results from (3.4) that
(xkn+1,xkn+1∗)n∈N lies in graA
and from (3.9) that xkn+1⇀x.
Moreover, (3.9) yields xkn+1∗+Bxkn→0.
Altogether, Lemma 3.1
(applied to the sequences (xkn+1,xkn+1∗)n∈N
in graA and (xkn,Bxkn)n∈N in graB)
guarantees that x∈zer(A+B).
Consequently, Theorem 2.8 asserts that (xn)n∈N
converges weakly to a point in S.
Example 3.4
The classical forward-backward method is obtained
by setting Un≡Id in Corollary 3.3, which yields
\[(\forall n\in\mathbb{N})\quad x_{n+1}=(\mathrm{Id}+\gamma_nA)^{-1}(x_n-\gamma_nBx_n). \tag{3.10}\]
The case when the proximal parameters (γn)n∈N
are constant was first addressed in [17].
We now turn to the Renaud–Cohen algorithm (1.7)
and recover [20, Theorem 3.4].
Corollary 3.5
Let X be a real Hilbert space, let A:X→2X
and B:X→X be maximally monotone,
and let f:X→R be convex and Fréchet differentiable.
Suppose that zer(A+B)≠∅, that
∇f is 1-strongly monotone on domA
and Lipschitzian on bounded sets, and that there exists
β∈]0,+∞[ such that
[TABLE]
Let γ∈]0,2β[,
take x0∈domA, and set (∀n∈N)
xn+1=(∇f+γA)−1(∇f(xn)−γBxn).
Suppose, in addition, that ∇f is
weakly sequentially continuous.
Then (xn)n∈N converges weakly to a point in
zer(A+B).
Proof. Let ε∈]0,2β[ be such that
γ<2β−ε.
We apply Theorem 2.8 with C=domA, α=1,
κ=1/(2β−ε), δ1=δ2=(2β−ε)/(2β)∈]0,1[, and
(∀n∈N) fn=f and ηn=0.
Proposition 2.1(iv) asserts that
(1.1) is satisfied.
Furthermore, as shown in the proof of
Proposition 2.1(iv),
[TABLE]
Next, note that conditions [a] and [b]
in Algorithm 2.4 are trivially
satisfied. Since ∇f+γA is strongly
monotone and since, by [7, Corollary 25.5(i)],
∇f+γA is maximally monotone,
it follows from [7, Proposition 22.11(ii)] that
ran(∇f+γA)=X and therefore that
condition [c] in Algorithm 2.4 holds.
We observe that
condition [b] in Theorem 2.8 is trivially satisfied
and that condition [a] in Theorem 2.8 follows from
(3.12) and Proposition 2.5(i).
Furthermore, since ∇f is weakly sequentially
continuous and 1-strongly monotone on C,
condition [d]2/ in Theorem 2.8 is satisfied
with g=f. Now take z∈zer(A+B)
and suppose that
∑n∈N(1−κγ)Df(xn+1,xn)<+∞,
∑n∈N(1−δ2)⟨xn−z∣Bxn−Bz⟩<+∞,
and ∑n∈N⟨xn+1−z∣γ−1(∇f(xn)−∇f(xn+1))−Bxn+Bz⟩<+∞. Then, since
κγ<1 and δ2<1, it follows that
[TABLE]
and therefore that
[TABLE]
Since (z,0)∈gra(A+B) and since the sequence
(xn+1,γ−1(∇f(xn)−∇f(xn+1))−Bxn+Bxn+1)n∈N lies in gra(A+B) by construction,
it follows from (3.11) and (3.14) that
∑n∈N∥Bxn−Bz∥2<+∞.
On the other hand, since (xn)n∈N lies in domA by
Proposition 2.5, we deduce from (3.12)
and (3.13) that
xn+1−xn→0. In turn, it results from the Lipschitz
continuity of ∇f on the bounded set
{xn}n∈N that ∇f(xn)−∇f(xn+1)→0. Now take x∈W(xn)n∈N, say xkn⇀x,
and set (∀n∈N) xn+1∗=γ−1(∇f(xn)−∇f(xn+1))−Bxn.
Then (xkn+1,xkn+1∗)n∈N lies in
graA. Furthermore, xkn+1∗+Bxkn=γ−1(∇f(xkn)−∇f(xkn+1))→0
and, since xn−xn+1→0, xkn+1⇀x.
Thus, applying Lemma 3.1 with the sequences
(xkn+1,xkn+1∗)n∈N and
(xkn,Bxkn)n∈N yields x∈zer(A+B), and we
conclude that condition [c] in Theorem 2.8 is
satisfied as well.
3.2 The finite-dimensional case
We discuss the finite-dimensional case, a setting in which the
assumptions can be greatly simplified and the results presented
below are new.
Corollary 3.6
Let (xn)n∈N be a sequence generated by
Algorithm 2.4. In addition, suppose that the following
hold:
[a]
X is finite-dimensional.
[b]
f is essentially strictly convex and domf∗ is open.
[c]
(intdomf)∩domA⊂intdomB.
[d]
supn∈N(κγn)<α.
[e]
There exists a function g in Γ0(X) which is
differentiable on intdomg⊃intdomf, with
∇g strictly monotone on C, and such that,
for every sequence (yn)n∈N in C and every
sequential cluster point y∈intdomf of (yn)n∈N,
ykn→y ⇒
∇fkn(ykn)→∇g(y).
Then (xn)n∈N converges to a point in S.
Proof. It follows from
Proposition 2.5(v)[e] that
(xn)n∈N is bounded and from
Proposition 2.6[d] that
W(xn)n∈N⊂intdomf.
In view of Theorem 2.8, it remains to show that
Algorithm 2.4 is focusing.
Towards this goal, let z∈S,
and suppose that (Dfn(z,xn))n∈N
converges and ∑n∈N(1−κγn/α)Dfn(xn+1,xn)<+∞,
and let x be a sequential cluster point of
(xn)n∈N, say xkn→x.
Using [d] and the fact that
(fn)n∈N lies in Cα(f), we obtain
[TABLE]
Since (xkn)n∈N lies in intdomf,
[4, Theorem 3.8(ii)] and (3.15) imply that
[TABLE]
and [5, Theorem 5.10] thus yields
[TABLE]
Next, it results from [b],
[5, Lemma 7.3(vii)], and (3.15) that
[TABLE]
Therefore, since ∇f(z)∈intdomf∗
[5, Theorem 5.10] and since f∗ is a Legendre
function [5, Corollary 5.5], it results from
[5, Lemma 7.3(v)] that (∇f(xkn+1))n∈N
is bounded. In turn, there exists a
strictly increasing sequence (lkn)n∈N in
N and a point x∗∈X∗ such that
[TABLE]
By lower semicontinuity of Df∗(⋅,∇f(z)) and (3.18), x∗∈domf∗. On the other
hand, appealing to [5, Lemma 7.3(vii)] and
(3.15), we obtain
[TABLE]
Thus, since (∇f(xn))n∈N lies in intdomf∗
by virtue of Proposition 2.5 and
[5, Theorem 5.10], we derive from
[4, Theorem 3.9(iii)], (3.17), and (3.19)
that x∗=∇f(x) and, hence, from (3.19) that
∇f(xlkn+1)→∇f(x).
It thus follows from [5, Theorem 5.10] that
xlkn+1→x. In turn, by using respectively [e]
with the sequences (xn)n∈N and (xn+1)n∈N,
we get ∇flkn(xlkn)→∇g(x) and
∇flkn(xlkn+1)→∇g(x).
Now set (∀n∈N) xn+1∗=γn−1(∇fn(xn)−∇fn(xn+1))−Bxn.
Then, by construction of (xn)n∈N,
(∀n∈N) (xn+1,xn+1∗)∈graA.
In addition, since infn∈Nγn>0
and ∇flkn(xlkn)−∇flkn(xlkn+1)→∇g(x)−∇g(x)=0,
we deduce that xlkn+1∗+Bxlkn→0.
On the other hand, since (xn)n∈N lies in domA
and xkn→x, it follows that x∈domA
and therefore, by (3.16) and [c], that
x∈intdomB. Hence, using [21, Corollary 1.1],
we obtain Bxlkn→Bx.
Altogether, Lemma 3.1 (applied to the sequence
(xlkn+1,xlkn+1∗)n∈N in graA
and the sequence (xlkn,Bxlkn)n∈N in
graB) asserts that x∈zer(A+B).
In view of Theorem 2.8, we conclude that (xn)n∈N
converges to a point in S.
3.3 Forward-backward splitting for convex minimization
In this section, we study the convergence of (1.6). Our
results improve on and complement those of [18].
Problem 3.7
Let φ∈Γ0(X), let ψ∈Γ0(X),
and let f∈Γ0(X) be essentially smooth.
Set C=(intdomf)∩dom∂φ
and S=(intdomf)∩Argmin(φ+ψ).
Suppose that φ+ψ is coercive,
∅≠C⊂intdomψ, S≠∅,
ψ is Gâteaux differentiable on intdomψ, and
there exists κ∈]0,+∞[ such that
[TABLE]
The objective is to find a point in S.
In the context of Problem 3.7, given
γ∈]0,+∞[ and g∈Cα(f), we define
proxγφg=(∇g+γ∂φ)^{−1}.
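To make this definition concrete, here is a hedged sketch of two standard instances. Both rest on assumptions that are not part of Problem 3.7: in the first, X is a Hilbert space identified with its dual and g=∥⋅∥²/2; in the second, X=ℝ^N, g is the Boltzmann–Shannon entropy, and φ is the indicator function ι_Δ of the unit simplex Δ. Whether such a g belongs to Cα(f) depends on the definition given in Section 2 and is not verified here.

```latex
% Case 1 (assumption: X Hilbert, g = \|\cdot\|^2/2, hence \nabla g = \mathrm{Id}):
\operatorname{prox}^{g}_{\gamma\varphi}
  = (\mathrm{Id}+\gamma\partial\varphi)^{-1}
  = \operatorname{prox}_{\gamma\varphi}.

% Case 2 (assumption: X = \mathbb{R}^N, entropy kernel, simplex constraint):
g(x)=\sum_{i=1}^{N}\bigl(x_i\ln x_i-x_i\bigr),
\qquad
\varphi=\iota_{\Delta},
\qquad
\Delta=\Bigl\{x\in[0,+\infty[^N \;\Big|\; \sum_{i=1}^{N}x_i=1\Bigr\}.

% For p \in \Delta with positive entries, \nabla g(p)=(\ln p_i)_{1\le i\le N}
% and the normal cone is N_{\Delta}(p)=\mathbb{R}\mathbf{1}. Hence, for x^*\in\mathbb{R}^N,
p=\operatorname{prox}^{g}_{\gamma\varphi}(x^*)
\;\Leftrightarrow\;
x^*-\nabla g(p)\in\gamma N_{\Delta}(p)
\;\Leftrightarrow\;
(\exists\,t\in\mathbb{R})(\forall i)\;\ln p_i=x^*_i-t,

% and the constraint \sum_i p_i = 1 determines t, which yields
p_i=\frac{\exp(x^*_i)}{\sum_{j=1}^{N}\exp(x^*_j)}
\qquad(1\le i\le N).
```

In the second case the operator is therefore a softmax map, independent of γ, whereas the Euclidean kernel would require a projection onto Δ.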
Algorithm 3.8
Consider the setting of Problem 3.7.
Let α∈]0,+∞[,
let (γn)n∈N be in ]0,+∞[, and
let (fn)n∈N be in Cα(f).
Suppose that the following hold:
[a] There exists ε∈]0,1[ such that
0<infn∈Nγn⩽supn∈Nγn⩽α(1−ε)/κ.
[b] There exists a summable sequence (ηn)n∈N in [0,+∞[
such that (∀n∈N) Dfn+1⩽(1+ηn)Dfn.
[c] For every n∈N, intdomfn=dom∂fn
and ∇fn is strictly monotone on C.
Take x0∈C and set (∀n∈N)
xn+1=proxγnφfn(∇fn(xn)−γn∇ψ(xn)).
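As an illustration only, the following Python sketch runs this iteration in a small Euclidean instance. Everything specific in it is an assumption made for the example and does not come from the paper: X=ℝ³, a single kernel fn=f given by the Boltzmann–Shannon entropy (with α=1 assumed), φ the indicator of the unit simplex, and ψ(x)=½∥Mx−b∥². With these choices the backward step is a softmax, and Pinsker's inequality gives Dψ⩽κDf on the simplex with κ equal to the largest squared column norm of M. The helper names (`softmax`, `bregman_forward_backward`) are hypothetical.

```python
import math

def softmax(u):
    """Return exp(u)/sum(exp(u)), computed stably."""
    m = max(u)
    e = [math.exp(ui - m) for ui in u]
    s = sum(e)
    return [ei / s for ei in e]

def psi(M, b, x):
    """Smooth term psi(x) = 0.5*||Mx - b||^2."""
    return 0.5 * sum((sum(mij * xj for mij, xj in zip(row, x)) - bi) ** 2
                     for row, bi in zip(M, b))

def grad_psi(M, b, x):
    """Gradient of psi, namely M^T (Mx - b)."""
    r = [sum(mij * xj for mij, xj in zip(row, x)) - bi for row, bi in zip(M, b)]
    return [sum(M[i][j] * r[i] for i in range(len(M))) for j in range(len(x))]

def bregman_forward_backward(M, b, x0, gamma, n_iter):
    """Iterate x_{n+1} = prox(grad f(x_n) - gamma*grad psi(x_n)), entropy kernel."""
    x, values = list(x0), [psi(M, b, x0)]
    for _ in range(n_iter):
        g = grad_psi(M, b, x)
        # Forward step in the dual space (grad f = log), then backward step (softmax).
        x = softmax([math.log(xi) - gamma * gi for xi, gi in zip(x, g)])
        values.append(psi(M, b, x))
    return x, values

# Toy data (assumed): b = M applied to (0.2, 0.3, 0.5), so a minimizer has positive entries.
M = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
b = [1.2, 0.8]
kappa = max(sum(M[i][j] ** 2 for i in range(len(M))) for j in range(3))
gamma = 0.9 / kappa  # constant step size with alpha = 1 and epsilon = 0.1
x, values = bregman_forward_backward(M, b, [1 / 3, 1 / 3, 1 / 3], gamma, 500)
```

The iterates remain in the relative interior of the simplex by construction, so no separate viability check is needed in this instance, and the recorded objective values can be inspected for the monotone decrease that the step-size bound in [a] is designed to ensure.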
Theorem 3.9
Let (xn)n∈N be a sequence generated by
Algorithm 3.8 and suppose that the following hold:
[a] W(xn)n∈N⊂intdomf.
[b] One of the following is satisfied:
1/ S is a singleton.
2/ There exists a function g in Γ0(X) which is
Gâteaux differentiable on intdomg⊃C, with
∇g strictly monotone on C, and such that,
for every sequence (yn)n∈N in C and every
y∈W(yn)n∈N∩C,
ykn⇀y ⇒
∇fkn(ykn)⇀∇g(y).
Then the following hold:
(i) (xn)n∈N converges weakly to a point in S.
(ii) (xn)n∈N is a monotone minimizing sequence:
φ(xn)+ψ(xn)↓min(φ+ψ)(X).
(iii) ∑n∈N((φ+ψ)(xn)−min(φ+ψ)(X))<+∞
and (φ+ψ)(xn)−min(φ+ψ)(X)=o(1/n).
(iv) ∑n∈Nn(Dfn(xn+1,xn)+Dfn(xn,xn+1))<+∞.
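The conclusions (ii) to (iv) can be observed numerically on a toy instance. The sketch below is a sanity check under assumptions of our own, not a verification of the theorem: X=ℝ³, fn=f is the Boltzmann–Shannon entropy (α=1 assumed), φ is the indicator of the unit simplex, and ψ(x)=½∥x−b∥² with b in the relative interior of the simplex, so that the minimum value is 0 and Dψ⩽Df on the simplex (κ=1) by Pinsker's inequality. All names are hypothetical.

```python
import math

def softmax(u):
    """Return exp(u)/sum(exp(u)), computed stably."""
    m = max(u)
    e = [math.exp(ui - m) for ui in u]
    s = sum(e)
    return [ei / s for ei in e]

b = [0.5, 0.3, 0.2]  # assumed minimizer of psi over the simplex; the minimum value is 0
gamma = 0.5          # kappa = 1 and alpha = 1 here, so gamma <= (1 - eps)/kappa with eps = 0.5

def theta(x):
    """Objective phi + psi on the simplex, where phi vanishes."""
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))

def sym_bregman(x, y):
    """D_f(x,y) + D_f(y,x) = <x - y, grad f(x) - grad f(y)> for the entropy kernel."""
    return sum((xi - yi) * (math.log(xi) - math.log(yi)) for xi, yi in zip(x, y))

x = [1 / 3, 1 / 3, 1 / 3]
gaps = [theta(x)]    # gaps[n] = theta(x_n) - min theta
weighted_sum = 0.0   # partial sum of n*(D_f(x_{n+1},x_n) + D_f(x_n,x_{n+1}))
for n in range(300):
    # grad psi(x) = x - b; the Bregman backward step is a softmax in this setting.
    x_next = softmax([math.log(xi) - gamma * (xi - bi) for xi, bi in zip(x, b)])
    weighted_sum += n * sym_bregman(x_next, x)
    x = x_next
    gaps.append(theta(x))
```

Here `gaps` should be nonincreasing, `n * gaps[n]` should tend to 0, and `weighted_sum` should stay bounded as the number of iterations grows, in line with (ii), (iii), and (iv) respectively.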
Proof. (i):
We shall derive this result from Theorem 2.8 with
A=∂φ, B=∂ψ, δ1=0,
and δ2=1.
First, appealing to [24, Theorem 2.4.4(i)],
B is single-valued on intdomB=intdomψ and B=∇ψ
on intdomB. Next, set θ=φ+ψ.
Since ∅≠(intdomf)∩dom∂φ⊂intdomψ, we have
domφ∩intdomψ≠∅. Hence,
[9, Theorem 4.1.19] yields A+B=∂θ.
Therefore, Argminθ=zer∂θ=zer(A+B)
and S=(intdomf)∩zer(A+B).
Next, in view of Proposition 2.1(iii),
(1.1) is fulfilled.
On the other hand, conditions [a] and [b]
in Algorithm 2.4 are trivially satisfied.
To verify condition [c] in Algorithm 2.4,
it suffices to show that,
for every n∈N, (∇fn−γnB)(C)⊂ran(∇fn+γnA), i.e.,
since C⊂intdomB and B=∇ψ on intdomB,
that (∇fn−γn∇ψ)(C)⊂ran(∇fn+γnA). To do so,
fix temporarily n∈N, let x∈C, and set
[TABLE]
Then, since dom∂fn∩domA=(intdomfn)∩domA=(intdomf)∩domA≠∅ by condition [c] in
Algorithm 3.8, it results from
[6, Proposition 3.12] that An is maximally
monotone. Next, we deduce from condition [a] in
Algorithm 3.8 and (3.21) that
[TABLE]
In turn,
[TABLE]
However, by coercivity of θ, there exists
ρ∈]0,+∞[ such that
[TABLE]
Now suppose that (y,y∗)∈graAn(⋅+x) satisfies
∥y∥⩾ρ. Then y+x∈dom∇fn∩domA=(intdomfn)∩domA=C and
y∗−∇fn(y+x)+γn∇ψ(y+x)+∇fn(x)−γn∇ψ(x)∈γn(A+B)(y+x).
Thus, it follows from (3.25) and (3.24) that
[TABLE]
Therefore, in view of [22, Proposition 2] and
the maximal monotonicity of An(⋅+x),
there exists y∈X such that
0∈An(y+x).
Hence (∇fn−γn∇ψ)(x)∈∇fn(y+x)+γnA(y+x)⊂ran(∇fn+γnA), as desired.
Since (xn+1,γn^{−1}(∇fn(xn)−∇fn(xn+1))−∇ψ(xn))
lies in gra∂φ by construction,
we derive from [6, Proposition 2.3(ii)] that
[TABLE]
On the other hand, (3.23) and the convexity of ψ
entail that
[TABLE]
Altogether, upon adding (3.3) and (3.28), we
obtain
[TABLE]
In particular, since xn∈C,
[TABLE]
This shows that
[TABLE]
In turn, using the coercivity of θ,
we infer that (xn)n∈N is bounded, which secures
[a] in Theorem 2.8. It remains to verify
that Algorithm 3.8 is focusing.
Towards this end, let z∈S and suppose that
[TABLE]
and
[TABLE]
Set γ=infn∈Nγn and ℓ=limDfn(z,xn).
It follows from (3.29) applied to z∈C that
[TABLE]
and therefore from condition [b] in
Algorithm 3.8 that
[TABLE]
Hence,
lim sup γ(θ(xn+1)−minθ(X))+ℓ⩽ℓ
and therefore lim(θ(xn+1)−minθ(X))=0.
Thus
[TABLE]
Now take x∈W(xn)n∈N, say xkn⇀x.
By weak lower semicontinuity of θ,
minθ(X)⩽θ(x)⩽limθ(xkn)=minθ(X)
and it follows that x∈Argminθ=zer(A+B).
Consequently, Theorem 2.8 asserts that (xn)n∈N
converges weakly to a point in S.
(ii): Combine (3.31) and (3.36).
(iii)&(iv):
Fix z∈S and set γ=infn∈Nγn.
Arguing along the same lines as above, we obtain
[TABLE]
and therefore [7, Lemma 5.31] guarantees that
∑n∈N(θ(xn)−minθ(X))<+∞.
In addition, (θ(xn)−minθ(X))n∈N
is decreasing by virtue of (3.31). However, recall that
if (αn)n∈N is a decreasing sequence in [0,+∞[
such that ∑n∈Nαn<+∞, then
nαn→0 and ∑n∈Nn(αn−αn+1)<+∞.
Hence, θ(xn)−minθ(X)=o(1/n)
and ∑n∈Nn(θ(xn)−θ(xn+1))<+∞.
Consequently, since (3.29) yields
[TABLE]
we infer that
∑n∈Nn(Dfn(xn+1,xn)+Dfn(xn,xn+1))<+∞.
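For the reader's convenience, here is a short derivation of the elementary fact about decreasing summable sequences invoked in the proof of (iii)&(iv). It is a standard argument written out by way of illustration; the paper only recalls the statement.

```latex
% Let (\alpha_n)_{n\in\mathbb{N}} be decreasing in [0,+\infty[ with \sum_{n}\alpha_n<+\infty.
% Step 1: n\alpha_n \to 0. The block of indices \lfloor n/2\rfloor+1,\dots,n contains
% \lceil n/2\rceil \ge n/2 terms, each at least \alpha_n, so
\frac{n}{2}\,\alpha_n
\;\le\; \sum_{k=\lfloor n/2\rfloor+1}^{n}\alpha_k
\;\le\; \sum_{k>\lfloor n/2\rfloor}\alpha_k
\;\xrightarrow[n\to+\infty]{}\;0 .

% Step 2: summation by parts. For every N\ge 1,
\sum_{n=0}^{N} n(\alpha_n-\alpha_{n+1})
\;=\; \sum_{n=1}^{N}\alpha_n \;-\; N\alpha_{N+1}
\;\le\; \sum_{n\ge 1}\alpha_n \;<\;+\infty .

% Since every term n(\alpha_n-\alpha_{n+1}) is nonnegative, the partial sums are
% increasing and bounded, hence
\sum_{n\in\mathbb{N}} n(\alpha_n-\alpha_{n+1})<+\infty .
```

Applied to αn=θ(xn)−minθ(X), Step 1 gives the o(1/n) rate and Step 2 gives the summability of (n(θ(xn)−θ(xn+1)))n∈N.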
Remark 3.10
Let us relate Theorem 3.9 to the literature.
(i) The conclusions of items (i) and (ii)
are obtained in
[18, Theorem 1(2)] under
more restrictive conditions on the sequences
(γn)n∈N and (fn)n∈N. Specifically,
Theorem 3.9 does not require the additional
condition (∀n∈N)
(1+ηn)γn−γn+1⩽αηn/κ.
Furthermore, we suppose neither that
−ran∇ψ⊂domφ∗ nor that the functions
(fn)n∈N are cofinite.
(ii) Items (iii) and (iv) are new even in
Euclidean spaces. In the finite-dimensional setting,
partial results can be found in [3], where:
(a) A single convex function is used: (∀n∈N) fn=f.
(b) The viability of the sequence (xn)n∈N is a blanket
assumption, while it is guaranteed in Theorem 3.9.
(c) Only the rates ∑n∈NDf(xn+1,xn)<+∞ and
(φ+ψ)(xn)−min(φ+ψ)(X)=O(1/n) are obtained.
3.4 Further applications
Theorems 2.8 and 3.9 operate under broad assumptions
which go beyond those of the existing forward-backward methods of
[6, 15, 18, 20] described in
(1.4)–(1.7). Here are two examples which do not fit
the existing scenarios and exploit this generality.
Example 3.11
Consider the setting of Problem 1.1.
Suppose, in addition, that the following hold:
[a] A is uniformly monotone on bounded sets.
[b] There exist ψ∈Γ0(X) and κ∈]0,+∞[
such that B=∂ψ and (∀x∈C)(∀y∈C)
Dψ(x,y)⩽κDf(x,y).
[c] f is supercoercive.
[d] zer(A+B)⊂intdomf.
Let (γn)n∈N be a sequence in ]0,+∞[ such that
0<infn∈Nγn⩽supn∈Nγn<1/κ,
take x0∈C, and set (∀n∈N) xn+1=(∇f+γnA)^{−1}(∇f(xn)−γn∇ψ(xn)).
Then (xn)n∈N converges strongly to the unique zero of
A+∇ψ.
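A one-dimensional toy instance may help visualize this iteration. The sketch below is built entirely on assumptions chosen for illustration and is not taken from the paper: X=ℝ, f(x)=x ln x−x on [0,+∞[, A=Id (which is uniformly monotone), and ψ=f, so that B=∇ψ=ln on ]0,+∞[ and κ=1. Whether this instance meets every hypothesis of Problem 1.1, in particular (1.1), is assumed rather than checked. The unique zero of A+∇ψ then solves x+ln x=0. The function names are hypothetical.

```python
import math

def resolvent(u, gamma):
    """Solve log(y) + gamma*y = u for y > 0, i.e. y = (grad f + gamma*A)^{-1}(u)."""
    # Unknown t = log(y); h(t) = t + gamma*exp(t) - u is increasing and convex,
    # and h(u) > 0, so Newton's method started at t = u decreases to the root.
    t = u
    for _ in range(100):
        step = (t + gamma * math.exp(t) - u) / (1.0 + gamma * math.exp(t))
        t -= step
        if abs(step) < 1e-15:
            break
    return math.exp(t)

def iterate(x0, gamma, n_iter):
    """Run x_{n+1} = (grad f + gamma*A)^{-1}(grad f(x_n) - gamma*grad psi(x_n))."""
    x = x0
    for _ in range(n_iter):
        # Since psi = f here, grad f(x) - gamma*grad psi(x) = (1 - gamma)*log(x).
        x = resolvent((1.0 - gamma) * math.log(x), gamma)
    return x

# Constant step size gamma = 0.5 < 1/kappa = 1, starting point x0 = 1 in C = ]0,+inf[.
x = iterate(1.0, 0.5, 100)
```

Each iterate stays in ]0,+∞[ because the inner equation is solved in the variable t=ln y, which mirrors the role of the essentially smooth kernel in keeping the sequence inside intdomf.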
The next example concerns variational inequalities.
Example 3.12
Let φ∈Γ0(X), let B:X→2X∗
be maximally monotone, let f∈Γ0(X) be
essentially smooth, and set
C=(intdomf)∩dom∂φ.
Suppose that C⊂intdomB and B is single-valued on
intdomB. Consider the problem of finding a point in
[TABLE]
which is assumed to be nonempty. This is a special case of
Problem 1.1 with A=∂φ and,
given x0∈C, Algorithm 2.4 produces the iterations
(∀n∈N) xn+1=proxγnφfn(∇fn(xn)−γnBxn). The weak convergence of
(xn)n∈N to a point in S is discussed in
Theorem 2.8. Even in Euclidean spaces, this scheme is new and
of interest since, as shown in [3, 13, 18], the
Bregman proximity operator proxγnφfn may be
easier to compute for a particular fn than for the standard
kernel ∥⋅∥²/2. Altogether, our framework makes it
possible to solve variational inequalities by forward-backward
splitting with non-cocoercive operators and/or outside of Hilbert
spaces.