A Lyapunov-type approach to convergence of the Douglas-Rachford algorithm
Minh N. Dao, Matthew K. Tam

TL;DR
This paper introduces a Lyapunov-type approach to prove convergence of the Douglas-Rachford algorithm in nonconvex settings, expanding its theoretical understanding and applicability.
Contribution
It provides the first convergence proof for Douglas-Rachford in certain nonconvex problems using a Lyapunov functional approach.
Findings
Convergence is established in nonconvex scenarios.
The Lyapunov functional does not require convexity of the original sets.
Examples demonstrate global convergence in nonconvex cases.
Abstract
The Douglas-Rachford projection algorithm is an iterative method used to find a point in the intersection of closed constraint sets. The algorithm has been experimentally observed to solve various nonconvex feasibility problems which current theory cannot sufficiently explain. In this paper, we prove convergence of the Douglas-Rachford algorithm in a potentially nonconvex setting. Our analysis relies on the existence of a Lyapunov-type functional whose convexity properties are not tantamount to convexity of the original constraint sets. Moreover, we provide various nonconvex examples in which our framework proves global convergence of the algorithm.
A Lyapunov-type approach to convergence of the Douglas–Rachford algorithm
Minh N. Dao, CARMA, University of Newcastle, Callaghan, NSW 2308, Australia.
Matthew K. Tam, Institut für Numerische und Angewandte Mathematik, Universität Göttingen, 37083 Göttingen, Germany.
Mathematics Subject Classification (MSC 2010): 90C26 47H10 37B25
Keywords: Douglas–Rachford algorithm, feasibility problem, global convergence, graph of a function, linear convergence, Lyapunov function, method of alternating projections, Newton’s method, nonconvex set, projection, stability, zero of a function
1 Introduction
The Douglas–Rachford algorithm (DRA) is an iterative method used to solve the so-called feasibility problem, which asks for a point in the intersection of closed constraint sets. The method generates a sequence by combining the nearest point projectors of the individual constraint sets, thereby exploiting the structure of problems in which these individual projectors can be computed efficiently or, at least, more efficiently than a direct attempt to solve the original problem. The origins of the method can be traced to the work of Douglas & Rachford [DR56], where it was proposed as a method for numerically solving problems arising in heat conduction. In the convex setting, the situation is fairly well understood; convergence is due to Lions & Mercier [LM79] and has since been refined in various works [BCL04, BDM16, BM17, Sva11].
In the absence of convexity, Borwein & Sims [BS11] established local convergence of the DRA applied to a prototypical nonconvex feasibility problem involving a line and sphere. Here “prototypical” is meant in the sense of being an accessible model for imaging problems where phase is to be reconstructed from magnitude measurements whilst retaining the mathematical complexities. The same prototype has since been studied in [AB13, Gil16]. In their paper, Borwein & Sims [BS11] conjectured that the DRA was actually globally convergent; a conjecture that was recently resolved in the affirmative by Benoist [Ben15] through a cleverly constructed Lyapunov function. Global convergence of the DRA for prototypical combinatorial optimization problems has also been proven in [ABT16, BD16].
A general approach to convergence of the DRA without convexity was provided by Phan [Pha16], to which work of Luke & Hesse [HL13] was a precursor. This approach follows related works, originating from [LLM09], which focus on the method of alternating projections and assume that local regularity properties of the underlying constraint sets hold near solutions of the problem (see also [NR16]). The main difficulty in applying these results lies in the fact that they give little information regarding the region of convergence, that is, the starting points from which the algorithm converges. Moreover, in practice, finding a point sufficiently close to a solution of a feasibility problem is often just as difficult as solving the original feasibility problem itself.
In this work, we generalize Benoist’s approach to the construction of Lyapunov-type functionals as a tool to prove convergence of the DRA. In particular, we show that convergence of the DRA is ensured provided that the constructed Lyapunov-type function possesses appropriate convexity properties. We emphasize here that the convexity properties of the Lyapunov-type function are independent of convexity of the underlying feasibility problem. As a consequence of our analysis, a region of convergence of the DRA can be identified by analyzing the Lyapunov-type function associated with the problem at hand.
The remainder of this paper is organized as follows. In Section 2, we introduce the necessary notions of nonsmooth analysis. In Section 3, we give a precise description of the Douglas–Rachford operator. In Section 4, we provide conditions under which the DRA enjoys stability properties near fixed points. Our Lyapunov approach to convergence of the algorithm follows in Section 5. Examples to which the results apply are considered in a subsequent section, together with some counterexamples to demonstrate that both the method of alternating projections and Newton’s method can fail to converge to a solution even when the DRA does. In fact, global convergence of the DRA is obtained in all bar one of our provided examples.
2 Preliminaries
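In this section we introduce and recall necessary notions and tools from nonsmooth analysis. Throughout this work, we assume that
$$X \text{ is a Euclidean space} \tag{1}$$
(i.e., a finite-dimensional real Hilbert space) with inner product $\langle\cdot,\cdot\rangle$ and induced norm $\|\cdot\|$. Given two real Hilbert spaces $X_1$ and $X_2$ with corresponding inner products denoted $\langle\cdot,\cdot\rangle_1$ and $\langle\cdot,\cdot\rangle_2$, where appropriate, we use the product space $X_1\times X_2$ which is a Hilbert space when equipped with the inner product defined by
$$\langle (x_1,x_2),(y_1,y_2)\rangle := \langle x_1,y_1\rangle_1+\langle x_2,y_2\rangle_2 \quad\text{for all } (x_1,x_2),(y_1,y_2)\in X_1\times X_2. \tag{2}$$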
We denote the set of nonnegative integers by $\mathbb{N}$ and the set of real numbers by $\mathbb{R}$. The set of nonnegative real numbers is denoted $\mathbb{R}_{+}:=\{x\in\mathbb{R}\mid x\geq 0\}$ and the set of positive real numbers $\mathbb{R}_{++}:=\{x\in\mathbb{R}\mid x>0\}$. The sets of nonpositive and negative real numbers, denoted $\mathbb{R}_{-}$ and $\mathbb{R}_{--}$ respectively, are defined analogously. Given a subset $C$ of $X$, its closure and interior are denoted respectively by $\overline{C}$ and $\operatorname{int}C$. For a point $x\in X$ and scalar $\rho\in\mathbb{R}_{+}$, the closed ball centered at $x$ with radius $\rho$ is denoted $\mathbb{B}(x;\rho):=\{y\in X\mid \|x-y\|\leq\rho\}$.
2.1 The Douglas–Rachford algorithm
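In this section, we recall the background material for the Douglas–Rachford algorithm. Let $C$ be a nonempty subset of $X$. The projector onto $C$ is the mapping
$$P_C\colon X\rightrightarrows C\colon x\mapsto\{c\in C\mid \|x-c\|=d_C(x)\}, \tag{3}$$
where $d_C(x):=\inf_{c\in C}\|x-c\|$ is the distance from $x$ to $C$. Each $p\in P_Cx$ is a nearest point of $x$ in $C$, and is called a projection of $x$ onto $C$. Since we consider only finite-dimensional spaces $X$, closedness of the set $C$ is necessary and sufficient for $C$ being proximinal, i.e., $P_Cx\neq\varnothing$ for every $x\in X$; see [BC11, Corollary 3.13]. In an abuse of notation, we write $P_Cx=p$ whenever $P_Cx=\{p\}$.
Let $A$ and $B$ be closed subsets of $X$ such that $A\cap B\neq\varnothing$. The feasibility problem is
$$\text{find } x\in A\cap B. \tag{4}$$
A classical splitting method for solving (4) is the so-called Douglas–Rachford algorithm which is concisely described as the fixed point iteration corresponding to the Douglas–Rachford (DR) operator defined by
$$T_{A,B}:=\tfrac{1}{2}\left(\operatorname{Id}+R_BR_A\right), \tag{5}$$
where $\operatorname{Id}$ is the identity operator, and $R_A:=2P_A-\operatorname{Id}$ and $R_B:=2P_B-\operatorname{Id}$ are the reflectors across $A$ and $B$, respectively. A sequence $(x_n)_{n\in\mathbb{N}}$ is called a DR sequence (with respect to $(A,B)$), with starting point $x_0\in X$, if
$$x_{n+1}\in T_{A,B}x_n \quad\text{for all } n\in\mathbb{N}, \tag{6}$$
where we note that
$$T_{A,B}x=\{x-a+b\mid a\in P_Ax,\ b\in P_B(2a-x)\} \quad\text{for all } x\in X. \tag{7}$$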
In the literature, the DRA for feasibility problems is also known as averaged alternating reflections [BCL04] and reflect-reflect-average method [BS11]. For other connections, we refer the reader to [BCL02].
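To make the iteration concrete, the following minimal Python sketch applies the operator $\tfrac12(\operatorname{Id}+R_BR_A)$ to the line-and-circle prototype of [BS11] mentioned in the introduction. The concrete choices here are illustrative assumptions rather than details taken from the paper: $A$ is the unit circle in $\mathbb{R}^2$, $B$ is the horizontal line at height 0.5, and the starting point, iteration count, and function names are all hypothetical.

```python
import math

def project_circle(z):
    """A nearest point on the unit circle (nonconvex set A in this sketch)."""
    x, y = z
    r = math.hypot(x, y)
    if r == 0.0:
        # The projector is set-valued at the origin; pick one selection.
        return (1.0, 0.0)
    return (x / r, y / r)

def project_line(z, height):
    """Nearest point on the horizontal line {(x, height)} (set B in this sketch)."""
    return (z[0], height)

def reflect(project, z):
    """Reflector 2P - Id built from a projector selection."""
    p = project(z)
    return (2.0 * p[0] - z[0], 2.0 * p[1] - z[1])

def dr_step(z, height):
    """One Douglas-Rachford step: z -> (z + R_B R_A z) / 2."""
    ra = reflect(project_circle, z)
    rb = reflect(lambda w: project_line(w, height), ra)
    return (0.5 * (z[0] + rb[0]), 0.5 * (z[1] + rb[1]))

def dr_sequence(z0, height, steps):
    """Iterate the DR step a fixed number of times from z0."""
    z = z0
    for _ in range(steps):
        z = dr_step(z, height)
    return z

HEIGHT = 0.5
z_final = dr_sequence((1.5, 0.2), HEIGHT, 200)
print(z_final)  # expected to be close to the intersection point (sqrt(0.75), 0.5)
```

For this starting point the iterates spiral into the intersection point in the right half-plane. Starting points on the vertical axis are a known degenerate case for this prototype and are deliberately avoided in the sketch.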
The following fact gives important properties of convex projectors.
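Fact 2.1 (Projectors and reflectors onto convex sets).
Let $C$ be a nonempty closed convex subset of $X$. Then the following hold:
(i) $P_C$ is everywhere single-valued and firmly nonexpansive, that is,
$$\|P_Cx-P_Cy\|^2+\|(\operatorname{Id}-P_C)x-(\operatorname{Id}-P_C)y\|^2\leq\|x-y\|^2 \quad\text{for all } x,y\in X. \tag{8}$$
(ii) $R_C$ is everywhere single-valued and nonexpansive, that is,
$$\|R_Cx-R_Cy\|\leq\|x-y\| \quad\text{for all } x,y\in X. \tag{9}$$
In particular, $P_C$ and $R_C$ are continuous on $X$.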
Proof.
(i): See [BC11, Theorem 3.14 & Proposition 4.8]. (ii): This follows from (i) and [BC11, Corollary 4.10]. ∎
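As a numerical sanity check of Fact 2.1, the sketch below tests the standard firm nonexpansiveness inequality $\|Px-Py\|^2+\|(\operatorname{Id}-P)x-(\operatorname{Id}-P)y\|^2\leq\|x-y\|^2$ on random pairs for the projector onto a closed ball, and then shows that the same inequality can fail for the (nonconvex) unit circle. The sets, sample sizes, and helper names are assumptions made for illustration only.

```python
import math
import random

def project_ball(x, radius=1.0):
    """Projector onto the closed Euclidean ball of the given radius (convex)."""
    n = math.sqrt(sum(t * t for t in x))
    if n <= radius:
        return list(x)
    return [radius * t / n for t in x]

def project_circle(x):
    """A projector selection onto the unit circle in R^2 (nonconvex); x must be nonzero."""
    n = math.hypot(x[0], x[1])
    return [x[0] / n, x[1] / n]

def sq_norm(x):
    return sum(t * t for t in x)

def sub(x, y):
    return [a - b for a, b in zip(x, y)]

def firm_gap(project, x, y):
    """Right-hand side minus left-hand side of the firm nonexpansiveness inequality."""
    px, py = project(x), project(y)
    lhs = sq_norm(sub(px, py)) + sq_norm(sub(sub(x, px), sub(y, py)))
    return sq_norm(sub(x, y)) - lhs

rng = random.Random(0)
pairs = [([rng.uniform(-3, 3) for _ in range(3)],
          [rng.uniform(-3, 3) for _ in range(3)]) for _ in range(1000)]
worst_ball_gap = min(firm_gap(project_ball, x, y) for x, y in pairs)

# Two nearby points on opposite sides of the origin project far apart on the circle.
circle_gap = firm_gap(project_circle, [0.01, 0.0], [-0.01, 0.0])
print(worst_ball_gap, circle_gap)
```

A nonnegative gap (up to rounding) for the ball is what convexity guarantees; the negative gap for the circle illustrates why the convex theory does not transfer directly to nonconvex constraint sets.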
In the case that one of the constraints is convex, we make the following observations. In what follows, recall that a sequence $(x_n)_{n\in\mathbb{N}}$ is asymptotically regular if $x_n-x_{n+1}\to 0$ as $n\to\infty$.
Lemma 2.2 (Properties of the DRA).
Let $A$ be a closed convex subset and $B$ be a closed subset of $X$ such that $A\cap B\neq\varnothing$, and let $(x_n)_{n\in\mathbb{N}}$ be a DR sequence with respect to $(A,B)$. Then the following hold:
(i) $T_{A,B}=\operatorname{Id}-P_A+P_BR_A$.
(ii) whenever and .
(iii) If $(x_n)_{n\in\mathbb{N}}$ is asymptotically regular and possesses a cluster point , then .
Proof.
(i): Combine (7) with the single-valuedness of $P_A$ (Fact 2.1).
(ii): Let , let , let , and let . Then there exists such that . Since , it follows that and thus
[TABLE]
where the last estimate follows from the nonexpansiveness of (Fact 2.1). Altogether, we obtain that , hence and the result follows.
[TABLE]
Let be a cluster point of . Then there exists a subsequence of such that . By Fact 2.1, is continuous and so . Combining with (11) and the asymptotic regularity of , this gives . Noting that and that is closed, we deduce that and therefore . ∎
2.2 Convexity
Given an extended-real-valued function $f\colon X\to[-\infty,+\infty]$, its effective domain is denoted by $\operatorname{dom}f:=\{x\in X\mid f(x)<+\infty\}$, its graph by $\operatorname{gra}f:=\{(x,\rho)\in X\times\mathbb{R}\mid f(x)=\rho\}$, its epigraph by $\operatorname{epi}f:=\{(x,\rho)\in X\times\mathbb{R}\mid f(x)\leq\rho\}$, and its lower level set at height $\xi\in\mathbb{R}$ by $\operatorname{lev}_{\leq\xi}f:=\{x\in X\mid f(x)\leq\xi\}$. The function $f$ is said to be proper if $\operatorname{dom}f\neq\varnothing$ and it never takes the value $-\infty$, lower semicontinuous (lsc) if $f(x)\leq\liminf_{y\to x}f(y)$ for every $x\in X$, and convex if
$$f((1-\lambda)x+\lambda y)\leq(1-\lambda)f(x)+\lambda f(y) \quad\text{for all } x,y\in\operatorname{dom}f \text{ and } \lambda\in(0,1). \tag{12}$$
Let $f$ be proper. Then $f$ is said to be strictly convex if, in addition to being convex, the inequality in (12) is strict whenever $x\neq y$. We say that $f$ is convex on $C$ (respectively strictly convex on $C$) if the corresponding inequality holds whenever $x,y\in C$ and $\lambda\in(0,1)$.
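Fact 2.3.
Let $f$ be a proper function and let $C$ be a nonempty open convex subset of $\operatorname{dom}f$.
(i) Suppose that $f$ is Gâteaux differentiable on $C$. Then the following hold:
(a) $f$ is convex on $C$ if and only if $\nabla f$ is monotone on $C$ in the sense that
$$\langle x-y,\nabla f(x)-\nabla f(y)\rangle\geq 0 \quad\text{for all } x,y\in C. \tag{13}$$
(b) $f$ is strictly convex on $C$ if and only if $\nabla f$ is strictly monotone on $C$ in the sense that
$$\langle x-y,\nabla f(x)-\nabla f(y)\rangle>0 \quad\text{for all } x,y\in C \text{ with } x\neq y. \tag{14}$$
(ii) Suppose that $f$ is twice Gâteaux differentiable on $C$. Then the following hold:
(a) $f$ is convex on $C$ if and only if $\nabla^2f(x)$ is positive semidefinite for every $x\in C$.
(b) $f$ is strictly convex on $C$ if $\nabla^2f(x)$ is positive definite for every $x\in C$.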
Proof.
This follows from [BC11, Propositions 17.10 & 17.13]. ∎
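The second-order condition for strict convexity in Fact 2.3 is only sufficient, not necessary. A standard one-dimensional illustration (not taken from the paper) is the quartic:

```latex
f(x) = x^4 \ \text{on } \mathbb{R}: \qquad
f'(x) = 4x^3, \qquad f''(x) = 12x^2 \geq 0, \qquad f''(0) = 0 .
% The second derivative is positive semidefinite everywhere, so f is convex,
% but it is not positive definite at x = 0.
% Nevertheless f' is strictly increasing:
(x - y)\bigl(f'(x) - f'(y)\bigr) = 4\,(x - y)\,(x^3 - y^3) > 0
\quad \text{whenever } x \neq y ,
% so f is strictly convex by the strict monotonicity criterion.
```

Thus strict monotonicity of the gradient is the sharper test, while positive definiteness of the Hessian is the more convenient one when it applies.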
2.3 Subdifferentiability
The limiting normal cone to a subset of at a point is defined by
[TABLE]
if , and by otherwise. Here the notation means with .
Let , let with , and let . The limiting subdifferential of at is given by
[TABLE]
and the analytic -subdifferential of at is given by
[TABLE]
Both subdifferentials of at a point are defined to be empty when . The limiting subdifferential can be represented in analytic form [Mor06, Theorem 1.89]
[TABLE]
where the notation means with and
[TABLE]
denotes the sequential Painlevé–Kuratowski upper limit of at .
We now recall some important properties of the limiting subdifferential.
Fact 2.4 (Fermat’s rule).
Suppose that a function attains a local minimum at a point with . Then .
Proof.
This follows from [Mor06, Proposition 1.114]. ∎
Fact 2.5 (Sum and product rules).
Consider two functions, and , and let . The following assertions hold.
(i) If is finite at and is strictly differentiable at , then
[TABLE]
(ii) If and are Lipschitz continuous around , then
[TABLE]
Proof.
(i): [Mor06, Proposition 1.107]. (ii): [Mor06, Propositions 1.111 and 3.45]. ∎
If is lsc around , then a convenient, and often used, representation for the limiting subdifferential is given by [Mor06, Theorems 1.89 & 2.34]
[TABLE]
where is the so-called Fréchet subdifferential of . However, in what follows, it will be necessary to consider the subdifferentials of both and simultaneously. In this case, (22) cannot be applied because and are usually not simultaneously lsc (e.g., if takes the value ). Further, it is worth emphasizing that and are, in general, considerably different. For instance, the function has
[TABLE]
Combining these two subdifferentials yields the symmetric subdifferential of which is defined by
[TABLE]
where is the so-called limiting upper subdifferential of . In contrast to the limiting subdifferential, the symmetric subdifferential possesses the classical “plus-minus” symmetry (i.e., ). Also note that, if is strictly differentiable at , then [Mor06, Corollary 1.82]
[TABLE]
If is convex, then the limiting subdifferential reduces to the convex subdifferential (or Fenchel subdifferential) of convex analysis [Mor06, Theorem 1.93], that is,
[TABLE]
and we have the inclusions
[TABLE]
The following property for the limiting subdifferential is mentioned without proof in [Mor06, RW98] and we therefore provide one for the convenience of the reader. Furthermore, note that lower semicontinuity is not assumed and so we cannot simply appeal to the representation (22).
Lemma 2.6 (Scalar multiplication rule).
Let . Then
[TABLE]
Proof.
Let . Then if and only if for all . In particular, this shows that (28) holds at points at which is not finite. Assume now that with . By the definition of the analytic -subdifferential of ,
[TABLE]
Hence, if , then as , we may apply (18) to deduce that
[TABLE]
The argument for is performed analogously. ∎
Remark 2.7 (Multiplication by zero).
Care must be exercised in the case that in Lemma 2.6. Consider, for instance, the lsc convex function defined by
[TABLE]
which has . Under the convention that , it follows that , where is the indicator function of a set , so that and
[TABLE]
Alternatively, under the convention that as suggested in [RW98, Section 1E], we have and hence that .
For our purposes, both conventions are problematic, and thus we shall treat the case of directly as it arises.
As holds for the limiting subdifferential, the symmetric subdifferential also enjoys the following robustness property.
Lemma 2.8 (Robustness of the symmetric subdifferential).
Let and let with . Then the symmetric subdifferential has the following robustness property
[TABLE]
Proof.
It is clear that
[TABLE]
To prove the opposite inclusion, we assume that and with . Since , by passing to subsequences if necessary, it suffices to prove the results assuming that the sequence is contained only in either or . To this end, by a diagonal subsequence argument we derive from (18) that and both have the robustness property. Thus, in either case, the result follows. ∎
Lemma 2.9 (Upper semicontinuity of the symmetric subdifferential).
Let be Lipschitz continuous around with , and consider sequences and in such that and for every . Then is bounded and its cluster points are contained in .
Proof.
By assumption, there exist a neighborhood of and a constant such that is Lipschitz continuous on with modulus . In particular, and are Lipschitz continuous around each with modulus . By [Mor06, Corollary 1.81] and (24), we have that
[TABLE]
Since , it follows that is bounded.
Let be a cluster point of . Then there is a subsequence converging to . Noting that , the Lipschitz continuity of around yields . Now apply Lemma 2.8. ∎
2.4 Coercivity
Recall that a function $f\colon X\to[-\infty,+\infty]$ is coercive if
$$\lim_{\|x\|\to+\infty}f(x)=+\infty. \tag{36}$$
For convenience, we recall some basic properties of coercivity.
Fact 2.10 (Coercive functions).
Let $f\colon X\to[-\infty,+\infty]$. Then the following hold:
(i) $f$ is coercive if and only if its lower level sets $\operatorname{lev}_{\leq\xi}f$ are bounded for all $\xi\in\mathbb{R}$.
(ii) If $f$ is proper, convex, and coercive, then $\inf f(X)>-\infty$.
Proof.
(i): [BC11, Proposition 11.11]. (ii): [BCN06, Lemma 2.13]. ∎
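Two one-dimensional functions, chosen here purely for illustration, show how the level-set characterization of coercivity works in practice:

```latex
% A coercive function: every lower level set is a bounded interval (or empty).
f(x) = x^2: \qquad
\operatorname{lev}_{\leq \xi} f =
\begin{cases}
[-\sqrt{\xi},\, \sqrt{\xi}\,] & \text{if } \xi \geq 0, \\
\varnothing & \text{if } \xi < 0.
\end{cases}
% A proper convex function that is bounded below but NOT coercive:
g(x) = e^{x}: \qquad
\operatorname{lev}_{\leq 1} g = (-\infty, 0] \ \text{is unbounded},
\qquad g(x) \to 0 \ \text{as } x \to -\infty .
```

The second function shows that boundedness from below does not imply coercivity, even for convex functions.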
The following preparatory lemma shows that coercivity is preserved under direct sums.
Lemma 2.11.
Let and let , where and are real Hilbert spaces. Set . Suppose that and are proper, convex, and coercive on and , respectively. Then is proper, convex, and coercive on .
Proof.
It immediately follows by assumption and definition that is proper and convex. Now Fact 2.10(ii) implies that
[TABLE]
Suppose, by way of contradiction, that is not coercive. Then there exists a sequence in such that and is bounded above, that is, there exists such that
[TABLE]
Combining with (37), we obtain that and are bounded above. But since and are coercive, and must therefore be bounded, and thus so is which contradicts the fact that . ∎
3 The Douglas–Rachford algorithm for finding a zero of a function
From herein, we assume that
[TABLE]
Note that, since is assumed proper, is necessarily a closed set whenever is continuous throughout its effective domain in the sense that
[TABLE]
As the following examples show, the converse need not hold (i.e., the graph of a discontinuous function can be closed) and, in general, mere lower semicontinuity is not sufficient to ensure closedness of the graph.
Example 3.1 (A discontinuous, lsc function with closed graph).
Consider defined by
[TABLE]
Then is continuous except at where it is merely lsc. In particular, is lsc but not continuous. However, does have a closed graph. Indeed, the graph of may be expressed as the union of two closed sets: where we note that is closed since is continuous on its domain. ∎
It is known, see for instance [BC11, Corollary 9.15], that every proper lsc convex function is continuous throughout the closure of and hence has a closed graph. However, this does not hold for proper lsc convex functions in which, as a consequence, gives rise to the following example.
Example 3.2 (A proper lsc convex function with nonclosed graph).
Consider defined by
[TABLE]
Then is proper, lsc, and convex, as shown in [BC11, Example 9.27]. Now setting , we have that the sequence lies in but its limit , hence is not closed.∎
Our focus is the feasibility problem (4) in the product Hilbert space with constraints
[TABLE]
where . Note that, in the case in which is the epigraph of a proper lower semicontinuous convex function (and hence a nonempty closed convex set), the convergence of the Douglas–Rachford algorithm was previously studied in [BD16, BDNP16a, BDNP16]. Until this work, the case in which is the graph of a proper function had not been considered even for the class of convex functions. It is also clear that, equivalently, our problem may be posed as
[TABLE]
under the assumption that . In what follows, the sequence shall denote a DR sequence for (43), that is, any sequence which satisfies
[TABLE]
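In this setting, the projector onto $A$ and the reflector across $A$ are given, respectively, by
$$P_A(x,\rho)=(x,0) \quad\text{and}\quad R_A(x,\rho)=(x,-\rho) \quad\text{for all } (x,\rho)\in X\times\mathbb{R}. \tag{46}$$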
Although the two possible DR operators, and , associated with and give different algorithms, since is a subspace, it holds that for every (see [BM16, Theorem 2.7(i) & Remark 2.10(ii)-(iii)]). Thus in order to study the DRA corresponding to it suffices just to study the DRA corresponding to .
To begin, we collect some preparatory lemmas which we use to give a precise description of the DR iteration for the sets and in (43). Our first result is concerned with the range of the DR operator.
Lemma 3.3 (Range of $T_{A,B}$).
The following assertions hold.
(i) .
(ii) .
Proof.
Let . (i): Combining Lemma 2.2(i) and (46) yields
[TABLE]
(ii): Since , it follows from (i) that
[TABLE]
which completes the proof. ∎
Note that, in view of Lemma 3.3(ii), from now on it suffices to assume that
[TABLE]
In the following lemma, we turn our attention to the projector onto . The provided characterization for will then be used in Lemma 3.5 to describe the DR operator relative to .
Lemma 3.4 (Projector onto the graph of $f$).
Let . Then and, for any , it holds that and . In addition, the following assertions hold.
(i) If $f$ is Lipschitz continuous around , then
[TABLE]
(ii) If $f$ is convex and , then
[TABLE]
Proof.
The existence of a point is ensured since the set is a nonempty closed subset of . Since , it holds that and .
(i): Since , we have that
[TABLE]
and, applying Fermat’s rule (Fact 2.4), gives
[TABLE]
Using the sum and product rules (Fact 2.5) and noting that is continuously (Fréchet) differentiable and hence strictly differentiable on with (see, for instance, [BC11, Example 16.11 & Corollary 17.36]), we deduce that
[TABLE]
Now by the scalar multiplication rule (Lemma 2.6),
[TABLE]
Finally, if , then, since by assumption , the function is zero around . Consequently, $\partial\big(2(f(p)-\rho)(f(\cdot)-\rho)\big)(p)=\{0\}=2(f(p)-\rho)\,\partial f(p)$ where due to [Mor06, Corollary 2.25]. Altogether, we have proven (50).
(ii): Since is proper and convex, is locally Lipschitz continuous on [BC11, Corollary 8.32]. The claim thus follows from (i). ∎
Lemma 3.5 (One DR step).
Let and let . Then
[TABLE]
Suppose, in addition, that $f$ is Lipschitz continuous around . Then there exists such that
[TABLE]
and, furthermore, the following assertions hold.
(i) If $f$ is strictly differentiable at with , then .
(ii) If $f$ is convex and , then either or .
Proof.
It follows from Lemma 3.3(i) that and from Lemma 3.4 that and . Altogether, and . The former implies that
[TABLE]
which completes the proof of (56).
Now assume that is Lipschitz continuous around . By Lemma 3.4(i),
[TABLE]
from which (57) follows since . Furthermore, we argue as follows.
(i): If is strictly differentiable at with , then , and so , which gives .
(ii): Suppose is convex, and . Then (56) yields
[TABLE]
Since , we have and hence . By (60), the inequality is actually strict, that is, which implies that and hence . ∎
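The structure of one DR step can be sketched numerically in the scalar case. The sketch below assumes $X=\mathbb{R}$, $A=\mathbb{R}\times\{0\}$, $B=\operatorname{gra}f$ and the operator $\tfrac12(\operatorname{Id}+R_BR_A)$, under which a step from $(x,\rho)$ reflects across the horizontal axis to $(x,-\rho)$, picks a nearest point $(p,f(p))$ on the graph, and returns $(p,\rho+f(p))$. The grid-plus-ternary-search projection is a crude stand-in chosen for brevity, not the paper's method, and every function name and parameter is hypothetical.

```python
def project_onto_graph(f, x, rho, grid=2001, refine=80):
    """Approximate p such that (p, f(p)) is a nearest point of gra f to (x, rho)."""
    # (x, f(x)) lies on the graph, so a nearest point satisfies |p - x| <= radius.
    radius = abs(f(x) - rho)
    if radius == 0.0:
        return x

    def dist2(p):
        return (p - x) ** 2 + (f(p) - rho) ** 2

    step = 2.0 * radius / (grid - 1)
    best = min((x - radius + i * step for i in range(grid)), key=dist2)
    lo, hi = best - step, best + step
    # Ternary search; this assumes dist2 is unimodal on the small bracket [lo, hi].
    for _ in range(refine):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dist2(m1) <= dist2(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def dr_step(f, x, rho):
    """One DR step for A = R x {0} and B = gra f (assumed setup)."""
    p = project_onto_graph(f, x, -rho)  # reflect across A, then project onto gra f
    return p, rho + f(p)

def dr_zero(f, x0, rho0, steps):
    """Run a fixed number of DR steps from (x0, rho0)."""
    x, rho = x0, rho0
    for _ in range(steps):
        x, rho = dr_step(f, x, rho)
    return x, rho

def affine(t):
    return 2.0 * t - 4.0  # simplest test case: the graph is a line, zero at t = 2

x_aff, rho_aff = dr_zero(affine, 5.0, 3.0, 100)
print(x_aff, rho_aff)
```

The affine test case is deliberately the simplest one: both sets are lines, so the iteration is a linear contraction towards $(2,0)$. For nonlinear or nonconvex $f$ the same code can be used for experimentation, but whether the iterates converge depends on the function and the starting point, and the grid-based projection may miss the true nearest point.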
Recall that the set of fixed points of $T_{A,B}$ is the set $\operatorname{Fix}T_{A,B}:=\{z\in X\times\mathbb{R}\mid z\in T_{A,B}z\}$. If $A$ and $B$ were convex sets, the fixed points of the DR operator could be precisely described [BCL04, Corollary 3.9]. Although $B$ is not convex in our setting, we are nevertheless able to arrive at the following satisfactory characterization.
Lemma 3.6 (Fixed points of $T_{A,B}$).
The following assertions hold.
(i) and . Consequently,
[TABLE]
(ii) If , then
[TABLE]
(iii) If $f$ is locally Lipschitz continuous on , then
[TABLE]
In particular, if , then .
(iv) If $f$ is convex and , then
[TABLE]
In particular, if , then .
Proof.
(i): Let . Then, by Lemma 3.3(i), we have
[TABLE]
On the one hand, (65) implies , so that , and hence . On the other hand, (65) gives
[TABLE]
which proves that . We deduce that and . It is straightforward to show that from which it follows that .
(ii): We immediately have that . Now let . Again by Lemma 3.3(i), . It follows from and that
[TABLE]
and therefore , which yields , that is, . Hence .
(iii): Let . By (i), and . If , then , and hence the fixed point . If , then, by using Lemma 3.4(i),
[TABLE]
Thus either or which completes the proof of the claim.
(iv): By the assumptions on and [BC11, Corollary 8.32], is locally Lipschitz continuous on . The first claim follows by applying (iii) and noting from convexity that (see (27)).
To prove the second claim, suppose that there exists , that is, and . But then which contradicts the assumption that , hence we deduce that . The conclusion follows. ∎
Roughly speaking, Lemma 3.6 shows that the fixed point set of consists of two parts: the intersection and a set containing critical points of . In the following result, we give conditions under which the DRA stays away from critical points.
For convenience, we denote and the first coordinate projection by
[TABLE]
Corollary 3.7.
Suppose that one of the following holds:
(i) $f$ is locally Lipschitz continuous on and, for every , either
[TABLE]
(ii) $f$ is convex with open, and .
Then the set $S:=\{n\in\mathbb{N}\mid f \text{ is strictly differentiable at } x_n \text{ with } \nabla f(x_n)=0\}$ is bounded.
Proof.
(i): By way of contradiction, suppose that is unbounded. In this case, we claim that and that the sequence is constant. To see this, observe that if (i.e., is strictly differentiable at with ), then Lemma 3.5(i) yields that . In particular, is strictly differentiable at with . The claim now follows by descending induction on .
Now, set for any . Let . For all , since , the definition of implies
[TABLE]
Since , (70) implies that either or . In the former case, Lemma 3.5 implies as , and hence . Since is fixed, (71b) implies that . Since was chosen arbitrarily, , which contradicts the fact that . The case in which is proven analogously.
(ii): By assumption and [BC11, Corollary 8.32], is locally Lipschitz continuous on . By convexity of , if , then , hence (70) is satisfied. The result now follows from (i). ∎
Remark 3.8.
A convex function is strictly differentiable at every point where it is Gâteaux differentiable. Indeed, supposing that a function is convex and Gâteaux differentiable at , it then follows, from (26) and [BC11, Proposition 17.26(i)], that is a singleton, and, from [BC11, Proposition 17.39], that is lsc at and . By combining with [RW98, Proposition 8.12 & Theorem 9.18(a) & (c)], is strictly differentiable at .
The following result shows that, under a differentiation assumption, the inverse of the DR operator is continuous. This property, and its connection to stability, is explored further in Section 4.
Corollary 3.9.
Suppose that is strictly differentiable on an open set contained in . Then
[TABLE]
and is continuous on . Consequently, if the limit of a convergent DR sequence is contained in , then it is necessarily a fixed point of with .
Proof.
Let . Then and there exists such that . Since is strictly differentiable on , it is Lipschitz continuous around with
[TABLE]
By Lemma 3.5, and , which proves (72). To deduce the continuity of , observe that, since is strictly differentiable on , is continuous on [RW98, Corollary 9.19(a)–(b)].
Finally, let be a DR sequence which converges to a point . Without loss of generality, we can and do assume that for every . Then, using the continuity of and the fact that gives
[TABLE]
which shows that and thus . In turn, applying Lemma 3.6(i) yields . ∎
4 Stability and local convergence
In this section, we use an inverse function argument to give a condition under which the DRA is stable around fixed points in the sense of Lipschitz continuity. Again, we emphasize that such results alone do not guarantee convergence of the DRA. This question will be addressed in Section 5.
To begin, we recall two facts which will be of use: an inverse function theorem, and the so-called Sherman–Morrison formula.
Fact 4.1 (Single-valued Lipschitzian invertibility).
Let be strictly differentiable at . If is nonsingular, then has a Lipschitz continuous single-valued localization, , around for . Moreover, the Lipschitz modulus of at is equal to and is strictly differentiable at with .
Proof.
This is a special case of [RW98, Corollary 9.55]. ∎
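Fact 4.2 (Sherman–Morrison formula).
Let $M$ be a nonsingular square matrix and let $u$ and $v$ be column vectors of appropriate dimensions so that the following multiplication operators are well defined. Then the following assertions hold.
(i) If $1+v^{\top}M^{-1}u\neq 0$, then $M+uv^{\top}$ is nonsingular and
$$(M+uv^{\top})^{-1}=M^{-1}-\frac{M^{-1}uv^{\top}M^{-1}}{1+v^{\top}M^{-1}u}. \tag{75}$$
(ii) If $M+uv^{\top}$ is singular, then $1+v^{\top}M^{-1}u=0$.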
Proof.
(i): See [SM50]. (ii): This is the contrapositive of (i). ∎
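The Sherman–Morrison formula is easy to check numerically. The following sketch, written in plain Python with hypothetical helper names and an arbitrary $2\times 2$ test matrix, compares the rank-one update formula $(M+uv^{\top})^{-1}=M^{-1}-M^{-1}uv^{\top}M^{-1}/(1+v^{\top}M^{-1}u)$ against a direct inverse.

```python
def mat_vec(M, x):
    """Matrix-vector product M x."""
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]

def vec_mat(x, M):
    """Row-vector-matrix product x^T M."""
    return [sum(x[k] * M[k][j] for k in range(len(x))) for j in range(len(M[0]))]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def sherman_morrison(M_inv, u, v):
    """Inverse of M + u v^T computed from M^{-1} via the rank-one update formula."""
    Mu = mat_vec(M_inv, u)   # M^{-1} u
    vM = vec_mat(v, M_inv)   # v^T M^{-1}
    denom = 1.0 + sum(vi * mi for vi, mi in zip(v, Mu))
    if denom == 0:
        raise ValueError("1 + v^T M^{-1} u = 0, so M + u v^T is singular")
    n = len(u)
    return [[M_inv[i][j] - Mu[i] * vM[j] / denom for j in range(n)] for i in range(n)]

M = [[2.0, 1.0], [1.0, 3.0]]
u = [1.0, 2.0]
v = [0.5, -1.0]
updated = [[M[i][j] + u[i] * v[j] for j in range(2)] for i in range(2)]
via_formula = sherman_morrison(inverse_2x2(M), u, v)
direct = inverse_2x2(updated)
max_error = max(abs(via_formula[i][j] - direct[i][j])
                for i in range(2) for j in range(2))
print(max_error)
```

The guard on the denominator mirrors part (ii) of the fact: the update fails exactly when the rank-one perturbation makes the matrix singular.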
We are ready to prove our main result regarding stability of the DRA. In the following, denotes the Löwner partial order on the space of symmetric matrices. We say that is twice strictly differentiable at if is differentiable around and is strictly differentiable at .
Theorem 4.3 (Stability of the DRA).
Let , and suppose that is twice strictly differentiable at and that . Then has a Lipschitz continuous single-valued localization, , around for which is strictly differentiable at and has Lipschitz modulus at equal to where
[TABLE]
Furthermore, if , then and coincide on a neighborhood of .
Proof.
Since is twice strictly differentiable at , both exists and is Lipschitz continuous around . In particular, is continuously differentiable around and, consequently, strictly differentiable around . Therefore, for every with sufficiently close to , Corollary 3.9 gives that
[TABLE]
and, since is strictly differentiable at , is strictly differentiable at with Jacobian given by
[TABLE]
Now, by distinguishing two cases, we show that is nonsingular and
[TABLE]
Case 1: Assume . Then (78) becomes
[TABLE]
and hence . Noting that
[TABLE]
it follows from Fact 4.2(i) that is nonsingular and that
[TABLE]
Therefore, and hence, in particular, is nonsingular. To estimate where , recall that where denotes the largest eigenvalue. Using block matrix inversion and (82) gives
[TABLE]
and so
[TABLE]
Let be an eigenvalue of , that is,
[TABLE]
If , then we must have , which occurs if and only if or . Otherwise, using (85) and Fact 4.2(ii) yields
[TABLE]
Hence, either or . In either case,
[TABLE]
and, by noting that with equality if and only if , we deduce that
[TABLE]
Case 2: Assume . Then and, by Lemma 3.6(iii), (i.e., and ). In turn, (78) becomes
[TABLE]
and, since by assumption, we have so that
[TABLE]
where denotes the smallest eigenvalue. We therefore have that both and are nonsingular and, moreover, that
[TABLE]
Using (90) yields for every eigenvalue of , and as the matrix is symmetric, we have
[TABLE]
Noting that , we see that this completes the proof of (79).
In either of the above cases, we have that is nonsingular at and that satisfies (79). Now, as and is single-valued at , it follows that . By applying Fact 4.1, we deduce that has a Lipschitz continuous single-valued localization, , around for which is strictly differentiable at , and which has Lipschitz modulus at equal to .
Further assume that . We shall show that coincides with around . First we note that since is a localization at for , by definition, there exist neighborhoods and of such that
[TABLE]
Now set such that and . Applying Lemma 2.2(ii) gives
[TABLE]
As , combining (93) with (94) gives that
[TABLE]
and since and is a singleton, the above inclusion must be an equality. This yields on , as was claimed. We therefore deduce that is single-valued and locally Lipschitz on with modulus at equal to satisfying (88). This completes the proof. ∎
A closer inspection of the proof of Theorem 4.3 shows that it actually proves $Q$-linear convergence of the DRA in a special case. Recall that a sequence is said to converge $Q$-linearly to with rate if
[TABLE]
Corollary 4.4 (Local $Q$-linear convergence of the DRA).
Let , and suppose that and that is twice strictly differentiable at with . Then there exists such that is a single-valued contraction mapping on with . Furthermore, for any starting point , the DR sequence converges $Q$-linearly to with rate
[TABLE]
Proof.
By applying Theorem 4.3 to , there exists such that is single-valued and locally Lipschitz continuous on with modulus at equal to . From the definition of the Lipschitz modulus at , we have
[TABLE]
Let . Then, by shrinking if necessary, we have
[TABLE]
and hence is a (single-valued) contraction mapping on . Substituting and noting that yield
[TABLE]
which implies that and that the DR sequence converges to whenever . Now since , the claimed Q-linear rate follows from (98). ∎
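The notion of Q-linear convergence used in Corollary 4.4 can be observed numerically on a toy instance. The sketch below is illustrative only and rests on several assumptions that are not taken from the text: the DR operator is written as T = (Id + R_B R_A)/2, the sets are A = R × {0} and B = gra f for the hypothetical linear function f(x) = a·x, and all function names are made up for this example. For two lines through the origin, T is a rotation scaled by the cosine of the angle between them, so the ratio of successive distances to the limit (0, 0) should equal 1/sqrt(1 + a²) at every step.

```python
import math

def project_axis(z):
    """Project z = (x, rho) onto A = R x {0} (assumed constraint set)."""
    return (z[0], 0.0)

def project_line(z, a):
    """Project z onto B = gra f for the hypothetical linear f(x) = a*x."""
    t = (z[0] + a * z[1]) / (1.0 + a * a)
    return (t, a * t)

def reflect(project, z, *args):
    """Reflection R = 2P - Id associated with a projector P."""
    p = project(z, *args)
    return (2.0 * p[0] - z[0], 2.0 * p[1] - z[1])

def dr_step(z, a):
    """One step of T = (Id + R_B R_A) / 2 (assumed ordering of reflections)."""
    r = reflect(project_line, reflect(project_axis, z), a)
    return (0.5 * (z[0] + r[0]), 0.5 * (z[1] + r[1]))

def q_linear_ratios(z0, a, n_steps=20):
    """Return the ratios ||z_{n+1}|| / ||z_n|| and the final iterate."""
    z, ratios = z0, []
    for _ in range(n_steps):
        z_next = dr_step(z, a)
        ratios.append(math.hypot(*z_next) / math.hypot(*z))
        z = z_next
    return ratios, z

a = 1.0
ratios, z_final = q_linear_ratios((2.0, 1.0), a)
expected_rate = 1.0 / math.sqrt(1.0 + a * a)
print(ratios[:3], expected_rate)
```

In this linear toy case the ratio is constant, which is stronger than what a local result guarantees; for a nonlinear f one would only expect the ratios to settle below 1 once the iterates are close to the fixed point.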
Remark 4.5.
Let and suppose that is twice strictly differentiable at . By Lemma 3.6(i), and so
[TABLE]
Differentiating the objective function twice gives
[TABLE]
If , then since and (Lemma 3.6(iii)), the second order optimality condition yields
[TABLE]
Let us compare (103) to the assumption in Theorem 4.3. The latter assumed that which is equivalent to
[TABLE]
a condition which is stronger than (103). Nevertheless, (104) holds as soon as one of the following holds: (i) , (ii) and is convex, or (iii) and is concave. In fact, when fails, unstable fixed points can arise as is the case in the following example.
Example 4.6 (An unstable fixed point).
Consider and the function . Appealing to Theorem 4.3, we deduce that is single-valued and locally Lipschitz around the point . However, is not locally Lipschitz around the point . To see this, let and consider the point . We have from Lemma 3.3(i) that
[TABLE]
Let . Then (51) implies that
[TABLE]
To show that is in fact a fixed point, setting in (106), we deduce that or . Further we observe that it cannot be the case that since
[TABLE]
and so we conclude that , which together with (105) gives and hence .
Now, to see that is not stable (in the sense of Lipschitz continuity of ), consider the point , which can be made arbitrarily close to by choosing sufficiently small. For all , the optimality condition (106) has only one solution at . But this implies that
[TABLE]
and consequently that while , thus is not locally Lipschitz around . Note that this does not contradict Theorem 4.3, since the condition is not satisfied. ∎
In a later example (Example LABEL:ex:p_norm), we show that in the setting of Example 4.6 the DRA is globally convergent.
Recall that a sequence is said to converge R-linearly to a point if there exist constants and such that
[TABLE]
Clearly, Q-linear convergence implies R-linear convergence.
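The converse implication fails in general, which a small example makes concrete. The sketch below uses a hypothetical scalar sequence (not taken from the paper) that is dominated by the geometric sequence (1/2)^n, and hence converges R-linearly to 0, while the ratio of successive terms exceeds 1 infinitely often, so it cannot converge Q-linearly with any rate below 1.

```python
def sequence(n):
    """Hypothetical sequence: (1/2)^n for even n, (1/2)^(n+2) for odd n."""
    return 0.5 ** n if n % 2 == 0 else 0.5 ** (n + 2)

N = 30
xs = [sequence(n) for n in range(N)]

# R-linear bound |x_n - 0| <= C * kappa^n with C = 1 and kappa = 1/2.
r_linear_bound_holds = all(x <= 1.0 * 0.5 ** n for n, x in enumerate(xs))

# Successive ratios |x_{n+1}| / |x_n| alternate between 1/8 and 2.
ratios = [xs[n + 1] / xs[n] for n in range(N - 1)]
worst_ratio = max(ratios)

print(r_linear_bound_holds, worst_ratio)
```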
To complement the results in this section, we deduce the following R-linear convergence result using existing results in the literature. Note that, in contrast to the setting of Theorem 4.3, the following result only applies to fixed points at which is nonsingular.
Proposition 4.7 (Local R-linear convergence of the DRA).
Let , and suppose that is continuously differentiable around with . Then there exists such that, for any starting point , the DR sequence converges R-linearly to a point in .
Proof.
By assumption, is continuously differentiable on for some neighborhood of . Define a function and let . Then is a neighborhood of , is a mapping, is a closed convex subset of ,
[TABLE]
In view of [RW98, Definition 10.23(b)], is amenable at and hence superregular at by [LLM09, Proposition 4.8]. Moreover, the normal cones to and can be described, respectively, by [Mor06, Proposition 1.2] and [RW98, Example 6.8] as
[TABLE]
Since it is assumed that , it follows that , which is to say that is strongly regular at . The assumptions of [Pha16, Theorem 4.3] (or [DP16, Corollary 5.22]) are thus satisfied, from which the result follows. ∎
To conclude this section, we note that Theorem 4.3 applies in situations where Proposition 4.7 does not. In a subsequent section, we shall revisit the following example.
Example 4.8 (A stable fixed point).
Consider the function and the point . Then does not satisfy the assumptions of Proposition 4.7 at because . Nevertheless, as is twice continuously differentiable at with , Theorem 4.3 still applies and shows that the DR operator is single-valued and Lipschitz continuous around . ∎
5 A Lyapunov-type approach to convergence
In this section, we prove convergence of the DRA assuming the existence of a Lyapunov-type function which is assumed to possess the following properties on a subset of . In fact, our framework also provides a procedure for the construction of such a function. In practice, this means that the candidate Lyapunov-type function can be concretely constructed and its properties easily checked.
Assumption 5.1.
There exists a proper convex function and a nonempty convex subset of such that the following hold:
(i)
The subdifferential of satisfies
(\forall x\in D)\quad \partial F(x)\supseteq\begin{cases}\left\{\dfrac{f(x)}{\|x^{*}\|^{2}}\,x^{*} \;\Big|\; x^{*}\in\partial^{0}f(x)\right\} & \text{if } 0\notin\partial^{0}f(x),\\ \{0\} & \text{if } f(x)=0.\end{cases}
(112)
(ii)
* is coercive.*
(iii)
* is continuous on .*
The intuition behind Assumption 5.1, specifically (112), is similar to that proposed in [Ben15]. One seeks a function of the form
[TABLE]
such that for every , its level set at the point is tangent to , where . To do so, we construct an F satisfying Assumption 5.1 by anti-subdifferentiating (112). An illustration of such a function is given in LABEL:fig:level_set. In particular, if the function is strictly differentiable at , then (112) becomes
[TABLE]
and further, when , then the expression further simplifies to .
The two piecewise-defined cases in (112) are consistent in the sense that, if and , then both cases yield . The inclusion of the “” case allows our analysis to include situations in which the “” case has a removable discontinuity.
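To make the construction by anti-subdifferentiation concrete, consider a one-dimensional sketch. When f is differentiable on the real line with f'(x) ≠ 0, taking x* = f'(x) in (112) gives F'(x) = f(x)/f'(x). The choices below are assumptions made purely for illustration and are not taken from the paper: f(x) = x² − 1 on D = (0, ∞), whose antiderivative of f/f' is F(x) = x²/4 − (1/2) ln x. The code checks numerically that this F satisfies the derivative condition, that it is convex on a grid, and that F'(1) ≈ 0 at the zero of f, in line with the "f(x) = 0" case of (112).

```python
import math

def f(x):
    """Hypothetical function whose zero is sought (zero at x = 1 on D)."""
    return x * x - 1.0

def df(x):
    """Derivative of f."""
    return 2.0 * x

def F(x):
    """Candidate obtained by integrating f/f' = x/2 - 1/(2x) on D = (0, inf)."""
    return 0.25 * x * x - 0.5 * math.log(x)

def central_diff(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

# Grid of points in D, from 0.2 to 3.0.
grid = [0.2 + 0.1 * k for k in range(29)]

# Check F'(x) = f(x) / f'(x), the differentiable one-dimensional form of (112).
max_gradient_error = max(abs(central_diff(F, x) - f(x) / df(x)) for x in grid)

# Positive second differences indicate convexity on the grid.
second_diffs = [F(x - 0.05) - 2.0 * F(x) + F(x + 0.05) for x in grid]
is_convex_on_grid = all(d > 0.0 for d in second_diffs)

print(max_gradient_error, is_convex_on_grid, central_diff(F, 1.0))
```

This particular F also tends to +∞ as x → 0⁺ and as x → +∞, which is the kind of growth a coercivity requirement asks for; whether it meets every item of Assumption 5.1 as stated in the paper is not verified here.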
