Golden ratio algorithms with new stepsize rules for variational inequalities

Dang Van Hieu; Yeol Je Cho; and Yi-bin Xiao

arXiv:1904.07591·math.OC·April 17, 2019


TL;DR

This paper introduces two novel golden ratio algorithms with adaptive stepsize rules for solving variational inequalities, eliminating the need for prior knowledge of the Lipschitz constant and demonstrating effective convergence and performance.

Contribution

The paper presents two new golden ratio algorithms with adaptive stepsize rules for variational inequalities, improving upon existing methods by removing the need for Lipschitz constant knowledge.

Findings

1. Algorithms converge under standard conditions.

2. The second algorithm's stepsizes stay separated from zero.

3. Numerical results show competitive performance.

Abstract

In this paper, we introduce two golden ratio algorithms with new stepsize rules for solving pseudomonotone and Lipschitz variational inequalities in finite dimensional Hilbert spaces. The presented stepsize rules allow the resulting algorithms to work without prior knowledge of the Lipschitz constant of the operator. The first algorithm uses a sequence of stepsizes that is chosen in advance, diminishing and non-summable, while the stepsizes in the second one are updated at each iteration by a simple computation. A notable feature is that the sequence of stepsizes generated by the second algorithm is bounded away from zero. The convergence as well as the convergence rate of the proposed algorithms are established under some standard conditions. We also give several numerical results to show the behavior of the algorithms in comparison with other algorithms.

Equations

$$\mbox{Find}~x^{*}\in\Re^{m}~\mbox{such that}~\langle F(x^{*}),x-x^{*}\rangle+g(x)-g(x^{*})\geq 0,\quad\forall x\in\Re^{m},$$

$${\rm prox}_{g}(x)=\arg\min_{y\in\Re^{m}}\Big\{g(y)+\frac{1}{2}||y-x||^{2}\Big\}.$$

$$\mbox{Find}~x^{*}\in C~\mbox{such that}~\langle F(x^{*}),x-x^{*}\rangle\geq 0,\quad\forall x\in C.$$

$$\min_{x\in\Re^{m}}\max_{y\in\Re^{n}}L(x,y)=g_{1}(x)+K(x,y)-g_{2}(y),$$

$$\min_{x\in\Re^{m}}J(x)=f(x)+g(x),$$

$$\mbox{(H1)}~\lim_{n\to\infty}\lambda_{n}=0,\qquad\mbox{(H2)}~\sum_{n=1}^{\infty}\lambda_{n}=+\infty,\qquad\mbox{(H3)}~\liminf_{n\to\infty}\frac{\lambda_{n}}{\lambda_{n-1}}>0.$$

$$\left\{\begin{array}{ll}\bar{x}_{n}=\frac{(\varphi-1)x_{n}+\bar{x}_{n-1}}{\varphi},\\ x_{n+1}={\rm prox}_{\lambda_{n}g}(\bar{x}_{n}-\lambda_{n}F(x_{n})).\end{array}\right.$$

$$\langle F(x),x-y\rangle\geq 0\ \Longrightarrow\ \langle F(y),x-y\rangle\geq\gamma||x-y||^{2},\quad\forall x,y\in\Re^{m};$$

$$||F(x)-F(y)||\leq L||x-y||,\quad\forall x,y\in\Re^{m}.$$

$$\bar{x}={\rm prox}_{g}(z)\ \Longleftrightarrow\ \langle\bar{x}-z,x-\bar{x}\rangle\geq g(\bar{x})-g(x),\quad\forall x\in\Re^{m}.$$

$$||\alpha x+(1-\alpha)y||^{2}=\alpha||x||^{2}+(1-\alpha)||y||^{2}-\alpha(1-\alpha)||x-y||^{2},\quad\forall x,y\in\Re^{m},~\alpha\in\Re.$$

$$\langle x_{n+1}-\bar{x}_{n}+\lambda_{n}F(x_{n}),x-x_{n+1}\rangle\geq\lambda_{n}(g(x_{n+1})-g(x)),\quad\forall x\in\Re^{m},$$

$$\langle x_{n+1}-\bar{x}_{n}+\lambda_{n}F(x_{n}),x^{\dagger}-x_{n+1}\rangle\geq\lambda_{n}(g(x_{n+1})-g(x^{\dagger})).$$

$$2\langle x_{n+1}-\bar{x}_{n},x^{\dagger}-x_{n+1}\rangle+2\lambda_{n}\langle F(x_{n}),x^{\dagger}-x_{n+1}\rangle\geq 2\lambda_{n}(g(x_{n+1})-g(x^{\dagger})).$$

$$||x_{n+1}-x^{\dagger}||^{2}$$

$$\langle x_{n}-\bar{x}_{n-1}+\lambda_{n-1}F(x_{n-1}),x-x_{n}\rangle\geq\lambda_{n-1}(g(x_{n})-g(x)),\quad\forall x\in\Re^{m}.$$

$$\langle x_{n}-\bar{x}_{n-1}+\lambda_{n-1}F(x_{n-1}),x_{n+1}-x_{n}\rangle\geq\lambda_{n-1}(g(x_{n})-g(x_{n+1})).$$

$$2\frac{\varphi\lambda_{n}}{\lambda_{n-1}}\langle x_{n}-\bar{x}_{n},x_{n+1}-x_{n}\rangle+2\lambda_{n}\langle F(x_{n-1}),x_{n+1}-x_{n}\rangle\geq 2\lambda_{n}(g(x_{n})-g(x_{n+1})).$$

$$0$$

$$||x_{n+1}-x^{\dagger}||^{2}$$

$$2\langle F(x_{n})-F(x_{n-1}),x_{n}-x_{n+1}\rangle$$

$$\langle F(x^{\dagger}),x_{n}-x^{\dagger}\rangle+g(x_{n})-g(x^{\dagger})\geq 0.$$

$$||x_{n+1}-x^{\dagger}||^{2}$$

$$||x_{n+1}-x^{\dagger}||^{2}$$

$$\frac{\varphi}{\varphi-1}||\bar{x}_{n+1}-x^{\dagger}||^{2}$$

$$1+\frac{1}{\varphi}-\frac{\varphi\lambda_{n}}{\lambda_{n-1}}\geq 1+\frac{1}{\varphi}-\varphi=0.$$

$$\frac{\varphi}{\lambda_{n-1}}-2L>0,\quad\forall n\geq n_{0}.$$

$$\frac{\varphi\lambda_{n}}{\lambda_{n-1}}-\lambda_{n}L-\lambda_{n+1}L\geq\frac{\varphi\lambda_{n}}{\lambda_{n-1}}-\lambda_{n}L-\lambda_{n}L=\lambda_{n}\Big[\frac{\varphi}{\lambda_{n-1}}-2L\Big]>0,\quad\forall n\geq n_{0},$$

$$\frac{\varphi\lambda_{n}}{\lambda_{n-1}}-\lambda_{n}L>\lambda_{n+1}L,\quad\forall n\geq n_{0}.$$

$$\frac{\varphi}{\varphi-1}||\bar{x}_{n+1}-x^{\dagger}||^{2}$$


Taxonomy

Topics: Advanced Optimization Algorithms Research · Optimization and Variational Analysis · Mathematical Inequalities and Applications

Full text


Dang Van Hieu (a), Yeol Je Cho (b, c) and Yi-bin Xiao (c)

Contact: Dang Van Hieu. Email: [email protected]

(a) Applied Analysis Research Group, Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam;

(b) Department of Mathematics Education, Gyeongsang National University, Jinju 52828, Korea;

(c) School of Mathematical Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, P.R. China

Dedicated to Professor Pham Ky Anh on the Occasion of his 70th Birthday


Keywords: Variational inequality; Pseudomonotone operator; Lipschitz continuity; Projection method.

1 Introduction

In this paper, we focus on the following variational inequality problem (VIP, for short):

$$\mbox{Find}~x^{*}\in\Re^{m}~\mbox{such that}~\langle F(x^{*}),x-x^{*}\rangle+g(x)-g(x^{*})\geq 0,\quad\forall x\in\Re^{m},$$

where $g:\Re^{m}\to(-\infty,+\infty]$ is a proper convex lower semi-continuous function with the domain ${\rm dom}\,g=\left\{x\in\Re^{m}:g(x)<+\infty\right\}$ and $F:{\rm dom}\,g\to\Re^{m}$ is an operator. The function $g$ need not be smooth. Recall that the proximal operator ${\rm prox}_{g}$ of $g$ is defined by

$${\rm prox}_{g}(x)=\arg\min_{y\in\Re^{m}}\Big\{g(y)+\frac{1}{2}||y-x||^{2}\Big\}.$$

We are interested in the problem (VIP) in the case where the proximal mapping of $g$ is computable. The problem (VIP) is a central problem in nonlinear analysis, especially in optimization, control theory, game theory [2, 3, 4, 1, 5, 6, 7, 8] and other fields [14, 13, 12, 11, 9, 10, 15, 16, 17]. In the special case $g=\delta_{C}$, the indicator function of a nonempty closed convex set $C$ in $\Re^{m}$, the problem (VIP) reduces to the classical variational inequality problem [5, 18]:

$$\mbox{Find}~x^{*}\in C~\mbox{such that}~\langle F(x^{*}),x-x^{*}\rangle\geq 0,\quad\forall x\in C.$$
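To make the proximal operator concrete, the following minimal sketch evaluates it for two choices of $g$ where a closed form is well known: $g(y)=t||y||_{1}$ (componentwise soft-thresholding) and $g=\delta_{C}$ with $C$ the nonnegative orthant (the projection onto $C$). The function names and the sample vector are illustrative assumptions and do not come from the paper.

```python
def prox_l1(x, t):
    """prox of g(y) = t * ||y||_1: componentwise soft-thresholding."""
    return [max(abs(xi) - t, 0.0) * (1.0 if xi > 0 else -1.0) for xi in x]


def prox_nonneg(x):
    """prox of the indicator of C = {y : y >= 0}: the projection onto C."""
    return [max(xi, 0.0) for xi in x]


def prox_objective(y, x, g):
    """g(y) + 0.5 * ||y - x||^2, the quantity minimised by prox_g(x)."""
    return g(y) + 0.5 * sum((yi - xi) ** 2 for yi, xi in zip(y, x))


# Illustrative data (hypothetical).
x = [3.0, -0.5, 1.0]
p = prox_l1(x, 1.0)      # shrinks every entry towards zero by 1
q = prox_nonneg(x)       # clips negative entries to zero
```

The helper `prox_objective` can be used to check numerically that `p` does not have a larger objective value than nearby points, which is exactly what the arg min in the definition requires.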

Moreover, the motivation for studying the problem (VIP) comes from optimization. Several naturally arising models can be formulated as the problem (VIP); see, e.g., [3, 5, 6]. We restrict our attention to the following two problems. The first is a convex-concave saddle point problem:

$$\min_{x\in\Re^{m}}\max_{y\in\Re^{n}}L(x,y)=g_{1}(x)+K(x,y)-g_{2}(y),\tag{1}$$

where $x\in\Re^{m}$ , $y\in\Re^{n}$ , $g_{1}:\Re^{m}\to(-\infty,+\infty]$ , $g_{2}:\Re^{n}\to(-\infty,+\infty]$ are proper convex lower semi-continuous functions, and $K:{\rm dom}\,g_{1}\times{\rm dom}\,g_{2}\to\Re$ is a smooth convex-concave function. The saddle point problem (1) can be viewed as the problem (VIP) with $g(z)=g_{1}(x)+g_{2}(y)$ , $F(z)=\left[\nabla_{x}K(x,y);-\nabla_{y}K(x,y)\right]^{T}$ and $z=(x,y)\in\Re^{m+n}$ . This is a typical example of a nonsmooth problem for which the (one-step) gradient method [19] does not work. Among the earliest iterative methods proposed for this problem are the (two-step) extragradient methods [20, 21]. Recently, many works on the problem (VIP) and related problems have been devoted to proposing different projection-like methods under various types of conditions, such as the subgradient extragradient methods [22, 23, 24, 25, 26], the modified extragradient method [27], the projected reflected gradient method [28, 29] and others [30, 31, 32, 33, 34, 35, 36, 37, 38, 39].
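As a small illustration of how a saddle point problem yields the operator $F$, the sketch below assumes the bilinear coupling $K(x,y)=x^{T}Ay$, for which $\nabla_{x}K=Ay$ and $\nabla_{y}K=A^{T}x$. The matrix `A` and the sample points are hypothetical. For this particular $K$ the operator is linear and skew-symmetric, so the monotonicity gap $\langle F(z_{1})-F(z_{2}),z_{1}-z_{2}\rangle$ is zero: $F$ is monotone but not strongly monotone.

```python
def saddle_operator(A, x, y):
    """F(z) = [grad_x K; -grad_y K] for the bilinear coupling K(x, y) = x^T A y."""
    m, n = len(x), len(y)
    grad_x = [sum(A[i][j] * y[j] for j in range(n)) for i in range(m)]  # A y
    grad_y = [sum(A[i][j] * x[i] for i in range(m)) for j in range(n)]  # A^T x
    return grad_x + [-v for v in grad_y]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical 3x2 coupling matrix and two points z1 = (x1, y1), z2 = (x2, y2).
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
x1, y1 = [1.0, 0.0, 2.0], [0.5, -1.0]
x2, y2 = [-1.0, 1.0, 0.0], [2.0, 0.0]

F1 = saddle_operator(A, x1, y1)
F2 = saddle_operator(A, x2, y2)
dz = [a - b for a, b in zip(x1 + y1, x2 + y2)]
dF = [a - b for a, b in zip(F1, F2)]
gap = inner(dF, dz)  # monotonicity gap <F(z1) - F(z2), z1 - z2>
```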

Another model arising in the signal processing literature is a nonsmooth convex optimization model,

$$\min_{x\in\Re^{m}}J(x)=f(x)+g(x),\tag{2}$$

where $x\in\Re^{m}$ and $f,~{}g:\Re^{m}\to\Re$ with $g$ possibly nonsmooth. This model is equivalent to the problem (VIP) with $F=\nabla f$ . Applying generic variational methods to special models such as the optimization problem (2) is not a good choice. Optimization methods in general have better theoretical convergence rates because they can exploit the characteristics of the potential operator $\nabla f$ . However, this seems to be true only when $\nabla f$ is Lipschitz continuous. Without such a condition, the convergence guarantees of the proximal gradient methods no longer hold. Recently, for the case of non-Lipschitz $\nabla f$ , some notable optimization methods have been proposed, for example, in [40, 41, 42]. In particular, the so-called NoLips algorithm developed in [40] is an interesting and promising method which can use a fixed stepsize when $\nabla f$ is not Lipschitz continuous. However, the obtained results are not generic because they depend strictly on the problem instances and on the Bregman distances used.

Methods for solving the problem (VIP) without the Lipschitz continuity of $F$ often use a linesearch procedure which runs in each iteration of the algorithm until a stopping criterion is satisfied. Linesearch methods are thus time-consuming because they require many evaluations of $F$ as well as projections onto the feasible set. Besides, complexity estimates for linesearch methods are not very informative: they only show how many outer iterations are needed to obtain the desired accuracy, while the number of inner linesearch iterations is not accounted for. Moreover, even when $F$ is Lipschitz continuous, the Lipschitz constant is often unknown and, in nonlinear problems, can be difficult to approximate. Recently, some interesting methods for solving classical variational inequalities without any linesearch procedure have been proposed, for instance, in [43, 6, 44], where the stepsizes are updated at each iteration by some cheap computations.

In particular, an interesting idea has been developed recently by Malitsky in [6]. He proposed a new algorithm, named the explicit golden ratio algorithm (EGRAAL, for short), for solving the problem (VIP) when $F$ is locally Lipschitz continuous. EGRAAL has a simple and elegant structure and requires only one evaluation of the proximal mapping (of the function $g$ ) and one evaluation of $F$ per iteration. The algorithm is explicit in the sense that it does not require any linesearch procedure. Its stepsizes are computed explicitly at each iteration from the previous iterates. The theoretical and numerical results in [6] are promising and also suggest some directions for future study (see [6, Sects. 5 and 6]).

In this paper, motivated by the results in [43, 6, 44], we propose two different golden ratio algorithms with new stepsize rules for solving the pseudomonotone problem (VIP) under a Lipschitz condition. The stepsize strategies here are simpler than those in [6]. The variable stepsizes in the new algorithms are either chosen in advance or computed easily. Also, the resulting algorithms work without any information on the Lipschitz constant of the operator, i.e., the Lipschitz constant need not be an input parameter of the algorithms. More precisely, we first consider a golden ratio algorithm with a sequence of stepsizes chosen in advance, which is diminishing and non-summable. With this strategy, the algorithm works well for the problem (VIP) under the strong pseudomonotonicity of $F$ . Under the weaker assumption of pseudomonotonicity of $F$ , we present the second golden ratio algorithm, which uses variable stepsizes updated step by step. The convergence as well as the convergence rate of the proposed algorithms are established. Finally, the theoretical results are confirmed by several numerical experiments in comparison with other known algorithms.

The remainder of this paper is organized as follows: In Section 2, we introduce a golden ratio algorithm with a sequence of stepsizes chosen in advance. Section 3 deals with another golden ratio algorithm with a simpler stepsize rule. Finally, in Section 4, we study the numerical behaviour of the new algorithms on two test problems and compare them with others.

2 A golden ratio algorithm with diminishing stepsizes

In this section, we present a golden ratio algorithm with a sequence of stepsizes being diminishing and non-summable. The use of this new stepsize rule allows the algorithm to work without previously knowing the Lipschitz constant of the operator. We set $\varphi=\frac{\sqrt{5}+1}{2}$ , which is called the golden ratio. Moreover, we denote by $VI(F,g)$ the solution set of the problem (VIP), which is assumed to be nonempty. The algorithm in detail is as follows:

Algorithm 2.1 (Golden Ratio Algorithm with diminishing stepsizes).

Initialization: Choose $x_{1},~{}\bar{x}_{0}\in\Re^{m}$ and a non-increasing sequence of stepsizes $\left\{\lambda_{n}\right\}\subset(0,+\infty)$ such that the following conditions hold:

$$\mbox{(H1)}~\lim_{n\to\infty}\lambda_{n}=0,\qquad\mbox{(H2)}~\sum_{n=1}^{\infty}\lambda_{n}=+\infty,\qquad\mbox{(H3)}~\liminf_{n\to\infty}\frac{\lambda_{n}}{\lambda_{n-1}}>0.$$

Iterative Steps: Assume that $x_{n},~{}\bar{x}_{n-1}$ are known. Calculate $x_{n+1}$ as follows:

$$\left\{\begin{array}{ll}\bar{x}_{n}=\frac{(\varphi-1)x_{n}+\bar{x}_{n-1}}{\varphi},\\ x_{n+1}={\rm prox}_{\lambda_{n}g}(\bar{x}_{n}-\lambda_{n}F(x_{n})).\end{array}\right.$$

An example of a sequence $\left\{\lambda_{n}\right\}$ satisfying the conditions (H1)-(H3) is $\lambda_{n}=\frac{1}{n^{p}}$ with $0<p\leq 1$ . In order to establish the convergence of Algorithm 2.1, we assume that the operator $F:{\rm dom}\,g\to\Re^{m}$ satisfies the following conditions:

(SP) $F$ is strongly pseudomonotone, i.e., there exists $\gamma>0$ such that

$$\langle F(x),x-y\rangle\geq 0\ \Longrightarrow\ \langle F(y),x-y\rangle\geq\gamma||x-y||^{2},\quad\forall x,y\in\Re^{m};$$

(LC) $F$ is Lipschitz continuous, i.e., there exists $L>0$ such that

$$||F(x)-F(y)||\leq L||x-y||,\quad\forall x,y\in\Re^{m}.$$

The two constants $\gamma$ and $L$ need not be known. The unique solution of the problem (VIP) is denoted by $x^{\dagger}$ .
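The iteration of Algorithm 2.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (their experiments use Matlab): it assumes the stepsizes $\lambda_{n}=1/n^{p}$ mentioned above, and the test problem ($F(x)=Mx+q$ with a symmetric positive definite $M$, $g$ the indicator of the nonnegative orthant) as well as all function and variable names are hypothetical.

```python
import math

PHI = (math.sqrt(5.0) + 1.0) / 2.0  # golden ratio


def golden_ratio_diminishing(F, prox, x1, xbar0, p=0.5, iters=2000):
    """Sketch of Algorithm 2.1 with lambda_n = 1 / n**p, 0 < p <= 1.

    F    : callable, list -> list (the operator)
    prox : callable (v, lam) -> list, evaluating prox_{lam * g}(v)
    """
    x, xbar = list(x1), list(xbar0)
    for n in range(1, iters + 1):
        lam = 1.0 / n ** p
        Fx = F(x)
        # xbar_n = ((phi - 1) x_n + xbar_{n-1}) / phi
        xbar = [((PHI - 1.0) * xi + bi) / PHI for xi, bi in zip(x, xbar)]
        # x_{n+1} = prox_{lam_n g}(xbar_n - lam_n F(x_n))
        x = prox([bi - lam * fi for bi, fi in zip(xbar, Fx)], lam)
    return x


# Hypothetical test problem: F(x) = M x + q, strongly monotone and Lipschitz.
M = [[2.0, 0.5], [0.5, 1.0]]
q = [-2.5, -1.5]  # chosen so that F(1, 1) = 0, i.e. the solution is (1, 1)
F = lambda x: [sum(M[i][j] * x[j] for j in range(2)) + q[i] for i in range(2)]
proj = lambda v, lam: [max(vi, 0.0) for vi in v]  # prox of the indicator of x >= 0

sol = golden_ratio_diminishing(F, proj, [0.0, 0.0], [0.0, 0.0])
```

Note that neither $\gamma$ nor $L$ appears anywhere in the code; only the exponent `p` of the stepsize sequence has to be chosen.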

We need the following two lemmas to prove the convergence of Algorithm 2.1:

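Lemma 2.2.

[45, Proposition 12.26] We have

$$\bar{x}={\rm prox}_{g}(z)\ \Longleftrightarrow\ \langle\bar{x}-z,x-\bar{x}\rangle\geq g(\bar{x})-g(x),\quad\forall x\in\Re^{m}.$$

Lemma 2.3.

[45, Corollary 2.14] We have

$$||\alpha x+(1-\alpha)y||^{2}=\alpha||x||^{2}+(1-\alpha)||y||^{2}-\alpha(1-\alpha)||x-y||^{2},\quad\forall x,y\in\Re^{m},~\alpha\in\Re.$$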
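The identity of Lemma 2.3, $||\alpha x+(1-\alpha)y||^{2}=\alpha||x||^{2}+(1-\alpha)||y||^{2}-\alpha(1-\alpha)||x-y||^{2}$, holds for every real $\alpha$, not only for $\alpha\in[0,1]$. This matters for the golden ratio scheme: solving the definition of $\bar{x}_{n+1}$ for $x_{n+1}$ gives $x_{n+1}=\alpha\bar{x}_{n+1}+(1-\alpha)\bar{x}_{n}$ with $\alpha=\frac{\varphi}{\varphi-1}>1$. The following sketch checks the identity numerically for a few values of $\alpha$; the vectors are arbitrary illustrative data.

```python
import math

PHI = (math.sqrt(5.0) + 1.0) / 2.0


def sq_norm(v):
    return sum(t * t for t in v)


def lemma_2_3_gap(alpha, x, y):
    """Left-hand side minus right-hand side of the identity in Lemma 2.3."""
    comb = [alpha * a + (1.0 - alpha) * b for a, b in zip(x, y)]
    diff = [a - b for a, b in zip(x, y)]
    rhs = (alpha * sq_norm(x) + (1.0 - alpha) * sq_norm(y)
           - alpha * (1.0 - alpha) * sq_norm(diff))
    return sq_norm(comb) - rhs


x, y = [1.0, -2.0, 0.5], [0.3, 4.0, -1.0]
# alpha inside [0, 1], negative, and the value phi / (phi - 1) > 1.
gaps = [lemma_2_3_gap(a, x, y) for a in (0.25, -1.5, PHI / (PHI - 1.0))]
```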

Now, we have the following first main result:

Theorem 2.4.

Under the hypotheses (SP) and (LC), the sequence $\left\{x_{n}\right\}$ generated by Algorithm 2.1 converges to the unique solution $x^{\dagger}$ of the problem (VIP).

Proof.

It follows from the definition of $x_{n+1}$ and Lemma 2.2 that

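$$\langle x_{n+1}-\bar{x}_{n}+\lambda_{n}F(x_{n}),x-x_{n+1}\rangle\geq\lambda_{n}(g(x_{n+1})-g(x)),\quad\forall x\in\Re^{m},\tag{3}$$

which, with $x=x^{\dagger}$ , implies that

$$\langle x_{n+1}-\bar{x}_{n}+\lambda_{n}F(x_{n}),x^{\dagger}-x_{n+1}\rangle\geq\lambda_{n}(g(x_{n+1})-g(x^{\dagger})).$$

Thus we have

$$2\langle x_{n+1}-\bar{x}_{n},x^{\dagger}-x_{n+1}\rangle+2\lambda_{n}\langle F(x_{n}),x^{\dagger}-x_{n+1}\rangle\geq 2\lambda_{n}(g(x_{n+1})-g(x^{\dagger})).$$

This together with the equality $2\left\langle a,b\right\rangle=||a+b||^{2}-||a||^{2}-||b||^{2}$ and the hypothesis (SP) implies that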

[TABLE]

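Now, using the relation (3) with $n$ replaced by $n-1$ , we obtain

$$\langle x_{n}-\bar{x}_{n-1}+\lambda_{n-1}F(x_{n-1}),x-x_{n}\rangle\geq\lambda_{n-1}(g(x_{n})-g(x)),\quad\forall x\in\Re^{m}.\tag{5}$$

Substituting $x=x_{n+1}$ into the inequality (5), we get

$$\langle x_{n}-\bar{x}_{n-1}+\lambda_{n-1}F(x_{n-1}),x_{n+1}-x_{n}\rangle\geq\lambda_{n-1}(g(x_{n})-g(x_{n+1})).\tag{6}$$

Multiplying both sides of (6) by $\frac{2\lambda_{n}}{\lambda_{n-1}}>0$ and noting that $x_{n}-\bar{x}_{n-1}=\varphi(x_{n}-\bar{x}_{n})$ , we get

$$2\frac{\varphi\lambda_{n}}{\lambda_{n-1}}\langle x_{n}-\bar{x}_{n},x_{n+1}-x_{n}\rangle+2\lambda_{n}\langle F(x_{n-1}),x_{n+1}-x_{n}\rangle\geq 2\lambda_{n}(g(x_{n})-g(x_{n+1})).\tag{7}$$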
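The relation $x_{n}-\bar{x}_{n-1}=\varphi(x_{n}-\bar{x}_{n})$ used in this step follows directly from the definition of $\bar{x}_{n}$ in Algorithm 2.1. Spelled out:

```latex
\begin{align*}
\varphi\bar{x}_{n} &= (\varphi-1)x_{n}+\bar{x}_{n-1}
  && \text{(definition of } \bar{x}_{n}\text{)}\\
\bar{x}_{n-1} &= \varphi\bar{x}_{n}-(\varphi-1)x_{n}\\
x_{n}-\bar{x}_{n-1} &= \varphi x_{n}-\varphi\bar{x}_{n}
  = \varphi(x_{n}-\bar{x}_{n}).
\end{align*}
```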

Thus, using the identity $2\left\langle a,b\right\rangle=||a+b||^{2}-||a||^{2}-||b||^{2}$ , we have the following estimate:

[TABLE]

Adding both sides of the relations (4) and (8), we obtain

[TABLE]

Using the Lipschitz continuity of $F$ , we derive

[TABLE]

Since $x^{\dagger}$ is a solution of the problem (VIP), we have

$$\langle F(x^{\dagger}),x_{n}-x^{\dagger}\rangle+g(x_{n})-g(x^{\dagger})\geq 0.\tag{11}$$

Combining the relations (9)-(11), we get

[TABLE]

Moreover, from the definition of $\bar{x}_{n}$ and Lemma 2.3, it follows that

[TABLE]

Combining the relations (12) and (13), we obtain

[TABLE]

Since the sequence $\left\{\lambda_{n}\right\}$ is non-increasing, we obtain

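$$1+\frac{1}{\varphi}-\frac{\varphi\lambda_{n}}{\lambda_{n-1}}\geq 1+\frac{1}{\varphi}-\varphi=0.\tag{15}$$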
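The estimate obtained from the monotonicity of $\left\{\lambda_{n}\right\}$ rests on the defining property of the golden ratio, $\varphi^{2}=\varphi+1$. The short derivation below makes the two ingredients explicit: $\lambda_{n}\leq\lambda_{n-1}$ and $1+\frac{1}{\varphi}=\varphi$.

```latex
\begin{align*}
\varphi^{2} = \varphi+1
  \;&\Longrightarrow\; \varphi = 1+\frac{1}{\varphi}
  \;\Longrightarrow\; 1+\frac{1}{\varphi}-\varphi = 0,\\
\lambda_{n}\leq\lambda_{n-1}
  \;&\Longrightarrow\; \frac{\varphi\lambda_{n}}{\lambda_{n-1}}\leq\varphi
  \;\Longrightarrow\; 1+\frac{1}{\varphi}-\frac{\varphi\lambda_{n}}{\lambda_{n-1}}
  \geq 1+\frac{1}{\varphi}-\varphi = 0.
\end{align*}
```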

Since $\lambda_{n}\to 0$ , there exists $n_{0}\geq 1$ such that

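$$\frac{\varphi}{\lambda_{n-1}}-2L>0,\quad\forall n\geq n_{0}.$$

Thus, from $\lambda_{n+1}\leq\lambda_{n}$ , it follows that

$$\frac{\varphi\lambda_{n}}{\lambda_{n-1}}-\lambda_{n}L-\lambda_{n+1}L\geq\frac{\varphi\lambda_{n}}{\lambda_{n-1}}-\lambda_{n}L-\lambda_{n}L=\lambda_{n}\Big[\frac{\varphi}{\lambda_{n-1}}-2L\Big]>0,\quad\forall n\geq n_{0},$$

which implies that

$$\frac{\varphi\lambda_{n}}{\lambda_{n-1}}-\lambda_{n}L>\lambda_{n+1}L,\quad\forall n\geq n_{0}.\tag{16}$$

From the relations (14), (15) and (16), it follows that, for all $n\geq n_{0}$ ,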

[TABLE]

or

[TABLE]

where

[TABLE]

Thus the sequence $\left\{a_{n}\right\}_{n\geq n_{0}}$ is non-increasing. It is obvious that $\left\{a_{n}\right\}_{n\geq n_{0}}$ is bounded from below by [math]. Thus the limit $\lim\limits_{n\to\infty}a_{n}$ exists and $\lim\limits_{n\to\infty}a_{n}\in\Re$ . This implies that $\left\{\bar{x}_{n}\right\}$ is bounded. Thus, from the definition of $\bar{x}_{n}$ , we also see that the sequence $\left\{x_{n}\right\}$ is bounded. This together with the fact $\lambda_{n}\to 0$ implies that $\lambda_{n}L||x_{n}-x_{n-1}||^{2}\to 0$ as $n\to\infty$ . Hence, from the definition of $a_{n}$ , it follows that

[TABLE]

Also, from the relation (17), we obtain $\sum\limits_{n=n_{0}}^{\infty}b_{n}<+\infty$ . Thus, from the definition of $b_{n}$ , we obtain

[TABLE]

It follows from the conditions (H3) and (S1) that $||\bar{x}_{n}-x_{n}||^{2}\to 0$ as $n\to\infty$ . Thus, from the relation (18), we get

[TABLE]

On the other hand, from (H2) and (S2), we obtain that $\liminf\limits_{n}||x_{n}-x^{\dagger}||^{2}=0$ . This together with the relation (19) implies that $\lim\limits_{n\to\infty}||x_{n}-x^{\dagger}||^{2}=0$ or the sequence $\left\{x_{n}\right\}$ converges to $x^{\dagger}$ , which solves uniquely the problem (VIP). This completes the proof. ∎

Remark 1.

It follows from (S2) that $\lambda_{n}||x_{n}-x^{\dagger}||^{2}<\frac{1}{n}$ . Thus, if we choose $\lambda_{n}=\frac{1}{n^{p}}$ (with $0<p<1$ ), we obtain an estimate of the convergence rate of the sequence $\left\{x_{n}\right\}$ generated by Algorithm 2.1 that

[TABLE]

3 Golden Ratio Algorithm without diminishing stepsizes

In this section, we introduce a simple stepsize rule in which the stepsizes are updated at each iteration using only the problem data and the previous iterates, without prior knowledge of the Lipschitz constant. Unlike in the previous section, the stepsizes generated by the next algorithm are bounded from below by a positive constant. Another stepsize rule for a golden ratio algorithm can be found in [6]. For the sake of simplicity, we adopt the convention $\frac{0}{0}=+\infty$ .

Now, we describe the algorithm in detail as follows:

Algorithm 3.1 (Golden Ratio Algorithm without diminishing stepsizes).

Initialization: Choose $\bar{x}_{0},~{}x_{1},~{}x_{0}\in\Re^{m}$ , $\lambda_{0}>0$ , $\mu\in\left(0,\frac{\varphi}{2}\right)$ .

Iterative Steps: Assume that $\bar{x}_{n-1},~{}x_{n-1},~{}x_{n}$ are known, calculate $x_{n+1}$ as follows:

[TABLE]

where

[TABLE]

Remark 2.

It follows from the definition of $\left\{\lambda_{n}\right\}$ that this sequence is non-increasing. Moreover, from the Lipschitz continuity of $F$ and in the case $F(x_{n})\neq F(x_{n-1})$ , we see that

[TABLE]

Thus we obtain by the induction that $\lambda_{n}\geq\min\left\{\lambda_{0},\frac{\mu}{L}\right\}$ and so there exists $\lambda>0$ such that

[TABLE]
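The following Python sketch illustrates an adaptive golden ratio iteration of this kind. The displayed formulas of Algorithm 3.1 are not reproduced in this text, so the sketch rests on two explicit assumptions: (i) the iteration has the same form as in Algorithm 2.1, and (ii) the stepsize is updated by $\lambda_{n}=\min\left\{\lambda_{n-1},\frac{\mu||x_{n}-x_{n-1}||}{||F(x_{n})-F(x_{n-1})||}\right\}$, a rule chosen because it is consistent with Remark 2 (non-increasing, $\lambda_{n}\geq\min\{\lambda_{0},\mu/L\}$, convention $\frac{0}{0}=+\infty$). All names and the test problem are hypothetical.

```python
import math

PHI = (math.sqrt(5.0) + 1.0) / 2.0


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def next_stepsize(lam_prev, mu, x, x_prev, Fx, Fx_prev):
    """ASSUMED rule: min(lam_prev, mu * ||x - x_prev|| / ||F(x) - F(x_prev)||)."""
    dF = norm([a - b for a, b in zip(Fx, Fx_prev)])
    if dF == 0.0:  # ratio is +infinity (this also covers the convention 0/0 = +infinity)
        return lam_prev
    dx = norm([a - b for a, b in zip(x, x_prev)])
    return min(lam_prev, mu * dx / dF)


def golden_ratio_adaptive(F, prox, x0, x1, xbar0, lam0=1.0, mu=0.3 * PHI, iters=300):
    """Sketch of an adaptive golden ratio iteration; mu must lie in (0, phi / 2)."""
    x_prev, x, xbar, lam = list(x0), list(x1), list(xbar0), lam0
    Fx_prev = F(x_prev)
    steps = []
    for _ in range(iters):
        Fx = F(x)
        lam = next_stepsize(lam, mu, x, x_prev, Fx, Fx_prev)
        steps.append(lam)
        # Assumed to mirror Algorithm 2.1: averaging step, then one prox step.
        xbar = [((PHI - 1.0) * xi + bi) / PHI for xi, bi in zip(x, xbar)]
        x_prev, Fx_prev = x, Fx
        x = prox([bi - lam * fi for bi, fi in zip(xbar, Fx)], lam)
    return x, steps


# Hypothetical test problem with solution (1, 1); the Lipschitz constant is never passed in.
M = [[2.0, 0.5], [0.5, 1.0]]
q = [-2.5, -1.5]
F = lambda x: [sum(M[i][j] * x[j] for j in range(2)) + q[i] for i in range(2)]
proj = lambda v, lam: [max(vi, 0.0) for vi in v]

sol, steps = golden_ratio_adaptive(F, proj, [0.0, 0.0], [0.5, 0.5], [0.5, 0.5])
```

The list `steps` makes it easy to inspect the behaviour described in Remark 2: the recorded stepsizes never increase and stay above $\min\{\lambda_{0},\mu/L\}$.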

3.1 The convergence of Algorithm 3.1

In this subsection, we study the convergence of Algorithm 3.1. We weaken the assumptions imposed on the cost operator $F$ : it only needs to satisfy the above condition (LC) and the following pseudomonotonicity condition (PC):

(PC) $F$ is pseudomonotone, i.e., the following implication holds:

[TABLE]

We have the following second result:

Theorem 3.2.

Under the conditions (LC) and (PC), the sequence $\left\{x_{n}\right\}$ generated by Algorithm 3.1 converges to a solution of the problem (VIP).

Proof.

By arguing similarly to the relation (9) with $F$ being pseudomonotone and $x^{*}\in VI(F,g)$ , we obtain

[TABLE]

where, as in the derivation of (9), we note that $\left\langle F(x^{*}),x_{n}-x^{*}\right\rangle+g(x_{n})-g(x^{*})\geq 0$ . From the definition of $\lambda_{n}$ , we see that

[TABLE]

We have the following fact (see, the relation (13)):

[TABLE]

Combining the relations (21) - (23), we get

[TABLE]

Since $\left\{\lambda_{n}\right\}$ is non-increasing, we get

[TABLE]

Since $\mu\in\left(0,\frac{\varphi}{2}\right)$ , it follows from the relation (20) that

[TABLE]

Therefore, there exists $n_{0}\geq 1$ such that

[TABLE]

From the relations (24)-(26), we derive

[TABLE]

or

[TABLE]

where

[TABLE]

Thus the limit of $\left\{\bar{a}_{n}\right\}$ exists and $\sum\limits_{n=n_{0}}^{\infty}\bar{b}_{n}<+\infty$ . Hence the sequences $\left\{\bar{x}_{n}\right\}$ and $\left\{x_{n}\right\}$ are bounded. Moreover, we also see that $\sum\limits_{n=n_{0}}^{\infty}\frac{\varphi\lambda_{n}}{\lambda_{n-1}}||x_{n}-\bar{x}_{n}||^{2}<+\infty$ , which, together with the relation (20), implies that

[TABLE]

Thus, since $x_{n}-\bar{x}_{n-1}=\varphi(x_{n}-\bar{x}_{n})$ , we obtain

[TABLE]

It is obvious that $||x_{n+1}-\bar{x}_{n}||^{2}\to 0$ as $n\to\infty$ . This together with (28) implies that

[TABLE]

From the definition of $x_{n}$ and Lemma 2.2, we see that

[TABLE]

Now, assume that $p$ is a cluster point of $\left\{x_{n}\right\}$ , i.e., there exists a subsequence $\left\{x_{k}\right\}$ of $\left\{x_{n}\right\}$ converging to $p$ . Passing to the limit in (31) when $n=k\to\infty$ and using (20) and (29), we obtain

[TABLE]

which says that $p$ is a solution of the problem (VIP).

Now, in order to finish the proof, we prove that the whole sequence $\left\{x_{n}\right\}$ converges to $p$ . Indeed, assume that $\left\{x_{l}\right\}$ is another subsequence of $\left\{x_{n}\right\}$ converging to $\bar{p}\neq p$ . Note that, as mentioned above, $\bar{p}$ is also a solution of the problem (VIP). From $\lim\limits_{n\to\infty}\bar{a}_{n}\in\Re$ and the relation (30), it follows that $\lim\limits_{n\to\infty}||\bar{x}_{n}-x^{*}||^{2}\in\Re$ and thus $\lim\limits_{n\to\infty}||x_{n}-x^{*}||^{2}\in\Re$ for each $x^{*}\in VI(F,g)$ . We have the following equality:

[TABLE]

Thus, since $\lim\limits_{n\to\infty}||x_{n}-p||^{2}\in\Re$ and $\lim\limits_{n\to\infty}||x_{n}-\bar{p}||^{2}\in\Re$ , we obtain that $\lim\limits_{n\to\infty}\left\langle x_{n},p-\bar{p}\right\rangle\in\Re$ . Set

[TABLE]

Now, passing to the limit in (33) as $n=k,~{}l\to\infty$ , we obtain

[TABLE]

Thus $||p-\bar{p}||^{2}=0$ or $\bar{p}=p$ . This completes the proof. ∎

3.2 The convergence rate of Algorithm 3.1

This subsection deals with the convergence rate of Algorithm 3.1. In order to get the rate of convergence, we choose $\mu$ in Algorithm 3.1 such that $0<\mu<\frac{\rho}{1+\rho}\varphi$ with $\rho\in\left(0,\frac{1}{\sqrt{5}}\right)$ and assume that the operator $F$ satisfies the aforementioned conditions (SP) and (LC).

The convergence rate of Algorithm 3.1 is given by the following result.

Theorem 3.3.

Under the conditions (SP) and (LC), the sequence $\left\{x_{n}\right\}$ generated by Algorithm 3.1 converges at least linearly to the unique solution $x^{\dagger}$ of the problem (VIP).

Proof.

By arguing similarly to the relation (14), we have

[TABLE]

Thus we have

[TABLE]

Note that, from the definition of $\bar{x}_{n}$ , we obtain

[TABLE]

Thus it follows from Lemma 2.3 that

[TABLE]

From the relations (35) and (36), we see that

[TABLE]

in which the last inequality follows from the following:

[TABLE]

Now, if we set $a_{n}=\frac{\varphi}{\varphi-1}||\bar{x}_{n}-x^{\dagger}||^{2}$ , then the inequality (37) can be rewritten as follows:

[TABLE]

Let $\beta\in(1/\varphi,1)$ be fixed. From the relation (20) and the fact $0<\mu<\frac{\rho\varphi}{1+\rho}$ , we have

[TABLE]

Thus there exists $n_{0}\geq 1$ such that

[TABLE]

Combining the relations (38) and (39) and noting that $\lambda_{n}\geq\lambda$ , we obtain

[TABLE]

Set $b_{n}=\frac{\mu}{\rho}||x_{n}-x_{n-1}||^{2}$ and $\alpha=2\gamma\lambda$ . Then we can rewrite the relation (40) as follows:

[TABLE]

On the other hand, since $F$ is strongly monotone, we get

[TABLE]

Thus we have $\gamma\lambda_{n}\leq\mu$ or $\alpha=2\gamma\lambda_{n}\leq 2\mu<\frac{2\rho\varphi}{1+\rho}<1$ . Let $r_{1}>0$ and $r_{2}>0$ . Now, we can rewrite the relation (41) as follows:

[TABLE]

Choose $r_{1}>0$ and $r_{2}>0$ such that $1-\alpha-r_{2}+r_{1}=0$ and $\alpha\beta-r_{1}r_{2}=0$ . Thus we have

[TABLE]

Consider the function

[TABLE]

Then we have

[TABLE]

because of $\frac{1}{\varphi}<\beta<1$ . Thus $f(t)$ is non-increasing on $(0,1)$ . Hence we have $0<r_{2}=f(\alpha)<f(0)=1$ . Now, set $\theta=\max\left\{\rho,r_{2}\right\}$ and note $\theta\in(0,1)$ . Then it follows from the relation (42) that

[TABLE]

Thus, by induction, we obtain

[TABLE]

and so we can deduce that

[TABLE]

or

[TABLE]

where $M=\frac{(\varphi-1)(a_{n_{0}}+r_{1}a_{n_{0}-1}+b_{n_{0}})}{\varphi\theta^{n_{0}-1}}$ . This completes the proof. ∎

4 Numerical experiments

In this section, we perform several experiments to show the numerical behaviour of the proposed algorithm (Algorithm 3.1) in comparison with other algorithms. All the programs are written in Matlab 7.0 and run on a desktop PC with an Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz and 2.00 GB of RAM.

Example 4.1 In this example, our problem of interest is a sparse logistic regression which is a popular problem in machine learning applications:

[TABLE]

where $x\in\Re^{m}$ , $a_{i}\in\Re^{m}$ , $b_{i}\in\left\{-1,1\right\}$ , $\gamma>0$ .

Let $K$ be a matrix of size $N\times m$ defined by $K_{ij}=-b_{i}a_{ij}$ and $\bar{f}(y)=\sum_{i=1}^{N}\log\left(1+\exp(y_{i})\right)$ . Then, the objective function in our problem is $J(x)=f(x)+g(x)$ with $f(x)=\bar{f}(Kx)$ and $g(x)=\gamma||x||_{1}$ . This problem is equivalent to the considered problem with $F=\nabla f$ . We compare Algorithm 3.1 (GRADS) with EGRAAL in [6] and FISTA with constant stepsize in [1]. We do not include algorithms with linesearch procedures because they require many computations in each iteration, which is time-consuming. Note that FISTA requires the Lipschitz constant of $F$ ( $L_{\nabla f}=||K^{T}K||/4$ ) while the other algorithms do not. All entries of $a_{i}$ and $b_{i}$ are generated randomly and we choose $\gamma=0.005||A^{T}b||_{\infty}$ . We choose $\mu=0.45\varphi$ , $\lambda_{0}=1$ for Algorithm 3.1 (GRADS); $\lambda_{0}=\bar{\lambda}=1$ , $\phi=\varphi$ for the algorithm EGRAAL. The starting points are generated randomly in $(0,1]$ . The mapping prox is computed by the function fmincon in Matlab. The results are shown in Figure 2 and Figure 2. The execution times of the algorithms are almost the same. In these figures, $J_{*}$ is the smallest value of $J$ generated by all the algorithms with the stopping criterion $||x_{n}-{\rm prox}_{g}(x_{n}-\nabla f(x_{n}))||\leq 10^{-3}$ .
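For readers who want to reproduce the setting of Example 4.1, the operator $F=\nabla f$ with $f(x)=\bar{f}(Kx)$ can be written out via the chain rule as $\nabla f(x)=K^{T}s$, where $s_{i}=1/(1+\exp(-(Kx)_{i}))$. The sketch below implements this formula and the loss in plain Python; the small matrix `K` and the point `x` are hypothetical stand-ins for the randomly generated data, and the code is not the authors' Matlab implementation.

```python
import math


def matvec(K, x):
    return [sum(kij * xj for kij, xj in zip(row, x)) for row in K]


def logistic_loss(K, x):
    """f(x) = sum_i log(1 + exp((Kx)_i))."""
    return sum(math.log1p(math.exp(t)) for t in matvec(K, x))


def logistic_grad(K, x):
    """F(x) = grad f(x) = K^T s with s_i = 1 / (1 + exp(-(Kx)_i))."""
    s = [1.0 / (1.0 + math.exp(-t)) for t in matvec(K, x)]
    return [sum(K[i][j] * s[i] for i in range(len(K))) for j in range(len(x))]


# Hypothetical small instance (N = 3 samples, m = 2 features).
K = [[1.0, -2.0], [0.5, 0.3], [-1.0, 1.5]]
x = [0.2, -0.1]
grad = logistic_grad(K, x)
```

A finite-difference comparison of `logistic_loss` against `logistic_grad` is a cheap way to validate the gradient before plugging it into any of the compared algorithms.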

Example 4.2 Consider the nonlinear problem, which is presented by Sun [46], for the operator $A:\Re^{m}\to\Re^{m}$ of the form:

[TABLE]

where $c=(-1,-1,\cdots,-1)$ and $D$ is a square matrix of order $m$ , which is given by

[TABLE]

The feasible set is $C=\Re^{m}_{+}$ . This problem is equivalent to the considered problem with $g=\delta_{C}$ . In this case, the mapping prox is the projection on the set $C$ and it is computed by the function quadprog in Matlab 7.0. Since the Lipschitz constant of $F$ is unknown, we do not include the comparison with the algorithm FISTA. We use the sequence $D_{n}=||x_{n+1}-\bar{x}_{n}||^{2}+||\bar{x}_{n}-x_{n}||^{2}$ for each $n=0,1,2,\cdots$ to compare the computational performance of the algorithms. Figure 4 and Figure 4 describe the results in this example.

The numerical results here illustrate that the proposed algorithm works well and is competitive with the other algorithms.

5 Conclusions

In this paper, we have introduced two golden ratio algorithms with two simple stepsize rules for solving pseudomonotone and Lipschitz variational inequalities in finite dimensional Hilbert spaces. The first algorithm uses a sequence of stepsizes chosen in advance with some suitable properties, while the second one generates variable stepsizes which are computed explicitly in each iteration without running a linesearch procedure. We have established the convergence as well as the convergence rate of the new algorithms under appropriate conditions. The theoretical results have been illustrated by some numerical experiments.

Our results can be extended in many promising directions, such as multi-valued variational inequalities [47], equilibrium problems [49, 48, 50, 51], the problem (VIP) combined with fixed point problems, systems of variational inequalities and mixed equilibrium problems [30, 26, 32, 38, 39], weak and strong convergence in Hilbert spaces as well as extensions to Banach spaces [33, 52]. These are goals for our future work.

Acknowledgement

The authors would like to thank the Associate Editor and the two anonymous referees for their valuable comments and suggestions which helped us very much in improving the original version of this paper. The research of the first author was supported by the National Foundation for Science and Technology Development (NAFOSTED) of Vietnam under grant number 101.01-2017.315. The research work was also supported by the National Natural Science Foundation of China (11771067) and the Applied Basic Project of Sichuan Province (19YYJC0157). We also would like to thank Dr. Yura Malitsky for sending us the paper [6].

Bibliography

[1] Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences 2009; 2:183-202.
[2] Daniele P, Giannessi F, Maugeri A. Equilibrium Problems and Variational Models. Springer: Kluwer, 2003.
[3] Facchinei F, Pang JS. Finite-Dimensional Variational Inequalities and Complementarity Problems. Springer: Berlin, 2002.
[4] Giannessi F, Maugeri A, Pardalos PM. Equilibrium Problems: Nonsmooth Optimization and Variational Inequality Models. Springer: Dordrecht, 2004.
[5] Kinderlehrer D, Stampacchia G. An Introduction to Variational Inequalities and Their Applications. Academic Press: New York, 1980.
[6] Malitsky YV. Golden ratio algorithms for variational inequalities. 2018. https://arxiv.org/abs/1803.08832.
[7] Konnov IV. Combined Relaxation Methods for Variational Inequalities. Springer: Berlin, 2000.
[8] Konnov IV. Equilibrium Models and Variational Inequalities. Elsevier: Amsterdam, 2007.
