Geometric Convergence of Distributed Gradient Play in Games with Unconstrained Action Sets

Tatiana Tatarenko; Angelia Nedich

arXiv:1907.07144·math.OC·July 17, 2019

TL;DR

This paper introduces a simple distributed gradient algorithm that guarantees geometric convergence to Nash equilibria in non-cooperative games with unconstrained actions, outperforming previous methods in convergence speed.

Contribution

It presents a standard distributed gradient play algorithm with a single step-size parameter, proves its geometric convergence without relying on an augmented game mapping, and shows that its convergence rate compares favorably with that of the GRANE algorithm.

Findings

01

Proves geometric convergence of the proposed algorithm.

02

Demonstrates faster convergence than the GRANE algorithm.

03

Requires only one parameter to ensure convergence.

Abstract

We provide a distributed algorithm to learn a Nash equilibrium in a class of non-cooperative games with strongly monotone mappings and unconstrained action sets. Each player has access to her own smooth local cost function and can communicate to her neighbors in some undirected graph. We consider a distributed communication-based gradient algorithm. For this procedure, we prove geometric convergence to a Nash equilibrium. In contrast to our previous works [15], [16], where the proposed algorithms required two parameters to be set up and the analysis was based on a so called augmented game mapping, the procedure in this work corresponds to a standard distributed gradient play and, thus, only one constant step size parameter needs to be chosen appropriately to guarantee fast convergence to a game solution. Moreover, we provide a rigorous comparison between the convergence rate of the…

Equations

$$\mathbf{F}(x)\triangleq[\nabla_{1}J_{1}(x_{1},x_{-1}),\ldots,\nabla_{n}J_{n}(x_{n},x_{-n})]^{T},$$

$$|\nabla_{i}J_{i}(x)-\nabla_{i}J_{i}(y)|\leq L_{i}\|x-y\|.$$

$$J_{i}(x_{i},x_{-i})=c_{i}(x_{i})-x_{i}U_{i}\Big(\sum_{j=1}^{n}x_{j}\Big).$$

$$\|Wx-\mathbf{1}\bar{x}\|\leq\sigma\|x-\mathbf{1}\bar{x}\|,$$

$$J_{i}(x_{i}^{*},x_{-i}^{*})\leq J_{i}(x_{i},x_{-i}^{*}).$$

$$\mathbf{F}(x^{*})=0.$$

$$x_{(i)}=[\tilde{x}_{(i)1},\ldots,\tilde{x}_{(i)i-1},x_{i},\tilde{x}_{(i)i+1},\ldots,\tilde{x}_{(i)n}]^{T}\in\mathbb{R}^{n},$$

$$\tilde{x}_{-i}=[\tilde{x}_{(i)1},\ldots,\tilde{x}_{(i)i-1},\tilde{x}_{(i)i+1},\ldots,\tilde{x}_{(i)n}]^{T}\in\mathbb{R}^{n-1},$$

$$\tilde{x}_{(:)j}=[\tilde{x}_{(1)j},\ldots,\tilde{x}_{(j-1)j},x_{j},\tilde{x}_{(j+1)j},\ldots,\tilde{x}_{(n)j}]^{T}\in\mathbb{R}^{n}.$$

$$\mathbf{x}\triangleq\begin{pmatrix}\textrm{---}&x_{(1)}^{T}&\textrm{---}\\ \textrm{---}&x_{(2)}^{T}&\textrm{---}\\ &\vdots&\\ \textrm{---}&x_{(n)}^{T}&\textrm{---}\end{pmatrix}.$$

$$\tilde{\mathbf{F}}(\mathbf{x})\triangleq\mathrm{Diag}(\nabla_{1}J_{1}(x_{(1)}),\ldots,\nabla_{n}J_{n}(x_{(n)})).$$

$$x_{i}^{t+1}=\sum_{j=1}^{n}w_{ij}\tilde{x}_{(j)i}^{t}-\alpha\nabla_{x_{i}}J_{i}(x_{(i)}^{t}),$$

$$\tilde{x}_{(i)l}^{t+1}=\sum_{j=1}^{n}w_{ij}\tilde{x}_{(j)l}^{t},\quad\text{for }l\neq i,\ i\in[n].$$

$$\mathbf{x}^{t+1}=W\mathbf{x}^{t}-\alpha\tilde{\mathbf{F}}(\mathbf{x}^{t}),$$

$$0<\alpha<\min\left\{1,\ \frac{\mu}{2L^{2}},\ \frac{\sigma}{2L}\frac{\sqrt{n}}{\sqrt{n-1}}\left(\sqrt{\frac{2}{1+\sigma^{2}}}-1\right),\ \frac{n}{\mu}\left(\frac{8}{(\sqrt{1+\sigma^{2}}-\sqrt{2})^{2}}-1\right),\ \frac{\sqrt{n^{2}+\frac{2\mu^{4}(1-\sigma^{2})}{(n-1)L^{4}(1+\sigma^{2})}}-n}{2\mu}\right\},$$

$$\|\mathbf{x}^{t}-\mathbf{x}^{*}\|_{\mathrm{Fro}}^{2}\leq O(q^{t})$$

$$\bar{\mathbf{x}}^{t+1}=\bar{\mathbf{x}}^{t}-\frac{\alpha}{n}\mathbf{F}^{0}(\mathbf{x}^{t}),$$

$$\|\mathbf{x}^{t+1}-\bar{\mathbf{x}}^{t+1}\|_{\mathrm{Fro}}\leq\sigma\|\mathbf{x}^{t}-\bar{\mathbf{x}}^{t}\|_{\mathrm{Fro}}+\alpha\frac{\sqrt{n-1}}{\sqrt{n}}\|\tilde{\mathbf{F}}(\mathbf{x}^{t})\|_{\mathrm{Fro}}.$$

$$\begin{aligned}\|\mathbf{x}^{t+1}-\bar{\mathbf{x}}^{t+1}\|_{\mathrm{Fro}}&=\Big\|W\mathbf{x}^{t}-\alpha\tilde{\mathbf{F}}(\mathbf{x}^{t})-\bar{\mathbf{x}}^{t}+\frac{\alpha}{n}\mathbf{F}^{0}(\mathbf{x}^{t})\Big\|_{\mathrm{Fro}}\\&\leq\sigma\|\mathbf{x}^{t}-\bar{\mathbf{x}}^{t}\|_{\mathrm{Fro}}+\frac{\alpha}{n}\|n\tilde{\mathbf{F}}(\mathbf{x}^{t})-\mathbf{F}^{0}(\mathbf{x}^{t})\|_{\mathrm{Fro}}.\end{aligned}$$

$$n\tilde{\mathbf{F}}(\mathbf{x}^{t})-\mathbf{F}^{0}(\mathbf{x}^{t})=\begin{pmatrix}(n-1)\nabla_{1}J_{1}(x_{(1)}^{t})&\cdots&-\nabla_{n}J_{n}(x_{(n)}^{t})\\ -\nabla_{1}J_{1}(x_{(1)}^{t})&\cdots&-\nabla_{n}J_{n}(x_{(n)}^{t})\\ \vdots&\ddots&\vdots\\ -\nabla_{1}J_{1}(x_{(1)}^{t})&\cdots&(n-1)\nabla_{n}J_{n}(x_{(n)}^{t})\end{pmatrix}$$

$$\|n\tilde{\mathbf{F}}(\mathbf{x}^{t})-\mathbf{F}^{0}(\mathbf{x}^{t})\|_{\mathrm{Fro}}=\sqrt{n(n-1)}\,\|\tilde{\mathbf{F}}(\mathbf{x}^{t})\|_{\mathrm{Fro}},$$

$$\|\tilde{\mathbf{F}}(\mathbf{x}^{t})\|_{\mathrm{Fro}}\leq L\|\mathbf{x}^{t}-\mathbf{x}^{*}\|_{\mathrm{Fro}},$$

$$\begin{aligned}\|\tilde{\mathbf{F}}(\mathbf{x}^{t})\|_{\mathrm{Fro}}^{2}&=\sum_{i=1}^{n}\big(\nabla_{i}J_{i}(x_{(i)}^{t})-\nabla_{i}J_{i}(x^{*})\big)^{2}\\&\leq\sum_{i=1}^{n}L_{i}^{2}\|x_{(i)}^{t}-x^{*}\|^{2}\leq L^{2}\sum_{i=1}^{n}\|x_{(i)}^{t}-x^{*}\|^{2}\\&=L^{2}\|\mathbf{x}^{t}-\mathbf{x}^{*}\|_{\mathrm{Fro}}^{2}.\end{aligned}$$

$$\Big(1+\frac{2\alpha}{n}\Big(\mu-\frac{\theta}{2}\Big)\Big)\|\bar{\mathbf{x}}^{t+1}-\mathbf{x}^{*}\|_{\mathrm{Fro}}^{2}\leq\|\bar{\mathbf{x}}^{t}-\mathbf{x}^{*}\|_{\mathrm{Fro}}^{2}+\frac{L^{2}\alpha}{\theta}\|\mathbf{x}^{t}-\bar{\mathbf{x}}^{t}\|_{\mathrm{Fro}}^{2}.$$

$$\begin{aligned}\|\bar{x}^{t+1}-x^{*}\|^{2}&=\|\bar{x}^{t}-x^{*}\|^{2}-2\langle\bar{x}^{t+1}-x^{*},\bar{x}^{t}-\bar{x}^{t+1}\rangle-\|\bar{x}^{t}-\bar{x}^{t+1}\|^{2}\\&=\|\bar{x}^{t}-x^{*}\|^{2}-\frac{2\alpha}{n}\langle\bar{x}^{t+1}-x^{*},\hat{\mathbf{F}}(\mathbf{x}^{t})\rangle-\|\bar{x}^{t}-\bar{x}^{t+1}\|^{2}.\end{aligned}$$

Full text

Tatiana Tatarenko and Angelia Nedić

Tatiana Tatarenko is with the Control Methods and Robotics Lab, TU Darmstadt, Germany. Angelia Nedić is with the School of Electrical, Computer and Energy Engineering, Arizona State University, USA. The work has been partially supported by Office of Naval Research grant no. N00014-12-1-0998.

Abstract

We provide a distributed algorithm to learn a Nash equilibrium in a class of non-cooperative games with strongly monotone mappings and unconstrained action sets. Each player has access to her own smooth local cost function and can communicate to her neighbors in some undirected graph. We consider a distributed communication-based gradient algorithm. For this procedure, we prove geometric convergence to a Nash equilibrium. In contrast to our previous works [16, 15], where the proposed algorithms required two parameters to be set up and the analysis was based on a so called augmented game mapping, the procedure in this work corresponds to a standard distributed gradient play and, thus, only one constant step size parameter needs to be chosen appropriately to guarantee fast convergence to a game solution. Moreover, we provide a rigorous comparison between the convergence rate of the proposed distributed gradient play and the rate of the GRANE algorithm presented in [15]. It allows us to demonstrate that the distributed gradient play outperforms the GRANE in terms of convergence speed.

I Introduction

In many multi-agent systems, the agents' objective functions are coupled through the decision variables of all agents in the system. In such cases, game theory is a useful tool for dealing with the corresponding optimization problems. Applications of game theory in engineering can be found, for example, in electricity markets, power systems, flow control problems, and communication networks [1, 9, 12]. Desirable outcomes in games are characterized by so-called Nash equilibria, which correspond to a stable state from which no agent has an incentive to deviate. This paper provides a distributed discrete-time algorithm for fast Nash equilibrium seeking in a class of non-cooperative games, under the assumption that agents exchange their local information with neighbors over some communication topology.

Distributed communication-based algorithms have been proposed for aggregative games [3, 6]. Communication protocols have also been applied to different classes of games with some convergence guarantees [10, 11, 14]. The work [10] proposes a gradient-based gossip algorithm to learn Nash equilibria in games. Under some technical assumptions, this algorithm converges almost surely to the Nash equilibrium, given a diminishing step size. Under the further assumption of strong convexity, with some constant step size $\alpha$, the algorithm converges on average to an $O(\alpha)$ neighborhood of the Nash equilibrium. The work [11] develops an algorithm within the framework of inexact-ADMM and proves its convergence to the Nash equilibrium with the rate $o(1/k)$ under cocoercivity of the game mapping. However, none of the aforementioned works provides algorithms that converge to a Nash equilibrium with a fast geometric rate.

The paper [16] leverages the idea of an accelerated approach for solving variational inequalities [4] and provides a version of the gradient play algorithm (Acc-GRANE) that guarantees fast convergence to the Nash equilibrium with an explicitly good dependence on the condition number. The analysis is based on strong monotonicity properties of a so-called augmented mapping, which takes into account not only the gradients of the cost functions but also the communication settings. The presented algorithm is applicable only to a sub-class of games characterized by a restrictive connection between the number of players, the Lipschitz constant, and the strong monotonicity parameter. To apply the distributed gradient play algorithm to a broader class of games, the work [15] considers the case of a restricted strongly monotone augmented game mapping and demonstrates geometric convergence of the procedure to the Nash equilibrium. However, both types of procedures mentioned above require a careful set-up not only of the step size parameter but also of the augmented mapping. In this paper we provide a distributed gradient play whose convergence properties are not based on the augmented mapping. This allows us to focus only on the choice of the step size in the optimization procedure. Moreover, a rigorous comparison between the convergence rates of the proposed distributed gradient play and the GRANE in [15] demonstrates that the algorithm presented in this paper converges faster to a Nash equilibrium under the same settings of a game.

This paper is organized as follows. In Section II, we set up the game under consideration. In Section III, we present the distributed gradient play algorithm to seek its solutions. In Section IV, we prove the main result stating a geometric convergence of the proposed procedure. Section V compares the convergence rates of the proposed algorithm and the GRANE from work [15]. We provide a numerical case study in Section VI. In Section VII, we summarize the result and discuss future work.

Notations. The set $\{1,\ldots,n\}$ is denoted by $[n]$. For any function $f:K\to{\mathbb{R}}$, $K\subseteq{\mathbb{R}}^{n}$, $\nabla_{i}f(x)=\frac{\partial f(x)}{\partial x_{i}}$ is the partial derivative taken with respect to the $i$th coordinate of the vector variable $x\in{\mathbb{R}}^{n}$. For any real vector space $\tilde{E}$ its dual space is denoted by $\tilde{E}^{*}$ and the inner product is denoted by $\langle u,v\rangle$, $u\in\tilde{E}^{*}$, $v\in\tilde{E}$. An operator $B:\tilde{E}\to\tilde{E}^{*}$ is positive definite if $\langle Bv,v\rangle>0$ for all $v\in\tilde{E}\setminus\{0\}$. An operator $B:\tilde{E}\to\tilde{E}^{*}$ is self-adjoint if $\langle Bv,v^{\prime}\rangle=\langle Bv^{\prime},v\rangle$ for all $v^{\prime},v\in\tilde{E}$. Given a positive definite and self-adjoint operator $B$, we define the Euclidean norm on $\tilde{E}$ induced by $B$ as $\|v\|=\langle Bv,v\rangle^{1/2}$. Any mapping $g:\tilde{E}\to\tilde{E}^{*}$ is said to be strongly monotone with the constant $\mu>0$ on $Q\subseteq\tilde{E}$ if $\langle g(u)-g(v),u-v\rangle\geq\mu\|u-v\|^{2}$ for any $u,v\in Q$, where $\|\cdot\|$ is the corresponding norm in $\tilde{E}$. We consider a real vector space $E$, which is either the space of real vectors $E=E^{*}={\mathbb{R}}^{n}$ or the space of real matrices $E=E^{*}={\mathbb{R}}^{n\times n}$. In the case $E={\mathbb{R}}^{n\times n}$ the inner product $\langle u,v\rangle\triangleq{\mathrm{trace}}(u^{T}v)$ is the Frobenius inner product on ${\mathbb{R}}^{n\times n}$. In the case $E={\mathbb{R}}^{n}$ we use $\|\cdot\|$ to denote the Euclidean norm induced by the standard dot product in ${\mathbb{R}}^{n}$, whereas in the case $E={\mathbb{R}}^{n\times n}$ we use $\|\cdot\|_{{\mathrm{Fro}}}$ to denote the Frobenius norm induced by the Frobenius inner product, i.e. $\|v\|_{{\mathrm{Fro}}}\triangleq\sqrt{{\mathrm{trace}}(v^{T}v)}$. The largest singular value of a matrix $A$ is denoted by ${\sigma_{\max}\{A\}}$. The smallest nonzero eigenvalue of a positive semidefinite matrix $A\not=0$ is denoted by ${\tilde{\lambda}_{\min}\{A\}}$, which is strictly positive. For any matrix $A\in{\mathbb{R}}^{n\times n}$ we use ${\mathrm{diag}}(A)$ to denote its diagonal vector, i.e. ${\mathrm{diag}}(A)=(a_{11},\ldots,a_{nn})$. For any vector $a\in{\mathbb{R}}^{n}$ we use ${\mathrm{Diag}}(a)$ to denote the diagonal matrix with the vector $a$ on its diagonal. We call a matrix $A$ consensual if it has equal row vectors.

II Problem Formulation

We consider a non-cooperative game between $n$ players with unconstrained action sets. Let $J_{i}$ and ${\Omega}_{i}={\mathbb{R}}$ denote respectively the cost function and the feasible action set of player $i$. (All results below are applicable to games with different dimensions $\{d_{i}\}$ of the action sets, i.e., $\Omega_{i}={\mathbb{R}}^{d_{i}}$ for all $i$; the one-dimensional case is considered for the sake of notational simplicity.) Each function $J_{i}(x_{i},x_{-i})$, $i\in[n]$, depends on $x_{i}$ and $x_{-i}$, where $x_{i}\in{\mathbb{R}}$ is the action of player $i$ and $x_{-i}\in{\Omega}_{-i}={\mathbb{R}}^{n-1}$ denotes the joint action of all players except player $i$. Throughout this paper, we assume that the cost function $J_{i}(x_{i},x_{-i})$ is continuously differentiable in $x_{i}$ for each fixed $x_{-i}$, $i\in[n]$. Then we define the game mapping as

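$$\mathbf{F}(x)\triangleq[\nabla_{1}J_{1}(x_{1},x_{-1}),\ldots,\nabla_{n}J_{n}(x_{n},x_{-n})]^{T},\tag{1}$$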

where $\nabla_{i}J_{i}(x_{i},x_{-i})=\frac{\partial J_{i}(x_{i},x_{-i})}{\partial x_{i}}$ for all $i\in[n]$. We assume that the players can interact over an undirected communication graph ${\mathcal{G}}([n],{\mathcal{A}})$. The set of nodes is the set of players $[n]$, and the set of undirected arcs ${\mathcal{A}}$ is such that $(i,j)\in{\mathcal{A}}$ if and only if $(j,i)\in{\mathcal{A}}$, i.e. there is an undirected communication link between $i$ and $j$. Thus, some information (message) can be passed from player $i$ to player $j$ and vice versa. For each player $i$ the set ${\mathcal{N}}_{i}$ is the set of neighbors in the graph ${\mathcal{G}}([n],{\mathcal{A}})$, namely ${\mathcal{N}}_{i}\triangleq\{j\in[n]:\,(i,j)\in{\mathcal{A}}\}$. Let us denote the game introduced above by $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$. We make the following assumptions regarding the game $\Gamma$.

Assumption 1.

The game mapping ${\mathbf{F}}(x)$ is strongly monotone on ${\mathbb{R}}^{n}$ with the constant $\mu>0$ .

Note that Assumption 1 above implies strong convexity of each cost function $J_{i}(x_{i},x_{-i})$ in $x_{i}$ for any fixed $x_{-i}$ with the constant $\mu$ .
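One way to see this implication, sketched here as a reading aid rather than quoted from the paper: apply the strong monotonicity inequality to two joint actions $u=(u_{i},x_{-i})$ and $v=(v_{i},x_{-i})$ that differ only in player $i$'s coordinate. All other components of $u-v$ vanish, so only the $i$th term of the inner product survives:

```latex
\langle \mathbf{F}(u)-\mathbf{F}(v),\,u-v\rangle
  = \bigl(\nabla_{i}J_{i}(u_{i},x_{-i})-\nabla_{i}J_{i}(v_{i},x_{-i})\bigr)(u_{i}-v_{i})
  \;\geq\; \mu\,\|u-v\|^{2} = \mu\,(u_{i}-v_{i})^{2}.
```

This is strong monotonicity of the scalar derivative $\nabla_{i}J_{i}(\cdot,x_{-i})$, which for a continuously differentiable function is equivalent to strong convexity in $x_{i}$ with the same constant $\mu$.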

Assumption 2.

Each function $\nabla_{i}J_{i}(\cdot):{\mathbb{R}}^{n}\to{\mathbb{R}}$ , $i\in[n]$ , is Lipschitz continuous on ${\mathbb{R}}^{n}$ , namely for some constant $L_{i}\geq 0$ , we have $\forall\ x,y\in{\mathbb{R}}^{n}$

$$|\nabla_{i}J_{i}(x)-\nabla_{i}J_{i}(y)|\leq L_{i}\|x-y\|.$$

Remark 1.

An example of games satisfying Assumption 2 above is a class of aggregative games [3, 6], where each cost function $J_{i}$ is of the following form:

$$J_{i}(x_{i},x_{-i})=c_{i}(x_{i})-x_{i}U_{i}\Big(\sum_{j=1}^{n}x_{j}\Big).$$

Here $c_{i}(\cdot):{\mathbb{R}}\to{\mathbb{R}}$ is an agent specific function with a Lipschitz continuous derivative and the linear function $U_{i}(\sum_{j=1}^{n}x_{j})$ captures the utility associated with aggregate output $\sum_{j=1}^{n}x_{j}$ .

The two assumptions above are standard in works aiming to demonstrate geometric convergence of algorithms for computing an equilibrium point in variational inequalities and games.

Finally, we make the following assumption on the communication graph, which guarantees sufficient information "mixing" in the network.

Assumption 3.

The underlying undirected communication graph ${\mathcal{G}}([n],{\mathcal{A}})$ is connected. There is a non-negative matrix $W=[w_{ij}]\in{\mathbb{R}}^{n\times n}$ associated with the graph such that $w_{ij}>0$ if and only if $(i,j)\in{\mathcal{A}}$ . Moreover, $W$ is doubly stochastic, i.e. $\sum_{l=1}^{n}w_{lj}=\sum_{l=1}^{n}w_{il}=1$ , $\forall i,j\in[n]$ .

Remark 2.

The weight matrix $W$ from Assumption 3 need not be symmetric. Some simple strategies that generate symmetric mixing matrices for which Assumption 3 holds can be found in Section 2.4 in [13].

Assumption 3 implies that the second largest singular value $\sigma$ of $W$ is such that $\sigma\in(0,1)$ and for any $x\in{\mathbb{R}}^{n}$ the following average property holds (see [5]):

$$\|Wx-\mathbf{1}\bar{x}\|\leq\sigma\|x-\mathbf{1}\bar{x}\|,\tag{2}$$

where $\bar{x}=\frac{1}{n}{\mathbf{1}}^{T}{x}$ is the average of the coordinates of $x$ .
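The averaging property can be checked numerically on a small graph. The sketch below is illustrative only: the Metropolis weighting rule, the 4-player ring, and the helper names `metropolis_weights` and `second_largest_singular_value` are assumptions made for this example and are not prescribed by the paper. It builds a symmetric doubly stochastic $W$, computes $\sigma$, and evaluates both sides of the inequality for a random vector.

```python
import numpy as np

def metropolis_weights(adjacency):
    """Symmetric doubly stochastic weights from a 0/1 adjacency matrix
    without self-loops (Metropolis rule, an assumed choice)."""
    adjacency = np.asarray(adjacency)
    n = adjacency.shape[0]
    degree = adjacency.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                W[i, j] = 1.0 / (1.0 + max(degree[i], degree[j]))
        W[i, i] = 1.0 - W[i].sum()  # remaining mass stays on the node itself
    return W

def second_largest_singular_value(W):
    # Singular values are returned in descending order.
    return np.linalg.svd(W, compute_uv=False)[1]

# Hypothetical example: ring with 4 players.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]])
W = metropolis_weights(ring)
sigma = second_largest_singular_value(W)

x = np.random.default_rng(0).normal(size=4)
x_bar = x.mean()
lhs = np.linalg.norm(W @ x - x_bar)        # ||W x - 1 x_bar||
rhs = sigma * np.linalg.norm(x - x_bar)    # sigma ||x - 1 x_bar||
print(f"sigma = {sigma:.4f}, lhs = {lhs:.4f}, rhs = {rhs:.4f}")
```

On this ring every nonzero weight equals $1/3$, so the deviation from the average shrinks by exactly the factor $\sigma$ in each mixing step.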

One of the stable solutions in any game $\Gamma$ corresponds to a Nash equilibrium defined below.

Definition 1.

A vector $x^{*}=[x_{1}^{*},x_{2}^{*},\cdots,x_{n}^{*}]^{T}\in{\Omega}$ is a Nash equilibrium if for any $i\in[n]$ and $x_{i}\in{\Omega}_{i}$

$$J_{i}(x_{i}^{*},x_{-i}^{*})\leq J_{i}(x_{i},x_{-i}^{*}).$$

In this work, we are interested in distributed seeking of a Nash equilibrium in a game $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ for which Assumptions 1-3 hold. Note that under Assumption 1, the game $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ has a unique Nash equilibrium [8]. Moreover, as ${\Omega}_{i}={\mathbb{R}}$ and $J_{i}(x_{i},x_{-i})$ is strongly convex in $x_{i}$ over ${\mathbb{R}}$ for all $i\in[n]$, the vector $x^{*}\in{\mathbb{R}}^{n}$ is the unique Nash equilibrium if and only if

$$\mathbf{F}(x^{*})=0.\tag{3}$$
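For a quadratic game the stationarity condition reduces to a linear system, which makes it easy to illustrate. The sketch below assumes costs of the form $J_{i}(x)=0.5a_{i}x_{i}^{2}+b_{i}x_{i}+x_{i}\sum_{j\neq i}c_{ij}x_{j}$ with randomly drawn coefficients; the coefficient ranges and all variable names are hypothetical choices for this example. It computes the point where the game mapping vanishes and checks that no player benefits from a unilateral deviation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# Hypothetical quadratic game:
# J_i(x) = 0.5*a_i*x_i^2 + b_i*x_i + x_i * sum_{j != i} c_ij * x_j
a = rng.uniform(1.5, 2.5, size=n)
b = rng.uniform(-1.0, 1.0, size=n)
C = rng.uniform(-0.2, 0.2, size=(n, n))
np.fill_diagonal(C, 0.0)
A = np.diag(a) + C            # the game mapping is affine: F(x) = A x + b

def cost(i, x):
    return 0.5 * a[i] * x[i] ** 2 + b[i] * x[i] + x[i] * (C[i] @ x)

def game_mapping(x):
    return A @ x + b

# Strong monotonicity constant of an affine mapping:
# smallest eigenvalue of the symmetric part of A.
mu = np.linalg.eigvalsh(0.5 * (A + A.T)).min()

x_star = np.linalg.solve(A, -b)   # solves F(x*) = 0

# No player can lower her cost by deviating unilaterally from x*.
for i in range(n):
    for delta in (-0.5, 0.1, 1.0):
        y = x_star.copy()
        y[i] += delta
        assert cost(i, x_star) <= cost(i, y)

print("mu =", mu, "x* =", x_star)
```

The coefficient ranges are chosen so that the diagonal dominates the coupling terms, which keeps $\mu$ positive for every draw.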

III Nash Equilibria Learning in Distributed Settings

To deal with the partial information available to players which is exchanged among them over the communication graph, we assume that each player $i$ maintains a local variable

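$$x_{(i)}=[\tilde{x}_{(i)1},\ldots,\tilde{x}_{(i)i-1},x_{i},\tilde{x}_{(i)i+1},\ldots,\tilde{x}_{(i)n}]^{T}\in{\mathbb{R}}^{n},$$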

which is her estimation of the joint action $x=[x_{1},x_{2},\cdots,x_{n}]^{T}$ . Here ${\tilde{x}}_{(i)j}\in{\mathbb{R}}$ is the player $i$ ’s estimate of $x_{j}$ and ${\tilde{x}}_{(i)i}=x_{i}\in{\Omega}_{i}={\mathbb{R}}$ . Also, we compactly denote the estimates of other players’ actions by the player $i$ as

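$$\tilde{x}_{-i}=[\tilde{x}_{(i)1},\ldots,\tilde{x}_{(i)i-1},\tilde{x}_{(i)i+1},\ldots,\tilde{x}_{(i)n}]^{T}\in{\mathbb{R}}^{n-1},$$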

and the estimates of the player $j$ ’s action $x_{j}$ by all players as

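$$\tilde{x}_{(:)j}=[\tilde{x}_{(1)j},\ldots,\tilde{x}_{(j-1)j},x_{j},\tilde{x}_{(j+1)j},\ldots,\tilde{x}_{(n)j}]^{T}\in{\mathbb{R}}^{n}.$$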

Thus, we can define the estimation matrix ${\mathbf{x}}\in{\mathbb{R}}^{n\times n}$ , where the $i$ th row is equal to the estimation vector $x_{(i)}$ , $i\in[n]$ , namely

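$${\mathbf{x}}\triangleq\begin{pmatrix}\textrm{---}&x_{(1)}^{T}&\textrm{---}\\ \textrm{---}&x_{(2)}^{T}&\textrm{---}\\ &\vdots&\\ \textrm{---}&x_{(n)}^{T}&\textrm{---}\end{pmatrix}.$$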

For any given estimation matrix, we define the diagonal matrix ${\tilde{\mathbf{F}}}({\mathbf{x}})\in{\mathbb{R}}^{n\times n}$ with ${\tilde{\mathbf{F}}}({\mathbf{x}})_{ii}=\nabla_{i}J_{i}(x_{(i)})$ , $i\in[n]$ , namely

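$${\tilde{\mathbf{F}}}({\mathbf{x}})\triangleq{\mathrm{Diag}}(\nabla_{1}J_{1}(x_{(1)}),\ldots,\nabla_{n}J_{n}(x_{(n)})).$$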

We propose the following distributed gradient play procedure for learning a Nash equilibrium in the game $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ . According to this algorithm, each player $i$ updates its local estimation of the joint action as follows:

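$$\begin{aligned}x_{i}^{t+1}&=\sum_{j=1}^{n}w_{ij}\tilde{x}_{(j)i}^{t}-\alpha\nabla_{x_{i}}J_{i}(x_{(i)}^{t}),\\ \tilde{x}_{(i)l}^{t+1}&=\sum_{j=1}^{n}w_{ij}\tilde{x}_{(j)l}^{t},\quad\text{for }l\neq i,\ i\in[n].\end{aligned}$$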

Thus, to get a new estimation $x_{(i)}^{t+1}$, each agent $i$ aggregates the current estimations of its neighbors over the communication graph and makes a local gradient step with step size $\alpha$ with respect to the gradient of its cost function $\nabla_{x_{i}}J_{i}(x_{(i)}^{t})$ calculated at the current local estimation $x_{(i)}^{t}$. The local updates above can be represented in the following compact form:

$${\mathbf{x}}^{t+1}=W{\mathbf{x}}^{t}-\alpha{\tilde{\mathbf{F}}}({\mathbf{x}}^{t}),\tag{7}$$

where $\alpha$ is a constant step size to be set up.
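The procedure described above, mixing the estimation matrix with $W$ and then subtracting a diagonal gradient correction, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions and not the authors' implementation: the quadratic costs, the 4-player ring with uniform weights $1/3$, the step size $0.05$, the iteration count, and the function name `distributed_gradient_play` are all choices made for this example.

```python
import numpy as np

def distributed_gradient_play(W, partial_grads, alpha, num_iters):
    """Iterate X <- W X - alpha * Diag(g), with g[i] = grad_i J_i(row i of X)."""
    n = W.shape[0]
    X = np.zeros((n, n))   # row i: player i's estimate of the joint action
    for _ in range(num_iters):
        g = np.array([partial_grads[i](X[i]) for i in range(n)])
        X = W @ X - alpha * np.diag(g)
    return X

# Hypothetical quadratic game: grad_i J_i(x) = (A x + b)_i.
rng = np.random.default_rng(0)
n = 4
a = rng.uniform(1.5, 2.5, size=n)
b = rng.uniform(-1.0, 1.0, size=n)
C = rng.uniform(-0.2, 0.2, size=(n, n))
np.fill_diagonal(C, 0.0)
A = np.diag(a) + C
partial_grads = [lambda x, i=i: A[i] @ x + b[i] for i in range(n)]

# Assumed communication graph: ring of 4 players, all nonzero weights 1/3.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]], dtype=float)
W = (np.eye(n) + ring) / 3.0

X = distributed_gradient_play(W, partial_grads, alpha=0.05, num_iters=10000)
x_star = np.linalg.solve(A, -b)          # Nash equilibrium of the quadratic game
max_error = np.abs(X - x_star).max()     # every row should approach x*
print("max deviation from x*:", max_error)
```

Each player only evaluates her own partial gradient at her own row, so the loop body uses exactly the information available locally plus the rows received from neighbors through $W$.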

In the following we prove geometric convergence of the procedure above to the unique Nash equilibrium in the game $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ under Assumptions 1-3 and an appropriate choice of $\alpha$ . This result is formulated in the following theorem.

Theorem 1.

Let $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ be a game for which Assumptions 1-3 hold. Let $\mu$ and $\sigma$ be as defined in Assumption 1 and relation (2), respectively, and the step size parameter $\alpha$ be chosen as follows:

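$$0<\alpha<\min\left\{1,\ \frac{\mu}{2L^{2}},\ \frac{\sigma}{2L}\frac{\sqrt{n}}{\sqrt{n-1}}\left(\sqrt{\frac{2}{1+\sigma^{2}}}-1\right),\ \frac{n}{\mu}\left(\frac{8}{(\sqrt{1+\sigma^{2}}-\sqrt{2})^{2}}-1\right),\ \frac{\sqrt{n^{2}+\frac{2\mu^{4}(1-\sigma^{2})}{(n-1)L^{4}(1+\sigma^{2})}}-n}{2\mu}\right\},$$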

where $L$ is the Lipschitz constant defined in Assumption 2. Then, the algorithm (7) converges to the consensual matrix ${\mathbf{x}}^{*}$ whose rows are equal to the row-vector $x^{*}$ which is the unique Nash equilibrium in the game $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ . Moreover,

$$\|{\mathbf{x}}^{t}-{\mathbf{x}}^{*}\|_{{\mathrm{Fro}}}^{2}\leq O(q^{t})$$

for some $q=q(\alpha)\in(0,1)$ .
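To make the meaning of a geometric rate concrete, an average contraction factor can be estimated empirically from the squared Frobenius error. The sketch below is only an illustration: the quadratic game, the ring graph, the step size $0.05$, and the estimator `q_hat` are assumptions of this example, and `q_hat` is an observed average factor, not the constant $q(\alpha)$ from the theorem.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Hypothetical quadratic game with grad_i J_i(x) = (A x + b)_i.
a = rng.uniform(1.5, 2.5, size=n)
b = rng.uniform(-1.0, 1.0, size=n)
C = rng.uniform(-0.2, 0.2, size=(n, n))
np.fill_diagonal(C, 0.0)
A = np.diag(a) + C
x_star = np.linalg.solve(A, -b)

# Assumed ring graph with uniform weights 1/3.
ring = np.array([[0, 1, 0, 1],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [1, 0, 1, 0]], dtype=float)
W = (np.eye(n) + ring) / 3.0

alpha, T = 0.05, 1000
X = np.zeros((n, n))
errors = []
for _ in range(T + 1):
    errors.append(np.sum((X - x_star) ** 2))   # squared Frobenius distance to x*
    g = np.einsum("ij,ij->i", A, X) + b        # g[i] = grad_i J_i(row i of X)
    X = W @ X - alpha * np.diag(g)

# Average per-iteration contraction of the squared error over T steps.
q_hat = (errors[-1] / errors[0]) ** (1.0 / T)
print("empirical average contraction factor:", q_hat)
```

A value of `q_hat` below one indicates that the squared error shrank, on average, by a constant factor per iteration over the observed window.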

In the next section we provide the proof of the main result formulated in Theorem 1 above.

IV Proof of Main Result

Let ${\bar{\mathbf{x}}}^{t}$ be a consensual matrix with the rows equal to the vector $\bar{x}^{t}=\frac{1}{n}\sum_{i=1}^{n}x^{t}_{(i)}$ . This matrix corresponds to the running average of the players’ estimations of the current joint action. Under Assumption 3, as seen from the definition of the algorithm in (7), the average $\bar{x}^{t}$ evolves according to the following relation:

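$${\bar{\mathbf{x}}}^{t+1}={\bar{\mathbf{x}}}^{t}-\frac{\alpha}{n}{\mathbf{F}}^{0}({\mathbf{x}}^{t}),\tag{8}$$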

where ${\mathbf{F}}^{0}(\cdot)$ is the consensual matrix with the rows equal to ${\mathrm{diag}}({\tilde{\mathbf{F}}}(\cdot))$ .
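The step behind this relation can be spelled out as follows. This is a sketch of the omitted computation, written for the column vector $\bar{x}^{t}=\frac{1}{n}({\mathbf{x}}^{t})^{T}{\mathbf{1}}$; it uses only that the columns of $W$ sum to one (so $W^{T}{\mathbf{1}}={\mathbf{1}}$) and that ${\tilde{\mathbf{F}}}$ is diagonal.

```latex
\begin{align*}
\bar{x}^{t+1}
  &= \tfrac{1}{n}(\mathbf{x}^{t+1})^{T}\mathbf{1}
   = \tfrac{1}{n}(\mathbf{x}^{t})^{T}W^{T}\mathbf{1}
     -\tfrac{\alpha}{n}\,\tilde{\mathbf{F}}(\mathbf{x}^{t})^{T}\mathbf{1} \\
  &= \tfrac{1}{n}(\mathbf{x}^{t})^{T}\mathbf{1}
     -\tfrac{\alpha}{n}\,\mathrm{diag}\bigl(\tilde{\mathbf{F}}(\mathbf{x}^{t})\bigr)
   = \bar{x}^{t}-\tfrac{\alpha}{n}\,\mathrm{diag}\bigl(\tilde{\mathbf{F}}(\mathbf{x}^{t})\bigr).
\end{align*}
```

Stacking $n$ copies of this vector identity as rows gives the matrix form with the consensual matrix ${\mathbf{F}}^{0}$.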

To prove Theorem 1, we start by proving some lemmata. First of all, we estimate the consensus violation term with respect to the running average, namely $\|{\mathbf{x}}^{t+1}-{\bar{\mathbf{x}}}^{t+1}\|_{{\mathrm{Fro}}}$.

Lemma 1.

Under Assumption 3 the following holds for the procedure (7):

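$$\|{\mathbf{x}}^{t+1}-{\bar{\mathbf{x}}}^{t+1}\|_{{\mathrm{Fro}}}\leq\sigma\|{\mathbf{x}}^{t}-{\bar{\mathbf{x}}}^{t}\|_{{\mathrm{Fro}}}+\alpha\frac{\sqrt{n-1}}{\sqrt{n}}\|{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})\|_{{\mathrm{Fro}}}.$$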

Proof.

Taking into account (2) and (8), we conclude that

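$$\begin{aligned}\|{\mathbf{x}}^{t+1}-{\bar{\mathbf{x}}}^{t+1}\|_{{\mathrm{Fro}}}&=\Big\|W{\mathbf{x}}^{t}-\alpha{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})-{\bar{\mathbf{x}}}^{t}+\frac{\alpha}{n}{\mathbf{F}}^{0}({\mathbf{x}}^{t})\Big\|_{{\mathrm{Fro}}}\\&\leq\sigma\|{\mathbf{x}}^{t}-{\bar{\mathbf{x}}}^{t}\|_{{\mathrm{Fro}}}+\frac{\alpha}{n}\|n{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})-{\mathbf{F}}^{0}({\mathbf{x}}^{t})\|_{{\mathrm{Fro}}}.\end{aligned}$$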

Next, note that

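$$n{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})-{\mathbf{F}}^{0}({\mathbf{x}}^{t})=\begin{pmatrix}(n-1)\nabla_{1}J_{1}(x_{(1)}^{t})&\cdots&-\nabla_{n}J_{n}(x_{(n)}^{t})\\ -\nabla_{1}J_{1}(x_{(1)}^{t})&\cdots&-\nabla_{n}J_{n}(x_{(n)}^{t})\\ \vdots&\ddots&\vdots\\ -\nabla_{1}J_{1}(x_{(1)}^{t})&\cdots&(n-1)\nabla_{n}J_{n}(x_{(n)}^{t})\end{pmatrix}$$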

and, hence,

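$$\|n{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})-{\mathbf{F}}^{0}({\mathbf{x}}^{t})\|_{{\mathrm{Fro}}}=\sqrt{n(n-1)}\,\|{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})\|_{{\mathrm{Fro}}},$$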

thus yielding the stated result.
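The norm identity used in this proof, with the factor $\sqrt{n(n-1)}$ that leads to the $\frac{\sqrt{n-1}}{\sqrt{n}}$ appearing later in the proof of Theorem 1, can be confirmed numerically. In the sketch below the vector `g` is a random stand-in for the partial gradients and is purely hypothetical.

```python
import numpy as np

n = 5
g = np.random.default_rng(2).normal(size=n)   # stand-ins for grad_i J_i(x_(i))
F_tilde = np.diag(g)                          # diagonal matrix of partial gradients
F0 = np.tile(g, (n, 1))                       # consensual matrix: every row equals diag(F_tilde)

lhs = np.linalg.norm(n * F_tilde - F0, "fro")
rhs = np.sqrt(n * (n - 1)) * np.linalg.norm(F_tilde, "fro")
print(lhs, rhs)
```

Column $j$ of the difference contains one entry $(n-1)g_{j}$ and $n-1$ entries $-g_{j}$, so its squared norm is $n(n-1)g_{j}^{2}$, which is where the factor comes from.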

Next, to estimate the distance $\|{\mathbf{x}}^{t+1}-{\bar{\mathbf{x}}}^{t+1}\|_{{\mathrm{Fro}}}$ in terms of optimum violation $\|{\mathbf{x}}^{t}-{\mathbf{x}}^{*}\|_{{\mathrm{Fro}}}$ , we upper bound $\|{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})\|_{{\mathrm{Fro}}}$ in the following lemma.

Lemma 2.

Let Assumption 2 hold in the game $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ . Then

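$$\|{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})\|_{{\mathrm{Fro}}}\leq L\|{\mathbf{x}}^{t}-{\mathbf{x}}^{*}\|_{{\mathrm{Fro}}},$$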

where $L=\max_{i}L_{i}$ and $L_{i}$ , $i\in[n]$ , are the Lipschitz constants from Assumption 2.

Proof.

Due to Assumption 2 and the fact that ${\mathbf{F}}({\mathbf{x}}^{*})=\bm{0}$ (see (3)),

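$$\begin{aligned}\|{\tilde{\mathbf{F}}}({\mathbf{x}}^{t})\|_{{\mathrm{Fro}}}^{2}&=\sum_{i=1}^{n}\big(\nabla_{i}J_{i}(x_{(i)}^{t})-\nabla_{i}J_{i}(x^{*})\big)^{2}\\&\leq\sum_{i=1}^{n}L_{i}^{2}\|x_{(i)}^{t}-x^{*}\|^{2}\leq L^{2}\sum_{i=1}^{n}\|x_{(i)}^{t}-x^{*}\|^{2}=L^{2}\|{\mathbf{x}}^{t}-{\mathbf{x}}^{*}\|_{{\mathrm{Fro}}}^{2}.\end{aligned}$$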

Finally, we analyze the distance between the running average and the Nash equilibrium.

Lemma 3.

Let Assumptions 1-3 hold in the game $\Gamma(n,\{J_{i}\},\{{\Omega}_{i}={\mathbb{R}}\},{\mathcal{G}})$ . Then for any $\theta>0$ and the step size $\alpha\leq\frac{\theta}{L^{2}}$ the following inequality holds:

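$$\Big(1+\frac{2\alpha}{n}\Big(\mu-\frac{\theta}{2}\Big)\Big)\|{\bar{\mathbf{x}}}^{t+1}-{\mathbf{x}}^{*}\|_{{\mathrm{Fro}}}^{2}\leq\|{\bar{\mathbf{x}}}^{t}-{\mathbf{x}}^{*}\|_{{\mathrm{Fro}}}^{2}+\frac{L^{2}\alpha}{\theta}\|{\mathbf{x}}^{t}-{\bar{\mathbf{x}}}^{t}\|_{{\mathrm{Fro}}}^{2}.$$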

Proof.

Let $\hat{{\mathbf{F}}}({\mathbf{x}}^{t})=(\nabla_{1}J_{1}(x_{(1)}^{t}),\ldots,\nabla_{n}J_{n}(x_{(n)}^{t}))^{T}\in{\mathbb{R}}^{n}$. Using the equality $\bar{x}^{t+1}=\bar{x}^{t}-\frac{\alpha}{n}\hat{{\mathbf{F}}}({\mathbf{x}}^{t})$ (see (8)) and the basic identity $\|a\|^{2}=\|a+b\|^{2}-2\langle a,b\rangle-\|b\|^{2}$ for $a=\bar{x}^{t+1}-x^{*}$ and $b=\bar{x}^{t}-\bar{x}^{t+1}$, we obtain

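$$\begin{aligned}\|\bar{x}^{t+1}-x^{*}\|^{2}&=\|\bar{x}^{t}-x^{*}\|^{2}-2\langle\bar{x}^{t+1}-x^{*},\bar{x}^{t}-\bar{x}^{t+1}\rangle-\|\bar{x}^{t}-\bar{x}^{t+1}\|^{2}\\&=\|\bar{x}^{t}-x^{*}\|^{2}-\frac{2\alpha}{n}\langle\bar{x}^{t+1}-x^{*},\hat{{\mathbf{F}}}({\mathbf{x}}^{t})\rangle-\|\bar{x}^{t}-\bar{x}^{t+1}\|^{2}.\end{aligned}$$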

We proceed with estimating the term $\langle\bar{x}^{t+1}-x^{*},\hat{{\mathbf{F}}}({\mathbf{x}}^{t})\rangle$ .

[TABLE]

where in the first equality we used the fact that ${\mathbf{F}}(x^{*})=\bm{0}$ (see (3)) and in the last inequality we used Assumption 1. Next, for any $\theta>0$ we obtain

[TABLE]

Bringing (13) and (17) into (9) we conclude that

[TABLE]

Further, taking into account that $\sum_{i=1}^{n}\|x_{(i)}^{t}-\bar{x}^{t+1}\|^{2}=\sum_{i=1}^{n}\|x_{(i)}^{t}-\bar{x}^{t}\|^{2}+n\|\bar{x}^{t}-\bar{x}^{t+1}\|^{2}$ , we see that

[TABLE]

Next, taking into account that $\alpha<\frac{\theta}{L^{2}}$, $\sum_{i=1}^{n}\|x_{(i)}^{t}-\bar{x}^{t}\|^{2}=\|{\mathbf{x}}^{t}-{\bar{\mathbf{x}}}^{t}\|^{2}_{{\mathrm{Fro}}}$, and that for any consensual matrices ${\mathbf{x}}\in{\mathbb{R}}^{n\times n}$, ${\mathbf{y}}\in{\mathbb{R}}^{n\times n}$ with the vectors $x,y\in{\mathbb{R}}^{n}$ as their rows respectively we have $\|x-y\|^{2}=\frac{1}{n}\|{\mathbf{x}}-{\mathbf{y}}\|^{2}_{{\mathrm{Fro}}}$, we get from (21)

[TABLE]

Having these three lemmata in place, we are ready to prove the main result.

Proof of Theorem 1.

Taking into account Lemma 1 and Lemma 2, we conclude that, under conditions of the theorem, we have

[TABLE]

The inequalities above together with

[TABLE]

imply that

[TABLE]

Next, we apply to (23) the standard inequality $(a+b)^{2}\leq(1+\beta)a^{2}+\frac{1+\beta}{\beta}b^{2}$ , which holds for any real numbers $a,b\in{\mathbb{R}}$ and any $\beta>0$ . By taking $a=(\sigma+\alpha\frac{\sqrt{n-1}}{\sqrt{n}}L)\|{\mathbf{x}}^{t}-{\bar{\mathbf{x}}}^{t}\|_{{\mathrm{Fro}}}$ and $b=\alpha\frac{\sqrt{n-1}}{\sqrt{n}}L\|{\bar{\mathbf{x}}}^{t}-{\mathbf{x}}^{*}\|_{{\mathrm{Fro}}}$ , we get

[TABLE]
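For completeness, the standard inequality $(a+b)^{2}\leq(1+\beta)a^{2}+\frac{1+\beta}{\beta}b^{2}$ invoked in this step follows from a one-line Young-type estimate. The derivation below is a reading aid and not part of the original proof:

```latex
\begin{align*}
0 \leq \Bigl(\sqrt{\beta}\,a-\tfrac{1}{\sqrt{\beta}}\,b\Bigr)^{2}
  \quad&\Longrightarrow\quad 2ab \leq \beta a^{2}+\tfrac{1}{\beta}b^{2},\\
(a+b)^{2} = a^{2}+2ab+b^{2}
  &\leq (1+\beta)\,a^{2}+\Bigl(1+\tfrac{1}{\beta}\Bigr)b^{2}
   = (1+\beta)\,a^{2}+\tfrac{1+\beta}{\beta}\,b^{2}.
\end{align*}
```

The free parameter $\beta>0$ trades the weight on $a^{2}$ against the weight on $b^{2}$, which is why it can be tuned later in the proof.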

Moreover, Lemma 3 with the choice $\theta=\mu$ implies

[TABLE]

where $\gamma=\frac{1}{1+\frac{\mu\alpha}{n}}$ . Let ${\mathbf{z}}^{t}=(\|{\bar{\mathbf{x}}}^{t}-{\mathbf{x}}^{*}\|^{2}_{{\mathrm{Fro}}},\|{\mathbf{x}}^{t}-{\bar{\mathbf{x}}}^{t}\|^{2}_{{\mathrm{Fro}}})$ . Then taking (27) and (31) into account, we conclude that

[TABLE]

where

[TABLE]

We proceed with analysis of the properties of the positive matrix $Z=Z(\alpha,\mu,\beta)$ . First, we calculate its eigenvalues. Its characteristic polynomial is

[TABLE]

We need to solve $p_{Z}(\lambda)=0$ , namely

[TABLE]

This results in

[TABLE]

where

[TABLE]

We let $\lambda_{1}$ denote the largest (positive) eigenvalue. Borrowing the idea from the proof of Lemma 7 and Lemma 17 in [7], we notice that, due to properties of the diagonalization, any element of the matrix $Z^{t}$ is in the form $z^{ij}_{t}=z_{t}=c_{1}\lambda_{1}^{t}+c_{2}\lambda_{2}^{t}$ , $i,j=1,2$ , for some $c_{1},c_{2}\in\mathbb{C}$ (we omit the element upper index in $z_{t}^{ij}$ to simplify notations). To estimate $c_{1},c_{2}$ for each element, we construct the following system of linear equalities:

[TABLE]

where $z_{0}$ and $z_{1}$ are the corresponding elements of the matrices $Z^{0}$ and $Z$ respectively. The solution of the system above is

[TABLE]

Thus,

[TABLE]

where in the last two inequalities we used Perron-Frobenius Theorem for positive matrices, namely $\lambda_{1}>|\lambda_{2}|$ (see Theorem 8.2.11 in [2]). As

[TABLE]

and by taking into account (32) and (39), we conclude that

[TABLE]

In (42) $z_{1}^{ij}$ is the $ij$ th element of the matrix $Z^{1}=Z$ . Thus, to get the result it suffices to demonstrate that $\lambda_{1}<1$ . Recall that

[TABLE]

where

[TABLE]

Let us now fix $\beta=\frac{1}{2}\left(\frac{1}{\sigma^{2}}-1\right)$ . As

[TABLE]

we conclude that

[TABLE]

As $\alpha<\frac{n}{\mu}\left(\frac{8}{(\sqrt{1+\sigma^{2}}-\sqrt{2})^{2}}-1\right)=\frac{n}{\mu}\left(\frac{4}{(\sigma\sqrt{1+\beta}-1)^{2}}-1\right)$ , we conclude that

[TABLE]

Bringing (47) and (50) together, we obtain

[TABLE]

and, thus, from (46)

[TABLE]

Next, taking into account that $\alpha<\frac{\sqrt{n^{2}+\frac{2\mu^{4}(1-\sigma^{2})}{(n-1)L^{4}(1+\sigma^{2})}}-n}{2\mu}$ and (51), we conclude that (more details can be found in the Appendix)

[TABLE]

Finally, according to (42),

[TABLE]

where

[TABLE]

where

[TABLE]

V Comparison with the convergence rate of the GRANE

In this section we compare the convergence rate of the algorithm (7) analyzed in this paper and the convergence rate of the GRANE procedure studied in [15], given some large number of players $n$ .

According to Theorem 9 in [15], under Assumptions 1-3 made above, the GRANE converges to the Nash equilibrium with the rate $O\left(\left(1-\frac{1}{\gamma_{r}^{2}}\right)^{t}\right)$, where $\gamma_{r}=\frac{L_{{\mathbf{F}}_{a}}}{\mu_{r,{\mathbf{F}}_{a}}}>1$ and the constants $L_{{\mathbf{F}}_{a}},\mu_{r,{\mathbf{F}}_{a}}$ are defined in Lemmas 1 and 3 respectively. (The constant $L_{{\mathbf{F}}}=\max_{i}\{\sqrt{L_{i}^{2}+L_{-i}^{2}}\}$ defined in Lemma 3 in [15] corresponds to the constant $L=\max_{i}L_{i}$, where the $L_{i}$ are defined in Assumption 2 in this paper.) After substituting the expressions for $L_{{\mathbf{F}}_{a}},\mu_{r,{\mathbf{F}}_{a}}$ into $\gamma_{r}$, we conclude that for a sufficiently large $n$

[TABLE]

Next, according to Remark 4 in [15],

[TABLE]

Thus, given the optimal choice of $\alpha^{0}$ , we get

[TABLE]

Thus, the convergence rate of the GRANE is

[TABLE]

Now we proceed with the convergence rate estimation of the algorithm (7). According to the proof of Theorem 1, the convergence rate of the distributed procedure is $O(q(\alpha)^{t})$ , where (see (51))

[TABLE]

The constant $q$ above is less than $1$ , if

[TABLE]

Thus, taking into account two inequalities above, we conclude that for a sufficiently large $n$

[TABLE]

Next, let us notice that, under Assumption 2, the game mapping ${\mathbf{F}}$ defined in (1) is Lipschitz continuous with the constant $L^{{\mathbf{F}}}=L\sqrt{n}$ . Indeed, due to Assumption 2,

[TABLE]
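The estimate behind the constant $L^{{\mathbf{F}}}=L\sqrt{n}$ can be reconstructed along the following lines. This is a sketch consistent with Assumption 2 and $L=\max_{i}L_{i}$, not a verbatim copy of the omitted display:

```latex
\|\mathbf{F}(x)-\mathbf{F}(y)\|^{2}
  = \sum_{i=1}^{n}\bigl|\nabla_{i}J_{i}(x)-\nabla_{i}J_{i}(y)\bigr|^{2}
  \leq \sum_{i=1}^{n}L_{i}^{2}\,\|x-y\|^{2}
  \leq n\,L^{2}\,\|x-y\|^{2}.
```

Taking square roots gives $\|{\mathbf{F}}(x)-{\mathbf{F}}(y)\|\leq L\sqrt{n}\,\|x-y\|$.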

Thus, the condition number of the mapping ${\mathbf{F}}$ is

[TABLE]

By comparing (55) and (56) and taking into account (59), we conclude that the convergence rate of the proposed algorithm (7) is faster than that of the GRANE presented in [15].

VI Simulation

Let us consider a class of games with strongly monotone game mappings. Specifically, we have players $\{1,2,\ldots,20\}$ and each player $i$ ’s objective is to minimize the cost function $J_{i}(x_{i},x_{-i})=f_{i}(x_{i})+l_{i}(x_{-i})x_{i}$ , where $f_{i}(x_{i})=0.5a_{i}x_{i}^{2}+b_{i}x_{i}$ and $l_{i}(x_{-i})=\sum_{j\neq i}c_{ij}x_{j}$ . The local cost function is in general dependent on actions of all players, but the underlying communication graph is a randomly generated tree graph. We randomly select $a_{i}$ , $b_{i}$ , and $c_{ij}$ for all possible $i$ and $j$ .

We simulate the proposed gradient play algorithm and compare its performance with that of the GRANE algorithm presented in [15] (see Figure 1). The GRANE is based on a so-called augmented game mapping, for which an additional parameter has to be chosen to guarantee specific properties of this mapping and, thus, convergence of the procedure. Note that the GRANE is very sensitive to the setting of this parameter. We chose this parameter based on the theoretical results in [15]. For the gradient play we chose the step size parameter $\alpha$ based on Theorem 1 (in the presented simulation $\alpha=0.05$). As we can see, the gradient play outperforms the GRANE.
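A set-up of this kind can be generated as in the sketch below. The paper does not specify how the tree or the coefficients were drawn, so everything here is an assumption made for illustration: the random recursive tree, the Metropolis weights, the coefficient ranges (chosen so that the diagonal dominates and the game mapping is strongly monotone), and all variable names. The sketch only constructs the instance and its constants $\sigma$, $\mu$, and $L$; it does not reproduce the reported comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20

# Random tree (assumed construction): node i > 0 attaches to a random earlier node.
adjacency = np.zeros((n, n), dtype=int)
for i in range(1, n):
    parent = rng.integers(0, i)
    adjacency[i, parent] = adjacency[parent, i] = 1

# Metropolis weights (assumed choice) give a symmetric doubly stochastic W.
degree = adjacency.sum(axis=1)
W = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if adjacency[i, j]:
            W[i, j] = 1.0 / (1.0 + max(degree[i], degree[j]))
    W[i, i] = 1.0 - W[i].sum()
sigma = np.linalg.svd(W, compute_uv=False)[1]   # second largest singular value

# Costs J_i(x) = 0.5*a_i*x_i^2 + b_i*x_i + x_i * sum_{j != i} c_ij x_j,
# so that grad_i J_i(x) = (A x + b)_i with A = Diag(a) + C.
a = rng.uniform(2.0, 3.0, size=n)
b = rng.uniform(-1.0, 1.0, size=n)
C = rng.uniform(-0.05, 0.05, size=(n, n))
np.fill_diagonal(C, 0.0)
A = np.diag(a) + C

mu = np.linalg.eigvalsh(0.5 * (A + A.T)).min()  # strong monotonicity constant
L = np.linalg.norm(A, axis=1).max()             # max_i L_i: largest row norm of A
print(f"sigma = {sigma:.4f}, mu = {mu:.4f}, L = {L:.4f}")
```

On sparse graphs such as trees, $\sigma$ is typically close to one, which is the regime where the choice of the step size matters most.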

VII Conclusion

In this paper, we have presented a distributed gradient play which provably converges to a Nash equilibrium in strongly convex games with unconstrained action sets. In comparison to the GRANE algorithm [15], which possesses a geometric convergence rate as well, the proposed algorithm requires only one parameter (the step size) to be appropriately chosen. Moreover, its convergence rate is shown to be faster for any fixed game under consideration. Future work can be devoted to investigating the convergence rate of the projected gradient play in the case of bounded closed action sets of the agents.

Bibliography

[1] T. Alpcan and T. Başar. Distributed Algorithms for Nash Equilibria of Flow Control Games. In Advances in Dynamic Games, pages 473–498. Springer, 2005.
[2] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, New York, NY, USA, 2nd edition, 2012.
[3] J. Koshal, A. Nedić, and U. Shanbhag. A Gossip Algorithm for Aggregative Games on Graphs. In Decision and Control (CDC), 2012 IEEE 51st Annual Conference on, pages 4840–4845. IEEE, 2012.
[4] Yu. Nesterov and L. Scrimali. Solving strongly monotone variational and quasi-variational inequalities. Discrete and Continuous Dynamical Systems - A, 31(4):1383–1396.
[5] A. Olshevsky and J. Tsitsiklis. Convergence speed in distributed consensus and averaging. SIAM Journal on Control and Optimization, 48(1):33–55, 2009.
[6] D. Paccagnan, M. Kamgarpour, and J. Lygeros. On aggregative and mean field games with applications to electricity markets. In 2016 European Control Conference (ECC), pages 196–201, June 2016.
[7] G. Qu and N. Li. Accelerated distributed Nesterov gradient descent. CoRR, https://arxiv.org/abs/1705.07176, 2018.
[8] J. B. Rosen. Existence and uniqueness of equilibrium points for concave n-person games. Econometrica, 33(3):520–534, 1965.
