Distributed generalized Nash equilibrium seeking in aggregative games on time-varying networks

Giuseppe Belgioioso; Angelia Nedić; Sergio Grammatico

arXiv:1907.00191·math.OC·June 16, 2022·IEEE Trans. Autom. Control.

TL;DR

This paper introduces a fully-distributed algorithm for finding generalized Nash equilibria in aggregative games over dynamic networks, enabling agents to reach equilibrium without direct access to the aggregate decision.

Contribution

It presents the first fully-distributed algorithm for generalized Nash equilibrium seeking in aggregative games on time-varying networks with partial information.

Findings

01. Convergence is established within the monotone operator splitting framework.

02. The algorithm works on time-varying communication networks.

03. The algorithm handles partial-decision information scenarios.

Abstract

We design the first fully-distributed algorithm for generalized Nash equilibrium seeking in aggregative games on a time-varying communication network, under partial-decision information, i.e., the agents have no direct access to the aggregate decision. The algorithm is derived by integrating dynamic tracking into a projected pseudo-gradient algorithm. The convergence analysis relies on the framework of monotone operator splitting and the Krasnosel'skii-Mann fixed-point iteration with errors.

Equations

\bar{x} := \frac{1}{N} \sum_{j=1}^{N} x_{j}.

\left\{\begin{array}{cl}
\underset{x_{i}\in\mathbb{R}^{n}}{\operatorname{argmin}} & J_{i}\big(x_{i},\tfrac{1}{N}x_{i}+\tfrac{1}{N}\sum_{j\neq i}x_{j}\big)\\
\text{s.t.} & x_{i}\in\Omega_{i}\\
 & C_{i}x_{i}-c_{i}\leq\sum_{j\neq i}^{N}(c_{j}-C_{j}x_{j})
\end{array}\right.

J_{i}(x_{i}^{*},\bar{x}^{*}) \leq J_{i}\big(z,\tfrac{1}{N}z+\tfrac{1}{N}\sum_{j\neq i}^{N}x_{j}^{*}\big), \quad \forall z \text{ s.t. } (z,\boldsymbol{x}_{-i}^{*})\in K.

w_{i,j}(k) = \begin{cases}
\big(\max\{|\mathcal{N}_{i}(k)|,|\mathcal{N}_{j}(k)|\}\big)^{-1} & \text{if } (i,j)\in\mathcal{E}_{k},\\
0 & \text{if } (i,j)\notin\mathcal{E}_{k},\\
1-\sum_{\ell\in\mathcal{N}_{i}(k)}w_{i,\ell}(k) & \text{if } i=j.
\end{cases}

\Psi(k,s) = W(k)W(k-1)\cdots W(s+1)W(s),

\rho := \Big(1-\frac{\epsilon}{4N^{2}}\Big)^{1/Q} \in (0,1),

L_{i}(\boldsymbol{x},\lambda_{i}) := J_{i}(x_{i},\bar{x}) + \iota_{\Omega_{i}}(x_{i}) + \lambda_{i}^{\top}(C\boldsymbol{x}-c),

\forall i\in\mathcal{I}: \left\{\begin{array}{l}
\mathbf{0}\in\nabla_{x_{i}}J_{i}(x_{i}^{*},\bar{x}^{*})+\mathrm{N}_{\Omega_{i}}(x_{i}^{*})+C_{i}^{\top}\lambda_{i}^{*},\\
\mathbf{0}\leq\lambda_{i}^{*}\perp-(C\boldsymbol{x}^{*}-c)\geq\mathbf{0}.
\end{array}\right.

U: \begin{bmatrix}\boldsymbol{x}\\ \lambda\end{bmatrix} \mapsto \begin{bmatrix}\mathrm{N}_{\boldsymbol{\Omega}}(\boldsymbol{x})+F(\boldsymbol{x})+C^{\top}\lambda\\ \mathrm{N}_{\mathbb{R}^{m}_{\geq 0}}(\lambda)-(C\boldsymbol{x}-c)\end{bmatrix},

F(\boldsymbol{x}) = \operatorname{col}\big(\nabla_{x_{1}}J_{1}(x_{1},\bar{x}),\ldots,\nabla_{x_{N}}J_{N}(x_{N},\bar{x})\big).

F_{i}(v,w) := \Big(\frac{\partial}{\partial z_{1}}J_{i}(z_{1},z_{2})+\frac{1}{N}\frac{\partial}{\partial z_{2}}J_{i}(z_{1},z_{2})\Big)\Big|_{z_{1}=v,\,z_{2}=w},

\boldsymbol{F}(\boldsymbol{v},\boldsymbol{w}) := \operatorname{col}\big(F_{1}(v_{1},w_{1}),\ldots,F_{N}(v_{N},w_{N})\big),

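\|\boldsymbol{F}(\boldsymbol{v},\boldsymbol{w})-\boldsymbol{F}(\boldsymbol{u},\boldsymbol{z})\| \leq L_{F}\left\|\begin{bmatrix}\boldsymbol{v}\\ \boldsymbol{w}\end{bmatrix}-\begin{bmatrix}\boldsymbol{u}\\ \boldsymbol{z}\end{bmatrix}\right\|.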

\text{find } \omega^{*}=\operatorname{col}(\boldsymbol{x}^{*},\lambda^{*}) \text{ s.t. } \mathbf{0}\in U(\omega^{*}).

\begin{array}{l}
\text{In parallel, for all } i\in\mathcal{I}:\\
\quad\left|\begin{array}{l}
x_{i}^{k+1}=\mathrm{proj}_{\Omega_{i}}\big(x_{i}^{k}-\alpha_{i}(F_{i}(x_{i}^{k},\bar{x}^{k})+C_{i}^{\top}\lambda^{k})\big)\\
d_{i}^{k+1}=2C_{i}x_{i}^{k+1}-C_{i}x_{i}^{k}-c_{i}
\end{array}\right.\\
\text{Central coordinator:}\\
\quad\left|\;\lambda^{k+1}=\mathrm{proj}_{\mathbb{R}^{m}_{\geq 0}}\big(\lambda^{k}+\beta N\bar{d}^{\,k+1}\big)\right.
\end{array}

T: \begin{bmatrix}\boldsymbol{x}\\ \boldsymbol{\lambda}\end{bmatrix} \mapsto \begin{bmatrix}\mathrm{N}_{\boldsymbol{\Omega}}(\boldsymbol{x})+F(\boldsymbol{x})+\frac{1}{N}C_{f}^{\top}\boldsymbol{\lambda}\\ \mathrm{N}_{\mathbb{R}^{mN}_{\geq 0}}(\boldsymbol{\lambda})+\boldsymbol{L}_{m}\boldsymbol{\lambda}-\frac{1}{N}(C_{f}\boldsymbol{x}-c_{f})\end{bmatrix},

T_{1}

T_{2}

S := \frac{1}{N}\begin{bmatrix}0 & C_{f}^{\top}\\ -C_{f} & 0\end{bmatrix}.

\left|\begin{array}{l}
\text{Local projected pseudo-gradient update:}\\
\left|\begin{array}{l}
\tilde{x}_{i}^{k}=\mathrm{proj}_{\Omega_{i}}\big(x_{i}^{k}-\alpha_{i}(F_{i}(x_{i}^{k},\bar{x}^{k})+C_{i}^{\top}\bar{\lambda}^{k})\big),\\
d_{i}^{k}=2C_{i}\tilde{x}_{i}^{k}-C_{i}x_{i}^{k}-c_{i},\\
\tilde{\lambda}_{i}^{k}=\mathrm{proj}_{\mathbb{R}^{m}_{\geq 0}}\big(\lambda^{k}_{i}+\beta_{i}(\bar{d}^{k}-\lambda^{k}_{i}+\bar{\lambda}^{k})\big),
\end{array}\right.\\
\text{Local Krasnosel'skii--Mann process:}\\
\left|\begin{array}{l}
x_{i}^{k+1}=x_{i}^{k}+\gamma^{k}(\tilde{x}_{i}^{k}-x_{i}^{k}),\\
\lambda_{i}^{k+1}=\lambda_{i}^{k}+\gamma^{k}(\tilde{\lambda}_{i}^{k}-\lambda_{i}^{k}),
\end{array}\right.
\end{array}\right.

\omega^{k+1} = \omega^{k}+\gamma^{k}\big(R(\omega^{k})-\omega^{k}\big), \quad (k\in\mathbb{N})

R := (\mathrm{Id}+\Phi^{-1}T_{2})^{-1}\circ(\mathrm{Id}-\Phi^{-1}T_{1}),

\Phi := \begin{bmatrix}\alpha_{\text{d}}^{-1} & -\frac{1}{N}C_{f}^{\top}\\ -\frac{1}{N}C_{f} & \beta_{\text{d}}^{-1}\end{bmatrix},

\tilde{\boldsymbol{x}}^{k}=\mathrm{proj}_{\boldsymbol{\Omega}}\big(\boldsymbol{x}^{k}-\alpha_{\text{d}}(\boldsymbol{F}(\boldsymbol{x}^{k},\bar{\boldsymbol{x}}^{k})+C_{\text{d}}^{\top}\bar{\boldsymbol{\lambda}}^{k})\big),

\tilde{\boldsymbol{\lambda}}^{k}=\mathrm{proj}_{\mathbb{R}^{mN}_{\geq 0}}\big(\boldsymbol{\lambda}^{k}+\beta_{\text{d}}(\bar{\boldsymbol{d}}^{k}-\boldsymbol{\lambda}^{k}+\bar{\boldsymbol{\lambda}}^{k})\big),

\bar{\boldsymbol{x}}^{k}=\boldsymbol{1}\otimes\bar{x}^{k}, \quad \bar{\boldsymbol{\lambda}}^{k}=\boldsymbol{1}\otimes\bar{\lambda}^{k}, \quad \bar{\boldsymbol{d}}^{k}=\boldsymbol{1}\otimes\bar{d}^{k}

\hat{\sigma}_{i}^{k} := \sum_{j=1}^{N}w_{i,j}(k)\,\sigma_{j}^{k}, \quad \hat{y}_{i}^{k} := \sum_{j=1}^{N}w_{i,j}(k)\,y_{j}^{k}

\hat{z}_{i}^{k} := \sum_{j=1}^{N}w_{i,j}(k)\,z_{j}^{k}

\left|\begin{array}{l}
\text{Communication and distributed averaging:}\\
\left|\begin{array}{l}
\hat{\sigma}_{i}^{k}=\sum_{j=1}^{N}w_{i,j}(k)\sigma_{j}^{k},\\
\hat{y}^{k}_{i}=\sum_{j=1}^{N}w_{i,j}(k)y_{j}^{k},\\
\hat{z}^{k}_{i}=\sum_{j=1}^{N}w_{i,j}(k)z_{j}^{k},
\end{array}\right.\\
\text{Local strategy update and dynamic tracking of }\bar{d}^{k}:\\
\left|\begin{array}{l}
\tilde{x}_{i}^{k}=\mathrm{proj}_{\Omega_{i}}\big(x_{i}^{k}-\alpha_{i}(F_{i}(x_{i}^{k},\hat{\sigma}^{k}_{i})+C_{i}^{\top}\hat{z}^{k}_{i})\big),\\
y_{i}^{k+1}=\hat{y}^{k}_{i}+C_{i}(2\tilde{x}_{i}^{k}-x_{i}^{k})-C_{i}(2\tilde{x}_{i}^{k-1}-x_{i}^{k-1}),\\
\tilde{\lambda}_{i}^{k}=\mathrm{proj}_{\mathbb{R}^{m}_{\geq 0}}\big(\lambda^{k}_{i}+\beta_{i}(y_{i}^{k+1}-\lambda^{k}_{i}+\hat{z}^{k}_{i})\big),
\end{array}\right.\\
\text{Local Krasnosel'skii--Mann process:}\\
\left|\begin{array}{l}
x_{i}^{k+1}=x_{i}^{k}+\gamma^{k}(\tilde{x}_{i}^{k}-x_{i}^{k}),\\
\lambda_{i}^{k+1}=\lambda_{i}^{k}+\gamma^{k}(\tilde{\lambda}_{i}^{k}-\lambda_{i}^{k}),
\end{array}\right.\\
\text{Local dynamic tracking of }\bar{x}^{k+1}\text{ and }\bar{\lambda}^{k+1}:\\
\left|\begin{array}{l}
\sigma_{i}^{k+1}=\hat{\sigma}_{i}^{k}+x_{i}^{k+1}-x_{i}^{k},\\
z_{i}^{k+1}=\hat{z}^{k}_{i}+\lambda_{i}^{k+1}-\lambda_{i}^{k}.
\end{array}\right.
\end{array}\right.

\tilde{\boldsymbol{x}}^{k}=\mathrm{proj}_{\boldsymbol{\Omega}}\big(\boldsymbol{x}^{k}-\alpha_{\text{d}}(\boldsymbol{F}(\boldsymbol{x}^{k},\hat{\boldsymbol{\sigma}}^{k})+C_{\text{d}}^{\top}\hat{\boldsymbol{z}}^{k})\big),

\tilde{\boldsymbol{\lambda}}^{k}=\mathrm{proj}_{\mathbb{R}^{mN}_{\geq 0}}\big(\boldsymbol{\lambda}^{k}+\beta_{\text{d}}(\boldsymbol{y}^{k+1}-\boldsymbol{\lambda}^{k}+\hat{\boldsymbol{z}}^{k})\big)

\hat{\boldsymbol{\sigma}}^{k}=W_{n}(k)\boldsymbol{\sigma}^{k}, \quad \hat{\boldsymbol{z}}^{k}=W_{m}(k)\boldsymbol{z}^{k}, \quad \hat{\boldsymbol{y}}^{k}=W_{m}(k)\boldsymbol{y}^{k},

Full text

Giuseppe Belgioioso, Angelia Nedić and Sergio Grammatico

G. Belgioioso is with the Control Systems group, TU Eindhoven, The Netherlands. A. Nedić is with the School of Electrical, Computer, and Energy Engineering, Arizona State University, USA. S. Grammatico is with the Delft Center for Systems and Control (DCSC), TU Delft, The Netherlands. This work was partially supported by NWO (projects OMEGA, 613.001.702; P2P-TALES, 647.003.003), the ERC (project COSMOS, 802348) and by the Office of Naval Research grant no. N000141612245.

I Introduction

An aggregative game is a collection of inter-dependent optimization problems associated with noncooperative decision makers, or agents, where each agent is affected by some aggregate effect of all the agents [1]. Aggregative games arise in several applications, such as demand side management in the smart grid [2], e.g. for charging/discharging electric vehicles [3], demand-response regulation in competitive markets [4], and congestion control in traffic and communication networks [5]. The common denominator is the presence of a large number of selfish agents, whose aggregate actions may disrupt the shared infrastructure, e.g. the power grid or the transportation network, if left uncontrolled.

Designing solution methods for multi-agent equilibrium problems in noncooperative games has recently gained high research interest. Several authors have developed semi-decentralized and distributed equilibrium seeking algorithms for games without coupling constraints [6] and, more recently, for games with coupling constraints [7, 8, 9, 10].

With focus on the generalized Nash equilibrium (GNE) problem, the formulations in [9, 10] have introduced an elegant approach based on monotone operator theory [11] to characterize the equilibrium solutions as the zeros of a monotone operator. Not only is the monotone-operator-theoretic approach general – e.g., unlike variational inequalities, smoothness of the cost functions is not required – but also computationally viable, since several algorithmic methods to solve monotone inclusions are already well established, e.g. operator-splitting methods [11, §26].

However, in the aforementioned literature on noncooperative equilibrium computation, it is assumed that the agents have direct access to the decisions of all their competitors, allowing every agent to evaluate its cost function without the need of extra communication. This game setup is known as full-decision information.

In aggregative games, this ideal scenario is achieved via the so-called semi-decentralized communication structure, where a central node gathers and broadcasts the aggregation variable to all the agents, see e.g. [7]-[9].

Recently, in the broader context of noncooperative games, the authors in [12, 13] propose fully-distributed algorithms for equilibrium seeking under partial-decision information, i.e., each agent can only observe the decisions of some neighboring agents, while its cost function possibly depends on the decisions of all the other agents. In [12], to deal with the lack of information, the agents are endowed with auxiliary variables, namely, the estimates of the decisions of the other agents. Then, a consensus protocol is combined with accelerated projected-pseudo-gradient dynamics to steer the estimates towards their real value and, consequently, the decisions to a Nash equilibrium, on the same time-scale. In [13], similar ideas are developed in the general framework of monotone operator theory to design an algorithm for games with coupling constraints. The algorithms proposed in [12, 13] require a number of auxiliary variables (i.e., the estimates of the decisions of all the other agents) which is proportional to the number of agents in the game. From a practical perspective, this can be regarded as a drawback in terms of memory storage and communication requirements, especially in games with a very large number of agents.

Scalability with respect to the population size indeed motivates us to focus on aggregative games. In this context, the authors in [14] propose an algorithm that relies on dynamic tracking, a technique that allows a group of agents to locally track the average of some reference inputs and that is extensively used in distributed optimization for gradient tracking, e.g. [15]. Specifically, the authors embed dynamic tracking of the aggregate decision in a projected-pseudo-gradient update to compute a Nash equilibrium in a fully-distributed fashion (i.e., without the need of a central coordinator). In the context of aggregative games with coupling constraints, an algorithm is proposed in [16], however with important limitations: it requires a very large number of distributed communication rounds before each strategy update; convergence is guaranteed to approximate solutions (i.e., $\varepsilon$-Nash equilibria) only; the communication network must be time-invariant.
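To make the dynamic tracking idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes scalar decisions, a fixed doubly stochastic mixing matrix `W`, and arbitrary random changes of the reference values; the names `mix` and `tracking_step` are hypothetical. It illustrates the key invariant: if every estimate is initialized at the agent's own value, the average of the estimates equals the average of the references at every step.

```python
import random

def mix(W, values):
    """Return the weighted averages sum_j W[i][j] * values[j] for every agent i."""
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]

def tracking_step(W, sigma, x_old, x_new):
    """One dynamic-tracking update: average with neighbors, then add the local increment."""
    sigma_hat = mix(W, sigma)
    return [sigma_hat[i] + x_new[i] - x_old[i] for i in range(len(sigma))]

# Hypothetical 3-agent example with a fixed, symmetric, doubly stochastic matrix.
W = [[0.50, 0.50, 0.00],
     [0.50, 0.25, 0.25],
     [0.00, 0.25, 0.75]]

random.seed(0)
x = [1.0, -2.0, 4.0]
sigma = list(x)  # every estimate starts at the agent's own decision

for _ in range(50):
    # Arbitrary changes of the reference values (stand-in for strategy updates).
    x_next = [xi + random.uniform(-1.0, 1.0) for xi in x]
    sigma = tracking_step(W, sigma, x, x_next)
    x = x_next

mean_x = sum(x) / len(x)
mean_sigma = sum(sigma) / len(sigma)
```

Only the average is preserved in this sketch; the individual estimates approach the true average only when the reference values settle, which is what the convergence analysis has to establish.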

More recently, two fully-distributed algorithms [17, 18], for generalized aggregative games over time-invariant and connected networks, have been proposed to compute an exact solution (i.e., GNE), without the need of multiple communication rounds before every strategy update. To cope with the lack of information, both algorithms introduce local estimates and dynamic tracking of the aggregate decision. In [17], global convergence is proved under strong monotonicity of the pseudo-gradient, by leveraging a restricted-monotonicity property of this mapping in the extended space of strategies and estimates. In our preliminary work [18], this assumption is relaxed to cocoercivity at the cost of having vanishing step-sizes, which typically imply slow convergence. Unfortunately, the extension of both methodologies to cover time-varying communication networks is currently missing, since the operator-theoretic framework at the basis of their convergence analysis fails when the underlying mappings vary over time.

Contribution

In this paper, we solve these technical issues and propose the first discrete-time, fully-distributed algorithm to compute a generalized Nash equilibrium in aggregative games with coupling constraints over a time-varying and repeatedly-connected communication network. The algorithm is obtained by combining dynamic tracking, projected-pseudo-gradient and Krasnosel’skii–Mann dynamics. The key approach to prove convergence of our proposed algorithm relies on applying and tailoring the framework of operator splitting methods [11] and fixed-point iteration with errors [19].

Organization of the paper

In Section II, we formalize the generalized Nash equilibrium seeking problem for aggregative games over a time-varying communication network. In Section III, we present a fully-distributed algorithm and discuss its interpretation from an operator theoretic and fixed-point perspective. In Section IV, we establish global convergence of the proposed method. To corroborate the theory, in Section V, we study the performance of the proposed method on a Nash–Cournot game. Concluding remarks and future research directions are discussed in Section VI.

Basic notation

$\mathbb{R}$ denotes the set of real numbers, and $\overline{\mathbb{R}}:=\mathbb{R}\cup\{\infty\}$ the set of extended real numbers. $\boldsymbol{0}$ ($\boldsymbol{1}$) denotes a matrix/vector with all elements equal to $0$ ($1$); to improve clarity, we may add the dimension of these matrices/vectors as subscript. Given two sets, $\mathcal{S}_{1}$ and $\mathcal{S}_{2}$, we denote as $\mathcal{S}_{1}\times\mathcal{S}_{2}$ their Cartesian product. Given $N$ sets, $\mathcal{S}_{1},\ldots,\mathcal{S}_{N}$, we denote with $\mathrm{conv}(\mathcal{S}_{1},\ldots,\mathcal{S}_{N})=\big\{a_{1}x_{1}+\ldots+a_{N}x_{N}\,|\;\sum_{i=1}^{N}a_{i}=1,\,a_{i}\in\mathbb{R}_{\geq 0},\,x_{i}\in\mathcal{S}_{i},\,\forall i\in\{1,\ldots,N\}\big\}$ the convex hull of their union. $A\otimes B$ denotes the Kronecker product between the matrices $A$ and $B$. For a square matrix $A=[a_{i,j}]\in\mathbb{R}^{n\times n}$, where $a_{i,j}$ is the entry in position $(i,j)$, its transpose is $A^{\top}$; $A\succ 0$ ($\succeq 0$) stands for positive definite (semidefinite) matrix; $\left\|A\right\|$ denotes the largest singular value of $A$; $\left\|A\right\|_{\infty}=\max_{1\leq i\leq n}\sum_{j=1}^{n}|a_{i,j}|$ denotes the infinity norm. If $A\succ 0$, $\|\cdot\|_{A}$ denotes the $A$-induced norm, such that $\|x\|_{A}=\sqrt{x^{\top}Ax}$; we omit the subscript when $A=I$. Given $N$ matrices $A_{1},\ldots,A_{N}$, $\textrm{blkdiag}(A_{1},\ldots,A_{N})$ denotes a block diagonal matrix with $A_{1},\ldots,A_{N}$ as diagonal blocks. Given $N$ vectors $x_{1},\ldots,x_{N}$, $\boldsymbol{x}:=\operatorname{col}\left(x_{1},\ldots,x_{N}\right)=[x_{1}^{\top},\ldots,x_{N}^{\top}]^{\top}$, $\bar{x}=\frac{1}{N}\sum_{i=1}^{N}x_{i}$, $\boldsymbol{x}_{-i}:=\operatorname{col}(x_{1},\ldots,x_{i-1},x_{i+1},\ldots,x_{N})$; given a vector $z$, $(z,\boldsymbol{x}_{-i}):=\operatorname{col}(x_{1},\ldots,x_{i-1},z,x_{i+1},\ldots,x_{N})$.

Operator theoretic definitions

$\mathrm{Id}(\cdot)$ denotes the identity operator. The mapping $\iota_{S}:\mathbb{R}^{n}\rightarrow\{0,\,\infty\}$ denotes the indicator function for the set $S\subseteq\mathbb{R}^{n}$, i.e., $\iota_{S}(x)=0$ if $x\in S$, $\infty$ otherwise. For a closed set $S\subseteq\mathbb{R}^{n}$, the mapping $\mathrm{proj}_{S}:\mathbb{R}^{n}\rightarrow S$ denotes the projection onto $S$, i.e., $\mathrm{proj}_{S}(x)=\operatorname{argmin}_{y\in S}\left\|y-x\right\|$. The set-valued mapping $\mathrm{N}_{S}:\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{n}$ denotes the normal cone operator for the set $S\subseteq\mathbb{R}^{n}$, i.e., $\mathrm{N}_{S}(x)=\varnothing$ if $x\notin S$, $\left\{v\in\mathbb{R}^{n}\mid\sup_{z\in S}\,v^{\top}(z-x)\leq 0\right\}$ otherwise. For a function $\psi:\mathbb{R}^{n}\rightarrow\overline{\mathbb{R}}$, $\operatorname{dom}(\psi):=\{x\in\mathbb{R}^{n}\mid\psi(x)<\infty\}$; $\partial\psi:\operatorname{dom}(\psi)\rightrightarrows{\mathbb{R}}^{n}$ denotes its subdifferential set-valued mapping, defined as $\partial\psi(x):=\{v\in\mathbb{R}^{n}\mid\psi(z)\geq\psi(x)+v^{\top}(z-x)\textup{ for all }z\in{\rm dom}(\psi)\}$. A set-valued mapping $\mathcal{F}:\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{n}$ is (strictly) monotone if $(u-v)^{\top}(x-y)\geq(>)\,0$ for all $x\neq y\in\mathbb{R}^{n}$, $u\in\mathcal{F}(x)$, $v\in\mathcal{F}(y)$; $\mathcal{F}$ is restricted-(strictly) monotone with respect to (w.r.t.) $Y\subset\mathbb{R}^{n}$ if $(z^{*}-z)^{\top}(x^{*}-x)\geq(>)\,0$ for all $x^{*}\in Y$, $x\in\mathbb{R}^{n}\setminus Y$, $z^{*}\in\mathcal{F}(x^{*})$, $z\in\mathcal{F}(x)$; $\mathcal{F}$ is $\eta$-strongly monotone, with $\eta>0$, if $(u-v)^{\top}(x-y)\geq\eta\left\|x-y\right\|^{2}$ for all $x\neq y\in\mathbb{R}^{n}$, $u\in\mathcal{F}(x)$, $v\in\mathcal{F}(y)$; $\mathrm{fix}\left(\mathcal{F}\right):=\left\{x\in\mathbb{R}^{n}\mid x\in\mathcal{F}(x)\right\}$ and $\operatorname{zer}\left(\mathcal{F}\right):=\left\{x\in\mathbb{R}^{n}\mid 0\in\mathcal{F}(x)\right\}$ denote the set of fixed points and of zeros, respectively. A single-valued mapping $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ is $L$-Lipschitz continuous, with $L>0$, if $\|F(x)-F(y)\|\leq L\|x-y\|$ for all $x,y\in\mathbb{R}^{n}$; $F$ is nonexpansive if it is $1$-Lipschitz continuous; $F$ is $\eta$-averaged, with $\eta\in(0,1)$, if $\left\|F(x)-F(y)\right\|^{2}\leq\left\|x-y\right\|^{2}-\tfrac{1-\eta}{\eta}\left\|\left(\textup{Id}-F\right)(x)-\left(\textup{Id}-F\right)(y)\right\|^{2}$, for all $x,y\in\mathbb{R}^{n}$; $F$ is $\beta$-cocoercive, with $\beta>0$, if $\beta F$ is $\tfrac{1}{2}$-averaged.
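As an illustration of the averagedness definition above, the sketch below checks numerically that the projection onto a box is $\tfrac{1}{2}$-averaged (firmly nonexpansive), a standard property of projections onto closed convex sets. The box $[-1,1]^{4}$, the sampling range and the helper names (`proj_box`, `averaged_gap`) are assumptions made only for this example.

```python
import random

def proj_box(x, lo, hi):
    """Euclidean projection of the vector x onto the box [lo, hi]^n (closed and convex)."""
    return [min(max(xi, lo), hi) for xi in x]

def sq_norm(v):
    return sum(vi * vi for vi in v)

def sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def averaged_gap(F, x, y, eta):
    """Right-hand side minus left-hand side of the eta-averagedness inequality."""
    Fx, Fy = F(x), F(y)
    residual = sub(sub(x, Fx), sub(y, Fy))  # (Id - F)(x) - (Id - F)(y)
    rhs = sq_norm(sub(x, y)) - (1.0 - eta) / eta * sq_norm(residual)
    return rhs - sq_norm(sub(Fx, Fy))

random.seed(1)
F = lambda v: proj_box(v, -1.0, 1.0)
gaps = []
for _ in range(1000):
    x = [random.uniform(-3.0, 3.0) for _ in range(4)]
    y = [random.uniform(-3.0, 3.0) for _ in range(4)]
    gaps.append(averaged_gap(F, x, y, eta=0.5))

min_gap = min(gaps)  # nonnegative up to rounding if the inequality holds on all samples
```

A random test of this kind can only fail to find a counterexample; it does not replace the proof of the property.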

II Problem statement

Consider a set of $N$ agents indexed by $\mathcal{I}=\{1,\ldots,N\}$ . The $i$ -th agent is characterized by a local strategy set $\Omega_{i}\subset\mathbb{R}^{n}$ and a cost function $J_{i}(x_{i},\bar{x})$ , which depends on the decision of agent $i$ , $x_{i}$ , and on the aggregate of all agent decisions, i.e.,

$$\bar{x}:=\frac{1}{N}\sum_{j=1}^{N}x_{j}.$$

Moreover, we assume that the collective strategy profile $\boldsymbol{x}:=\operatorname{col}(x_{1},\ldots,x_{N})\in\mathbb{R}^{nN}$ must satisfy a coupling constraint, described by the affine function $\boldsymbol{x}\mapsto C\boldsymbol{x}-c$ , where $C=[C_{1}|\ldots|C_{N}]\in\mathbb{R}^{m\times nN}$ , $c=\sum_{i=1}^{N}c_{i}\in\mathbb{R}^{m}$ , and $C_{i}$ , $c_{i}$ are local parameters known to agent $i$ only. In summary, the aim of each agent $i$ , given the decision variables of the other agents, i.e., $\boldsymbol{x}_{-i}:=\operatorname{col}(x_{1},\ldots,x_{i-1},x_{i+1},\ldots,x_{N})$ , is to choose a strategy $x_{i}$ that solves its local optimization problem, according to the game setup above, i.e., $\forall i\in\mathcal{I}:$

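$$\left\{\begin{array}{cl}\underset{x_{i}\in\mathbb{R}^{n}}{\operatorname{argmin}} & J_{i}\big(x_{i},\tfrac{1}{N}x_{i}+\tfrac{1}{N}\sum_{j\neq i}x_{j}\big)\\ \text{s.t.} & x_{i}\in\Omega_{i}\\ & C_{i}x_{i}-c_{i}\leq\sum_{j\neq i}^{N}(c_{j}-C_{j}x_{j})\end{array}\right. \tag{4}$$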

where the last constraint is equivalent to $C\boldsymbol{x}-c\leq\mathbf{0}$ .
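The equivalence follows by moving the right-hand side of each agent's coupling constraint to the left, which gives $\sum_{j}C_{j}x_{j}-\sum_{j}c_{j}\leq\mathbf{0}$ regardless of $i$. The sketch below checks this numerically in the simplest setting; scalar decisions and a single constraint ($n=m=1$), the random data and the function names are assumptions of the example.

```python
import random

def local_slack(i, C, c, x):
    """Slack of agent i's local form: sum_{j != i}(c_j - C_j x_j) - (C_i x_i - c_i)."""
    others = sum(c[j] - C[j] * x[j] for j in range(len(x)) if j != i)
    return others - (C[i] * x[i] - c[i])

def global_slack(C, c, x):
    """Slack of the global form -(C x - c), with C = [C_1 | ... | C_N] and c = sum_i c_i."""
    return sum(c) - sum(Cj * xj for Cj, xj in zip(C, x))

random.seed(2)
N = 5
max_diff = 0.0
for _ in range(200):
    C = [random.uniform(-2, 2) for _ in range(N)]
    c = [random.uniform(-1, 1) for _ in range(N)]
    x = [random.uniform(-3, 3) for _ in range(N)]
    g = global_slack(C, c, x)
    for i in range(N):
        # Every agent's local slack coincides with the global one.
        max_diff = max(max_diff, abs(local_slack(i, C, c, x) - g))
```

Since the two slacks coincide for every agent, all agents share the same feasibility condition, even though each one only knows its own $C_{i}$ and $c_{i}$.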

Remark 1

Affine coupling constraints, as considered in this paper, are very common in the literature of noncooperative games, e.g. [8, 10, 13, 16], and cover several applications where they typically arise in the form of upper and lower limits on the available shared resources, e.g. [2]-[5]. $\square$

Assumption 1

For all $i\in\mathcal{I}$ and any fixed $u\in\frac{1}{N}\sum_{j\neq i}^{N}\Omega_{j}$, the function $J_{i}(\cdot\,,\frac{1}{N}\cdot+\,u)$ is convex and continuously differentiable, and the set $\Omega_{i}\subset\mathbb{R}^{n}$ is non-empty, compact and convex. The global feasible set $K:=\{\boldsymbol{x}\in\prod_{i=1}^{N}\Omega_{i}|\,C\boldsymbol{x}-c\leq\mathbf{0}\}$ is non-empty and satisfies Slater’s constraint qualification. $\square$

From a game-theoretic perspective, our goal is to distributively compute a generalized Nash equilibrium of the aggregative game described by the $N$ inter-dependent optimization problems in (4).

Definition 1 (Generalized Nash equilibrium)

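A collective strategy $\boldsymbol{x}^{*}\in K$ is a generalized Nash equilibrium (GNE) of the game in (4) if, for all $i\in\mathcal{I}$:

$$J_{i}(x_{i}^{*},\bar{x}^{*})\leq J_{i}\big(z,\tfrac{1}{N}z+\tfrac{1}{N}\sum_{j\neq i}^{N}x_{j}^{*}\big),\quad\forall z\text{ s.t. }(z,\boldsymbol{x}_{-i}^{*})\in K.$$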

II-A Communication networks

We consider a time-varying network to model the communications among agents over time. At each stage $k$, the communication is described by an undirected graph $\mathcal{G}_{k}=(\mathcal{I},\mathcal{E}_{k})$, where $\mathcal{I}$ is the set of vertices (agents) and $\mathcal{E}_{k}\subseteq\mathcal{I}\times\mathcal{I}$ is the set of edges. An unordered pair of vertices $(i,j)$ belongs to $\mathcal{E}_{k}$ if and only if agents $j$ and $i$ can exchange information. The set of neighbors of agent $i$ at stage $k$ is defined as $\mathcal{N}_{i}(k)=\{j|\,(i,j)\in\mathcal{E}_{k}\}$. Next, we assume the graph sequence $\{\mathcal{G}_{k}\}_{k\in\mathbb{N}}$ to be $Q$-connected.

Assumption 2

There exists an integer $Q\geq 1$ such that the graph $(\mathcal{I},\cup_{\ell=1}^{Q}\mathcal{E}_{\ell+k})$ is connected, for all $k\geq 0$ . $\square$

This assumption ensures that the intercommunication intervals are bounded for agents that communicate directly. In other words, every agent sends information to each of its neighboring agents at least once every $Q$ time intervals.
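The $Q$-connectivity requirement can be checked on a finite communication schedule with a sliding window, as in the following sketch. The schedule, the 0-based agent labels and the function names are assumptions of the example; a finite check of this kind says nothing about stages beyond the listed ones, and the window indexing is shifted by one stage with respect to Assumption 2.

```python
from collections import deque

def is_connected(num_nodes, edges):
    """Breadth-first search connectivity test for an undirected graph on 0..num_nodes-1."""
    adjacency = {v: set() for v in range(num_nodes)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, queue = {0}, deque([0])
    while queue:
        v = queue.popleft()
        for u in adjacency[v] - seen:
            seen.add(u)
            queue.append(u)
    return len(seen) == num_nodes

def is_q_connected(num_nodes, edge_sets, Q):
    """Check that every window of Q consecutive edge sets has a connected union graph."""
    for k in range(len(edge_sets) - Q + 1):
        union = set().union(*edge_sets[k:k + Q])
        if not is_connected(num_nodes, union):
            return False
    return True

# Hypothetical 4-agent schedule: one link is active per stage, cycling over a path graph.
schedule = [{(0, 1)}, {(1, 2)}, {(2, 3)}] * 4
```

In this schedule no single stage is connected, yet every window of three consecutive stages is, so the sequence is $3$-connected but not $2$-connected.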

We consider a mixing matrix $W(k)=[w_{i,j}(k)]$ associated with $\mathcal{G}_{k}$ , whose elements satisfy the following assumption.

Assumption 3

For all $k\in\mathbb{N}$ , the matrix $W(k)=[w_{i,j}(k)]$ satisfies the following conditions:

(i) (Edge utilization) Let $i,j\in\mathcal{I}$, $i\neq j$. If $(i,j)\in\mathcal{E}_{k}$, then $w_{i,j}(k)\geq\epsilon$, for some $\epsilon>0$; $w_{i,j}(k)=0$ otherwise;

(ii) (Positive diagonal) For all $i\in\mathcal{I}$, $w_{i,i}(k)>\epsilon$;

(iii) (Double-stochasticity) $W(k)\mathbf{1}=\mathbf{1}$, $\mathbf{1}^{\top}W(k)=\mathbf{1}^{\top}$. $\square$

Assumption 3 is strong but typical for multiagent coordination and optimization, e.g. [15, 20]. For an undirected graph it can be fulfilled, for example, by using Metropolis weights:

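$$w_{i,j}(k)=\begin{cases}\big(\max\{|\mathcal{N}_{i}(k)|,|\mathcal{N}_{j}(k)|\}\big)^{-1} & \text{if }(i,j)\in\mathcal{E}_{k},\\ 0 & \text{if }(i,j)\notin\mathcal{E}_{k},\\ 1-\sum_{\ell\in\mathcal{N}_{i}(k)}w_{i,\ell}(k) & \text{if }i=j.\end{cases}$$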
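A minimal sketch of a Metropolis-type construction for a single stage is given below. It uses the common variant $w_{i,j}=1/(1+\max\{\deg_{i},\deg_{j}\})$, which keeps every diagonal entry strictly positive; this coincides with the rule above only if each neighbor set is taken to include the agent itself, which is an assumption of this example. The star graph and the function name are likewise illustrative.

```python
def metropolis_weights(num_nodes, edges):
    """Build a symmetric, doubly stochastic mixing matrix for an undirected graph.

    Assumed variant: w_ij = 1 / (1 + max(deg_i, deg_j)) on edges, 0 on non-edges,
    and the remaining mass of each row on the diagonal.
    """
    degree = [0] * num_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    W = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        W[i][j] = W[j][i] = 1.0 / (1 + max(degree[i], degree[j]))
    for i in range(num_nodes):
        W[i][i] = 1.0 - sum(W[i])  # the diagonal entry is still zero at this point
    return W

# Hypothetical star graph: agent 0 is linked to agents 1, 2 and 3.
W = metropolis_weights(4, [(0, 1), (0, 2), (0, 3)])
row_sums = [sum(row) for row in W]
col_sums = [sum(W[i][j] for i in range(4)) for j in range(4)]
min_diagonal = min(W[i][i] for i in range(4))
```

Each agent only needs its own degree and those of its current neighbors, so the weights can be computed locally at every stage.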

Finally, let us introduce the so-called transition matrices $\Psi(k,s)$ from time $s$ to $k$ :

$$\Psi(k,s)=W(k)W(k-1)\cdots W(s+1)W(s),$$

for $0\leq s<k$ , where $\Psi(k,k)=W(k)$ , for all $k$ . The following statement shows the convergence properties of the transition matrix $\Psi(k,s)$ .

Lemma 1 ([21, Lemma 5.3.1])

Let Assumptions 2, 3 hold true. Then, the following statements hold:

(i) $\lim_{k\rightarrow\infty}\Psi(k,s)=(1/N)\mathbf{1}\mathbf{1}^{\top}$, for all $s\geq 0$.

(ii) The convergence rate of $\Psi(k,s)$ is geometric, i.e., $\|\Psi(k,s)-(1/N)\mathbf{1}\mathbf{1}^{\top}\|\leq\theta\rho^{k-s}$ for all $k\geq s\geq 0$, where $\theta:=N(1-\epsilon/(4N^{2}))^{-2}$ and

$$\rho:=\Big(1-\frac{\epsilon}{4N^{2}}\Big)^{1/Q}\in(0,1),$$

with $Q$ as in Assumption 2 and $\epsilon$ as in Assumption 3. $\square$
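The statement of Lemma 1 can be observed numerically on a small example. The sketch below assumes four agents, a periodic schedule in which a single pair of agents averages at each stage (so that $Q=3$), and pairwise weight $1/2$; it measures the largest entrywise distance between $\Psi(k,0)$ and $(1/N)\mathbf{1}\mathbf{1}^{\top}$. All names and the schedule are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)] for i in range(n)]

def pair_matrix(n, i, j, weight=0.5):
    """Doubly stochastic matrix for a stage where only agents i and j communicate."""
    W = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    W[i][i] = W[j][j] = 1.0 - weight
    W[i][j] = W[j][i] = weight
    return W

def transition(matrices, k, s):
    """Psi(k, s) = W(k) W(k-1) ... W(s) for s <= k."""
    Psi = matrices[s]
    for t in range(s + 1, k + 1):
        Psi = matmul(matrices[t], Psi)
    return Psi

def deviation(Psi):
    """Largest entrywise distance from the averaging matrix (1/N) 1 1^T."""
    n = len(Psi)
    return max(abs(Psi[i][j] - 1.0 / n) for i in range(n) for j in range(n))

# Hypothetical schedule on 4 agents: links (0,1), (1,2), (2,3) take turns, so Q = 3.
links = [(0, 1), (1, 2), (2, 3)]
matrices = [pair_matrix(4, *links[k % 3]) for k in range(120)]
errors = [deviation(transition(matrices, k, 0)) for k in (2, 29, 119)]
```

The deviation shrinks as $k$ grows even though no single stage is connected; the bound in Lemma 1 is a worst-case rate, so the decay observed on a specific schedule may be considerably faster.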

II-B GNE as zeros of a monotone operator

As a first step, we characterize a GNE of the game in terms of the KKT conditions of the coupled optimization problems in (4). For each agent $i\in\mathcal{I}$, let us introduce the Lagrangian function $L_{i}$, defined as

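$$L_{i}(\boldsymbol{x},\lambda_{i}):=J_{i}(x_{i},\bar{x})+\iota_{\Omega_{i}}(x_{i})+\lambda_{i}^{\top}(C\boldsymbol{x}-c),$$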

where $\lambda_{i}\in\mathbb{R}^{m}_{\geq 0}$ is the dual variable of agent $i$ associated with the coupling constraints, and $\iota_{\Omega_{i}}$ is the indicator function. It follows from [22, §12.2.3] that the set of strategies $\boldsymbol{x}^{*}$ is a GNE of the game in (4) if and only if the following coupled KKT conditions are satisfied for some $\lambda_{1},\ldots,\lambda_{N}\in\mathbb{R}^{m}_{\geq 0}$ :

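$$\forall i\in\mathcal{I}:\ \left\{\begin{array}{l}\mathbf{0}\in\nabla_{x_{i}}J_{i}(x_{i}^{*},\bar{x}^{*})+\mathrm{N}_{\Omega_{i}}(x_{i}^{*})+C_{i}^{\top}\lambda_{i}^{*},\\ \mathbf{0}\leq\lambda_{i}^{*}\perp-(C\boldsymbol{x}^{*}-c)\geq\mathbf{0}.\end{array}\right. \tag{8}$$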

Among all the possible GNE, we focus on an important subclass of equilibria, namely the variational GNE (v-GNE), which enjoy some relevant structural properties, such as “larger social stability” and “economic fairness”, and correspond to the solution set of the KKT conditions in (8) with equal dual variables, i.e., $\lambda_{1}^{*}=\ldots=\lambda_{N}^{*}$ [23, Theorem 3.1]. Recall that, for a single-valued mapping $M:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ and a set $\mathcal{S}\subseteq\mathbb{R}^{n}$, the variational inequality problem VI$(M,\mathcal{S})$ is the problem of finding a vector $\omega^{*}\in\mathcal{S}$ such that $M(\omega^{*})^{\top}(\omega-\omega^{*})\geq 0$ for all $\omega\in\mathcal{S}$ [24, Def. 1.1.1]. The next proposition characterizes the subclass of v-GNE as the solution set of a specific variational inequality problem or, equivalently, as the zero set of the set-valued mapping

[TABLE]

where $\lambda\in\mathbb{R}^{m}$ , $\boldsymbol{\Omega}:=\prod_{i=1}^{N}\Omega_{i}$ , $\mathrm{N}_{\mathcal{S}}=\partial\iota_{\mathcal{S}}$ is the normal cone operator associated with a set $\mathcal{S}$ and $F$ is the so-called pseudo-gradient mapping (PG) defined as

[TABLE]

Proposition 1

Let Assumption 1 hold. Then, the following statements are equivalent:

(i) $\boldsymbol{x}^{*}$ is a variational GNE of the game in (4);

(ii) $\exists\lambda^{*}\in\mathbb{R}^{m}_{\geq 0}$ such that the pair $(x_{i}^{*},\lambda^{*})$ is a solution to the KKT conditions in (8), for all $i\in\mathcal{I}$ ;

(iii) $\boldsymbol{x}^{*}$ is a solution to VI $(F,K)$ ;

(iv) $\exists\lambda^{*}\in\mathbb{R}^{m}_{\geq 0}$ such that $\operatorname{col}(\boldsymbol{x}^{*},\lambda^{*})\in\operatorname{zer}(U)$ . $\square$

Proof:

The equivalences (i) $\Leftrightarrow$ (ii) $\Leftrightarrow$ (iii) are proven in [23, Th. 3.1] while (iii) $\Leftrightarrow$ (iv) follows by [25, Th. 3.1]. ∎

The following assumptions on the PG in (10) are standard (e.g. [8, Th. 3], [10, Assumption 2], [26, Assumption 3]) and sufficient to ensure the convergence of standard GNE seeking algorithms based on projected-pseudo-gradient dynamics.

Assumption 4

$F$ in (10) is $\chi-$ cocoercive over $\boldsymbol{\Omega}$ . $\square$

When $F$ is $\xi-$ strongly monotone and $L_{\text{F}}-$ Lipschitz, then $F$ is also $(\xi/L_{\text{F}}^{2})-$ cocoercive. However, in general, cocoercive mappings are not necessarily strongly monotone, e.g. the gradient of a (non-strictly) convex and smooth function.
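The first claim above follows from a two-line estimate. The derivation below is the standard argument, written with the constants $\xi$ and $L_{\text{F}}$ from the text, for arbitrary points $x,y$ in the domain of $F$.

```latex
\begin{align*}
(F(x)-F(y))^{\top}(x-y)
  &\geq \xi\,\|x-y\|^{2}
  && \text{(strong monotonicity)}\\
  &\geq \frac{\xi}{L_{\text{F}}^{2}}\,\|F(x)-F(y)\|^{2}
  && \text{(since } \|F(x)-F(y)\|\leq L_{\text{F}}\|x-y\|\text{)}.
\end{align*}
```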

To emphasize the structure of $F$ in (10), we define

[TABLE]

that satisfies $F_{i}(x_{i},\bar{x})=\nabla_{x_{i}}J_{i}(x_{i},\bar{x})$ , for all $i\in\mathcal{I}$ . Then, we define the extended pseudo-gradient mapping (EPG)

[TABLE]

where each component mapping $F_{i}$ is given by (11). With this notation, we have $\boldsymbol{F}(\boldsymbol{x},\mathbf{1}\otimes\bar{x})=F(\boldsymbol{x})$ . Next, we assume Lipschitz continuity of the EPG, which is usual in the context of games under partial-decision information, see e.g. [13, Assumption 3], [14, Assumption 3], [17, Assumption 4].

Assumption 5

Let $\bar{\Omega}:=\mathrm{conv}\left(\Omega_{1},\ldots,\Omega_{N}\right)$ be the set whose elements are convex combinations of elements of the local sets $\Omega_{i}$ . The mapping $\boldsymbol{F}$ in (12) is uniformly Lipschitz continuous over $\boldsymbol{\Omega}\times\bar{\boldsymbol{\Omega}}$ , with $\bar{\boldsymbol{\Omega}}=\prod_{i=1}^{N}\bar{\Omega}$ , i.e., there exists $L_{\boldsymbol{F}}>0$ such that, for all $\boldsymbol{v},\boldsymbol{u}\in\boldsymbol{\Omega}$ and $\boldsymbol{w},\boldsymbol{z}\in\bar{\boldsymbol{\Omega}}$ ,

[TABLE]

$\square$

Remark 2 (Existence and uniqueness of a v-GNE)

It follows by [27, Cor. 2.2.5] that VI( $F$ , $K$ ) has a non-empty and compact solution set, since $K$ is non-empty, compact and convex and $F$ is continuous, by Assumption 1. Furthermore, when $F$ is strictly monotone, the solution to VI( $F$ , $K$ ), i.e., the v-GNE of the game, is unique [27, Th. 2.3.3]. $\square$

II-C Boundedness of the dual variables

In the next statement, we formally establish the boundedness of the dual solution set of VI( $F,K$ ) or, equivalently, of the dual part of the monotone inclusion $\operatorname{col}(\boldsymbol{x}^{*},\lambda^{*})\in\operatorname{zer}(U)$ .

Lemma 2

Let Assumption 1 hold true. If $\operatorname{col}(\boldsymbol{x}^{*},\lambda^{*})\in\operatorname{zer}(U)$ , then $\lambda^{*}\in D^{*}$ , where $D^{*}\subset\mathbb{R}^{m}_{\geq 0}$ is bounded. $\square$

Proof:

The boundedness of the dual solution set $D^{*}$ follows by [25, Proposition 3.3] since VI( $F$ , $K$ ) has a non-empty bounded solution set by Remark 2 and there exists a vector $\boldsymbol{x}\in\operatorname{dom}(F)$ satisfying Slater’s constraint qualification by Assumption 1. ∎

Let us denote with $B_{D^{*}}=\max_{\lambda\in D^{*}}\|\lambda\|_{\infty}$ the largest entry of all the optimal dual vectors. The agents can locally build a bounded superset $D_{\text{i}}$ of the optimal dual set $D^{*}$ as follows: $D_{\text{i}}:=\{\mu\in\mathbb{R}^{m}_{\geq 0}\,|\;\|\mu\|_{\infty}\leq B_{D^{*}}+r,\;\text{ with }r>0\}$ [28, p. 21]. In the context of distributed constrained optimization, a local estimate of $B_{D^{*}}$ can be constructed based on a Slater’s vector, see [29, §4.2], [30, §3.A (2)]. The extension of these estimation methods to generalized noncooperative games would rely on Lagrangian duality theory for variational inequalities [25]. In practice, each agent does not need an accurate estimate of the optimal dual solution set $D^{*}$ and can simply construct a local superset $D_{\text{i}}$ by taking $r$ large enough.

II-D A standard semi-decentralized algorithm

It follows by Proposition 1 that the original GNE seeking problem corresponds to the following monotone inclusion problem:

[TABLE]

Next, we recall a standard semi-decentralized GNE seeking algorithm obtained by solving the monotone inclusion problem in (13) by means of a preconditioned forward-backward (pFB) splitting [26, Alg. 1].

Algorithm 1. Semi-decentralized v-GNE seeking

Iterate until convergence

[TABLE]

Remark 3

The local auxiliary variables $d_{i}$ are introduced to cast Algorithm 1 in a more compact form. The average $\bar{d}^{\,k+1}:=\frac{1}{N}\sum_{i=1}^{N}(2C_{i}x_{i}^{k+1}-C_{i}x_{i}^{k}-c_{i})$ measures the violation of the coupling constraints; technically, it is the “reflected violation” of the constraints at iteration $k$ . $\square$

If the step sizes $\{\alpha_{i}\}_{i\in\mathcal{I}}$ and $\beta$ are chosen small enough, then the sequence $(\operatorname{col}(\boldsymbol{x}^{k},\lambda^{k}))_{k\in\mathbb{N}}$ generated by Algorithm 1 converges to some $\operatorname{col}(\boldsymbol{x}^{*},\lambda^{*})\in\operatorname{zer}(U)$ , where $\boldsymbol{x}^{*}$ is a v-GNE, see [26, Th. 1] for a formal proof of convergence.

We note that Algorithm 1 is not distributed. In fact, at each iteration $k$ , a central coordinator is needed to:

(i) gather and broadcast the average strategy $\bar{x}^{k}$ ;

(ii) gather the average quantity $\bar{d}^{k}$ ;

(iii) update and broadcast the dual variable $\lambda^{k}$ .

III A distributed GNE seeking algorithm

III-A Towards a fully distributed algorithm

A first step towards a fully-distributed algorithm consists of endowing each agent with a copy, $\lambda_{i}$ , of the dual variable and enforcing consensus on the local copies. Consider the set-valued mapping $T$ , obtained by augmenting $U$ in (9) with the local copies of the dual variable:

[TABLE]

where $\boldsymbol{\lambda}=\operatorname{col}(\lambda_{1},\ldots,\lambda_{N})$ , $C_{\text{f}}=\mathbf{1}_{N}\otimes C$ , $c_{\text{f}}=\mathbf{1}\otimes c$ , $\L_{m}=\L\otimes I_{m}$ and $\L:=I_{N}-\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}$ represents the projection onto the disagreement space.

Remark 4

When the local copies of the dual variable are equal, i.e., $\boldsymbol{\lambda}\in\boldsymbol{E}^{\parallel}:=\{\mathbf{1}_{N}\otimes\lambda\,|\,\lambda\in\mathbb{R}^{m}\}$ , where $\boldsymbol{E}^{\parallel}$ is the consensus subspace of dimension $m$ , the first row block of $T$ corresponds to that of $U$ , while each of the $N$ components of the second row block of $T$ describes the same complementarity condition, namely, the second row block of $U$ . $\square$
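To make the role of the projection $I_{N}-\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}$ concrete, the following sketch applies its Kronecker lift to a stacked dual vector. It is only an illustration with hypothetical helper names and toy data: a vector in the consensus subspace is mapped to zero, while a generic vector is mapped to its deviation from the agent-wise average.

```python
def disagreement_projection(N):
    """Rows of I_N - (1/N) 1 1^T."""
    return [[(1.0 if i == j else 0.0) - 1.0 / N for j in range(N)]
            for i in range(N)]


def apply_kron_identity(L, lam, m):
    """Apply (L kron I_m) to the stacked vector lam = col(lam_1, ..., lam_N)."""
    N = len(L)
    out = [0.0] * (N * m)
    for i in range(N):
        for j in range(N):
            for t in range(m):
                out[i * m + t] += L[i][j] * lam[j * m + t]
    return out


N, m = 4, 2
L = disagreement_projection(N)
consensus = [3.0, -1.0] * N                        # 1_N kron lambda: all copies equal
mixed = [1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 6.0, 0.0]   # copies disagree in the first entry
residual_consensus = apply_kron_identity(L, consensus, m)
residual_mixed = apply_kron_identity(L, mixed, m)
```

Applying the projection twice gives the same result as applying it once, which is the usual sanity check for a projection.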

We note that the mapping $T$ in (14) can be written as the sum of two operators, i.e.,

[TABLE]

where $S$ is a skew-symmetric linear mapping defined as

[TABLE]

The formulation $T=T_{1}+T_{2}$ is called splitting of $T$ , and will be exploited in different ways later on. The next lemma shows that $T_{2}$ is maximally monotone and that $T_{1}$ is cocoercive and strictly monotone with respect to the consensus subspace of the dual variables, i.e., $\boldsymbol{\Omega}\times\boldsymbol{E}^{\parallel}$ .

Lemma 3

Let Assumptions 1, 4 hold true. The following statements hold:

(i) $T_{2}$ in (16) is maximally monotone on $\boldsymbol{\Omega}\times\mathbb{R}^{mN}_{\geq 0}$ ;

(ii) $T_{1}$ in (15) is $\delta-$ cocoercive, with $0<\delta\leq\min\{1,\chi\}$ , and restricted-strictly monotone w.r.t. $\Theta^{\parallel}:=\boldsymbol{\Omega}\times\boldsymbol{E}^{\parallel}$ , i.e., for all $\boldsymbol{\omega}^{\parallel}\in\Theta^{\parallel}$ , $\boldsymbol{\omega}\in(\boldsymbol{\Omega}\times\mathbb{R}^{mN}_{\geq 0})\setminus\Theta^{\parallel}$ , it holds that $(T_{1}(\boldsymbol{\omega})-T_{1}(\boldsymbol{\omega}^{\parallel}))^{\top}(\boldsymbol{\omega}-\boldsymbol{\omega}^{\parallel})>0$ ;

(iii) $T$ is maximally monotone on $\boldsymbol{\Omega}\times\mathbb{R}^{mN}_{\geq 0}$ and restricted-strictly monotone w.r.t. $\Theta^{\parallel}$ . $\square$

Proof:

See Appendix -A. ∎

The next proposition exploits the restricted-strict monotonicity of $T$ to show that the v-GNE of the original game are fully characterized by the zeros of $T$ .

Proposition 2

Let Assumption 1 hold true. The following statements hold:

(i) $\operatorname{zer}(T)\neq\varnothing$ ;

(ii) If $\operatorname{col}(\boldsymbol{x}^{*},\boldsymbol{\lambda}^{*})\in\operatorname{zer}(T)$ , then $\boldsymbol{x}^{*}$ is a v-GNE and $\boldsymbol{\lambda}^{*}=\operatorname{col}(\lambda^{*},\ldots,\lambda^{*})$ , with $\lambda^{*}\in\mathbb{R}^{m}_{\geq 0}$ . $\square$

Proof:

See Appendix -B. ∎

To find a zero of $T$ , we exploit a preconditioned version of the forward-backward method [11, §25.6] on the splitting (15)-(16), similarly to [10, 26], thus obtaining Algorithm 2.

The next theorem establishes global convergence of Algorithm 2 to a v-GNE, provided that the step sizes are chosen according to the following assumption.

Assumption 6

Take $0<\delta\leq\min\{1,\chi\}$ , where $\chi$ is as in Assumption 4. Set the global parameter $\tau>\frac{1}{2\delta}$ and denote $\nu:=\frac{2\delta\tau}{4\delta\tau-1}\in(1/2,1)$ . Set the step sizes as follows:

(i) $0<\alpha_{i}\leq(\|C_{i}\|+\tau)^{-1}$ , for all $i\in\mathcal{I}$ ;

(ii) $0<\beta_{i}\leq(\frac{1}{N}\sum_{j=1}^{N}\|C_{j}\|+\tau)^{-1}$ , for all $i\in\mathcal{I}$ ;

(iii) $(\gamma^{k})_{k\in\mathbb{N}}$ such that $\gamma^{k}\in[0,\nu^{-1}]$ for all $k\in\mathbb{N}$ and $\sum_{k=0}^{\infty}\gamma^{k}(1-\nu\gamma^{k})=\infty$ . $\square$

Note that the design choice $\gamma^{k}=1$ , for all $k\in\mathbb{N}$ , always satisfies Assumption 6 (iii).
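The bounds in Assumption 6 are straightforward to evaluate once $\chi$ and the norms $\|C_{i}\|$ are known. The sketch below is a hypothetical helper, not part of the paper: it picks $\delta=\min\{1,\chi\}$, sets $\tau$ a chosen factor `margin` above the lower bound $\frac{1}{2\delta}$, and returns the largest admissible $\alpha_{i}$ and $\beta_{i}$ together with $\nu$. The input values in the example are made up.

```python
def assumption6_parameters(chi, C_norms, margin=1.05):
    """Return (delta, tau, nu, alpha_max, beta_max) following Assumption 6.

    chi     : cocoercivity constant of F (Assumption 4)
    C_norms : list of the norms ||C_i||, one per agent
    margin  : hypothetical safety factor, tau = margin / (2 * delta)
    """
    if margin <= 1.0:
        raise ValueError("margin must exceed 1 so that tau > 1/(2*delta)")
    delta = min(1.0, chi)
    tau = margin / (2.0 * delta)
    nu = 2.0 * delta * tau / (4.0 * delta * tau - 1.0)
    alpha_max = [1.0 / (c + tau) for c in C_norms]           # item (i)
    beta_max = 1.0 / (sum(C_norms) / len(C_norms) + tau)     # item (ii)
    return delta, tau, nu, alpha_max, beta_max


delta, tau, nu, alpha_max, beta_max = assumption6_parameters(
    chi=0.4, C_norms=[1.0, 2.0, 1.5])
```

Since $\nu<1$ for every admissible $\tau$, the interval $[0,\nu^{-1}]$ always contains $1$, which is why the constant choice $\gamma^{k}=1$ is admissible in item (iii).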

Theorem 1

Let Assumptions 1, 4 hold. If the step-sizes $\{\alpha_{i},\beta_{i}\}_{i\in\mathcal{I}}$ and $(\gamma^{k})_{k\in\mathbb{N}}$ are set as in Assumption 6, then the sequence $(\operatorname{col}(\boldsymbol{x}^{k},\boldsymbol{\lambda}^{k}))_{k\in\mathbb{N}}$ generated by Algorithm 2 converges to some $\operatorname{col}(\boldsymbol{x}^{*},\boldsymbol{\lambda}^{*})\in\operatorname{zer}(T)$ , where $\boldsymbol{x}^{*}$ is a v-GNE of the game in (4). $\square$

Proof:

See Appendix -C. ∎

Remark 5 (Algorithm 2 as a fixed-point iteration)

Our convergence analysis is based on the same operator-theoretic framework as in [10], [26]. Specifically, we recast the dynamics generated by Algorithm 2 as the fixed-point iteration

[TABLE]

where $\boldsymbol{\omega}^{k}=\operatorname{col}(\boldsymbol{x}^{k},\boldsymbol{\lambda}^{k})$ is the stacked vector of the iterates and $R$ is the so-called pFB operator, defined as

[TABLE]

where $T_{1}$ , $T_{2}$ in (15)-(16) characterize the splitting of $T$ , and $\Phi$ is the so-called preconditioning matrix, here chosen as

[TABLE]

with $\alpha_{\text{d}}:=\operatorname{diag}(\alpha_{1},\ldots,\alpha_{N})\otimes I_{n}$ and $\beta_{\text{d}}:=\operatorname{diag}(\beta_{1},\ldots,\beta_{N})\otimes I_{m}$ . Then, we show that, if the step sizes in the main diagonal of $\Phi$ are set according to Assumption 6, the mapping $R$ is averaged with respect to the $\Phi$ -induced norm, i.e., $\|\cdot\|_{\Phi}$ . Hence, the fixed-point iteration (18) converges to some $\boldsymbol{\omega}^{*}:=\operatorname{col}(\boldsymbol{x}^{*},\boldsymbol{\lambda}^{*})\in\mathrm{fix}(R)=\operatorname{zer}(T)$ , where $\boldsymbol{x}^{*}$ is a v-GNE. See Appendix -C for a complete convergence analysis. $\square$

To conclude this section, we note that the projected-pseudo-gradient updates in Algorithm 2 can be cast compactly as

[TABLE]

where

[TABLE]

and $C_{\text{d}}:=\textrm{blkdiag}(C_{1},\ldots,C_{N})$ .

Unlike Algorithm 1, Algorithm 2 does not directly rely on the actions of a central coordinator, namely, dual update and broadcast communication. However, it requires an all-to-all information exchange (or, equivalently, a complete communication graph) at each iteration $k$ , since the local updating rule of each agent necessitates the knowledge of:

(i) the average strategy $\bar{x}^{k}$ ;

(ii) the average dual variable $\bar{\lambda}^{k}$ ;

(iii) the average quantity $\bar{d}^{k}$ .

III-B A fully-distributed algorithm via dynamic tracking

To implement Algorithm 2 fully-distributively under the more realistic communication assumptions in Section II-A, we approximate its updates by endowing each agent $i$ with some surrogate variables (or estimates), i.e., $\sigma_{i}$ , $y_{i}$ and $z_{i}$ , that dynamically track the averages $\bar{x}^{k}$ , $\bar{d}^{k}$ and $\bar{\lambda}^{k}$ , respectively. Then, to mitigate the errors due to the inexactness of the surrogate variables, we relax the projected-pseudo-gradient iterations by means of a Krasnosel’skii–Mann (KM) process [11, eq.(5.12)], whose step-sizes are set according to the following design choice.

Assumption 7

The sequence $(\gamma^{k})_{k\in\mathbb{N}}$ satisfies the following conditions:

(i) (non-increasing) $0\leq\gamma^{k+1}\leq\gamma^{k}\leq 1$ , for all $k\geq 0$ ;

(ii) (non-summable) $\sum_{k=0}^{\infty}\gamma^{k}=\infty$ ;

(iii) (square-summable) $\sum_{k=0}^{\infty}{(\gamma^{k})}^{2}<\infty$ . $\square$

For example, Assumption 7 is satisfied for step sizes of the form $\gamma^{k}=(k+1)^{-b}$ where $\frac{1}{2}<b\leq 1$ .
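A quick numerical look at this family of step sizes is sketched below, with a hypothetical helper name and the arbitrary choice $b=0.75$. Finite partial sums cannot prove divergence or convergence, so the numbers only illustrate the different growth of the two series.

```python
def km_step_sizes(b, n):
    """First n relaxation steps gamma^k = (k + 1)^(-b), k = 0, ..., n - 1."""
    if not 0.5 < b <= 1.0:
        raise ValueError("this family satisfies Assumption 7 only for 1/2 < b <= 1")
    return [(k + 1) ** (-b) for k in range(n)]


gammas = km_step_sizes(b=0.75, n=100_000)
partial_sum = sum(gammas)                      # keeps growing with n (non-summable)
partial_sum_sq = sum(g * g for g in gammas)    # bounded by the p-series with p = 1.5
```

For $b=0.75$ the squared steps form a $p$-series with $p=1.5>1$, so their sum stays bounded, while the steps themselves form a $p$-series with $p=0.75\leq 1$, which diverges.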

The proposed algorithm relies on agents constructing an estimate of the averages by mixing information drawn from local neighbors and making a subsequent relaxed projected-pseudo-gradient step, as in Algorithm 2. To build the estimates $\sigma_{i}$ , $y_{i}$ , $z_{i}$ , at every iteration $k$ , agent $i$ receives $\sigma_{j}^{k}$ ’s, $y_{j}^{k}$ ’s, $z_{j}^{k}$ ’s from its neighbors, $j\in\mathcal{N}_{i}(k)$ , and aligns its intermediate estimates according to the following rules:

[TABLE]

Then, on the basis of $\hat{\sigma}^{k}_{i}$ , $\hat{y}^{k}_{i}$ and $\hat{z}^{k}_{i}$ , agent $i$ updates its strategy $x_{i}^{k+1}$ , its dual variable $\lambda_{i}^{k+1}$ and the new estimates $\sigma_{i}^{k+1},y_{i}^{k+1},z_{i}^{k+1}$ as formalized in Algorithm 3.

Note that the projected-pseudo-gradient updates in Algorithm 3 can be recast in a compact form as

[TABLE]

where

[TABLE]

and $W_{\ell}(k):=W(k)\otimes I_{\ell}$ for some $\ell\in\mathbb{N}$ .

IV Convergence analysis

To prove the convergence of Algorithm 3, we rely on the framework of the inexact Krasnosel’skii–Mann fixed-point iteration [19, Alg. 5.4]. Informally speaking, our goal is to show that the error due to the inexactness of the estimates $\sigma_{i}$ , $y_{i}$ and $z_{i}$ vanishes fast enough, in which case the sequence $(\boldsymbol{x}^{k})_{k\in\mathbb{N}}$ generated by Algorithm 3 also converges globally to a v-GNE. Technically, we aim at exploiting [19, Th. 5.5], which establishes convergence of an inexact version of the KM iteration in (18), i.e.,

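$\boldsymbol{\omega}^{k+1}=\boldsymbol{\omega}^{k}+\gamma^{k}\left(R(\boldsymbol{\omega}^{k})+e^{k}-\boldsymbol{\omega}^{k}\right)\qquad(25)$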

when $R$ is nonexpansive and the step-size and error sequences, $(\gamma^{k})_{k\in\mathbb{N}}$ and $(e^{k})_{k\in\mathbb{N}}$ , respectively, satisfy

(C.1) $\sum_{k=0}^{\infty}\gamma^{k}(1-\gamma^{k})=\infty$ ;

(C.2) $\sum_{k=0}^{\infty}\gamma^{k}\left\|e^{k}\right\|<\infty$ .
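The role of (C.1) and (C.2) can be seen on a toy problem that is unrelated to the game itself. The sketch below, under purely illustrative assumptions, runs the relaxed iteration $\omega\leftarrow\omega+\gamma^{k}(R(\omega)+e^{k}-\omega)$ with $R$ a rotation of the plane by 90 degrees (nonexpansive, not contractive, unique fixed point at the origin), step sizes $\gamma^{k}=(k+1)^{-0.75}$ and a perturbation $e^{k}$ of size proportional to $1/(k+1)$, so that $\gamma^{k}\|e^{k}\|$ is summable.

```python
import math


def rotate90(w):
    """A nonexpansive (but not contractive) map on R^2 whose only fixed point is 0."""
    return (-w[1], w[0])


def inexact_km(w0, steps, b=0.75):
    """Run w <- w + gamma_k * (R(w) + e_k - w) with gamma_k = (k + 1)^(-b)."""
    w = w0
    weighted_error_sum = 0.0
    for k in range(steps):
        gamma = (k + 1) ** (-b)
        e = (1.0 / (k + 1), 1.0 / (k + 1))          # vanishing perturbation
        weighted_error_sum += gamma * math.hypot(*e)
        r = rotate90(w)
        w = (w[0] + gamma * (r[0] + e[0] - w[0]),
             w[1] + gamma * (r[1] + e[1] - w[1]))
    return w, weighted_error_sum


w_final, c2_sum = inexact_km((1.0, 0.0), steps=5000)
distance_to_fixed_point = math.hypot(*w_final)
```

Without relaxation, i.e., with $\gamma^{k}=1$ and $e^{k}=0$, the plain iteration $\omega\leftarrow R(\omega)$ would rotate forever on a circle; the relaxation is what drives the iterate towards the fixed point in this example.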

Note that Algorithm 3 can be written as the KM iteration with errors in (25), where $\boldsymbol{\omega}^{k}=\operatorname{col}(\boldsymbol{x}^{k},\boldsymbol{\lambda}^{k})$ and the error at stage $k$ is

[TABLE]

where $\tilde{\boldsymbol{x}}^{k}_{\text{A2}}$ and $\tilde{\boldsymbol{\lambda}}^{k}_{\text{A2}}$ denote the iterates generated by Algorithm 2 (defined in (21) and (22), respectively). In other words, $e^{k}$ represents the distance between the iterates in the ideal case of full-decision information (i.e., where the agents have an exact knowledge of the averages $\bar{x}^{k}$ , $\bar{d}^{k}$ and $\bar{\lambda}^{k}$ ) and the iterates of Algorithm 3, in which the averages are replaced by the estimates $\hat{\sigma}_{i}^{k}$ , $\hat{y}_{i}^{k}$ and $\hat{z}_{i}^{k}$ , built on-line by mixing information drawn from local neighboring agents only.

The main technical challenge to invoke [19, Th. 5.5] and, in turn, prove the convergence of Algorithm 3 is to find a step-size sequence $(\gamma^{k})_{k\in\mathbb{N}}$ that complies with (C.1) and such that the relaxed error sequence $(\gamma^{k}\|e^{k}\|)_{k\in\mathbb{N}}$ satisfies (C.2). We immediately note that if $(\gamma^{k})_{k\in\mathbb{N}}$ is chosen as in Assumption 7, then it already satisfies (C.1), since $\sum_{k=0}^{\infty}\gamma^{k}(1-\gamma^{k})=\sum_{k=0}^{\infty}\gamma^{k}-\sum_{k=0}^{\infty}(\gamma^{k})^{2}$ , where the first series diverges and the second one converges. In the following subsection, we show that (C.2) is also satisfied.

IV-A Analysis of the relaxed error sequence

In the next lemma, we recall a fundamental invariance property of dynamic tracking, namely, at each stage $k$ , the averages of the estimates $\sigma^{k}_{i}$ , $y^{k}_{i}$ and $z^{k}_{i}$ are equal to the corresponding averages we aim to track.

Lemma 4

Let Assumption 3 hold true and set the initial conditions $\sigma^{0}_{i},y_{i}^{0},z_{i}^{0}$ as in Algorithm 3, for all $i\in\mathcal{I}$ . Then, the following equations hold for all $k\geq 0$ :

(i) $\bar{\sigma}^{k}=\frac{1}{N}\sum_{i=1}^{N}\sigma_{i}^{k}=\bar{x}^{k}$ ;

(ii) $\bar{y}^{k}=\frac{1}{N}\sum_{i=1}^{N}y_{i}^{k}=\bar{d}^{k}$ ;

(iii) $\bar{z}^{k}=\frac{1}{N}\sum_{i=1}^{N}z_{i}^{k}=\bar{\lambda}^{k}$ . $\square$

Proof:

See Appendix -D. ∎
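The invariance in Lemma 4 (i) can be checked numerically on a small example. Since the update rules of Algorithm 3 are not reproduced in this text, the sketch below assumes the usual dynamic-tracking form $\sigma_{i}^{k+1}=\sum_{j}w_{ij}(k)\sigma_{j}^{k}+x_{i}^{k+1}-x_{i}^{k}$ with $\sigma_{i}^{0}=x_{i}^{0}$; scalar decisions, the two mixing matrices and the random decision sequence are all hypothetical.

```python
import random


def mix(W, v):
    """One consensus step: returns W v for a list-of-rows matrix W."""
    return [sum(W[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]


def track_average(W_seq, x_seq):
    """Dynamic tracking of the running average of x (scalar decisions for brevity).

    Assumed update: sigma_i^{k+1} = sum_j w_ij(k) sigma_j^k + x_i^{k+1} - x_i^k,
    initialised with sigma_i^0 = x_i^0.
    """
    sigma = list(x_seq[0])
    history = [sigma]
    for k in range(len(x_seq) - 1):
        mixed = mix(W_seq[k % len(W_seq)], sigma)
        sigma = [mixed[i] + x_seq[k + 1][i] - x_seq[k][i]
                 for i in range(len(sigma))]
        history.append(sigma)
    return history


# Two doubly stochastic matrices used alternately (a time-varying network).
W_a = [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
W_b = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]

rng = random.Random(0)
x_seq = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(50)]
sigma_seq = track_average([W_a, W_b], x_seq)
gaps = [abs(sum(s) / 3 - sum(x) / 3) for s, x in zip(sigma_seq, x_seq)]
```

The average is preserved because each mixing matrix has unit column sums, so mixing does not change the sum of the estimates, and the correction term adds exactly the change in the sum of the decisions.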

The following assumption on the dual sequences generated by Algorithm 3 is instrumental for the subsequent lemma.

Assumption 8

The sequence $(\boldsymbol{\lambda}^{k})_{k\in\mathbb{N}}$ generated by Algorithm 3 is bounded, i.e., there exists $B_{D}>0$ such that $\|\boldsymbol{\lambda}^{k}\|\leq B_{D}$ , for all $k\geq 0$ . $\square$

For example, in the context of distributed constrained optimization, Assumption 8 can be enforced by changing the local dual updates by projecting onto a local bounded set $D_{\text{i}}$ that contains the optimal dual set $D^{*}$ [29], [30]. See Section II-C for a discussion on how to locally build such supersets.
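For the box-shaped supersets $D_{\text{i}}$ of Section II-C, the projection has a closed form: each entry is clipped to the interval $[0,B_{D^{*}}+r]$. The sketch below illustrates this with a hypothetical function name and made-up values for the estimate of $B_{D^{*}}$ and the margin $r$.

```python
def project_onto_dual_box(mu, bound):
    """Euclidean projection onto {mu >= 0, ||mu||_inf <= bound}.

    The set is a box, so the projection clips each entry to [0, bound].
    """
    if bound <= 0:
        raise ValueError("bound must be positive")
    return [min(max(v, 0.0), bound) for v in mu]


B_est, r = 10.0, 5.0          # hypothetical estimate of B_{D*} and margin r > 0
projected = project_onto_dual_box([-2.0, 3.5, 40.0], B_est + r)
```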

The next lemma provides upper bounds for the estimation errors at each stage $k$ of Algorithm 3.

Lemma 5

Let Assumptions 1-3, 8 hold true. Then, there exist some positive constants $B_{\Omega}$ , $B_{D}$ , $B_{Y}$ , $\delta_{1}$ and $\delta_{2}$ and a vanishing scalar sequence $(\phi^{k})_{k\in\mathbb{N}}$ defined as

[TABLE]

with $\rho$ as in (7) and $(\gamma^{k})_{k\in\mathbb{N}}$ as in Assumption 7, such that the following upper bounds hold for all $k\in\mathbb{N}$ :

(i) $\|\hat{\boldsymbol{\sigma}}^{k}-\mathbf{1}\otimes\bar{x}^{k}\|\leq\theta B_{\Omega}\rho^{k}+\theta B_{\Omega}\sum_{s=1}^{k}\rho^{k-s}\gamma^{s-1}$ ;

(ii) $\|\hat{\boldsymbol{z}}^{k}-\mathbf{1}\otimes\bar{\lambda}^{k}\|\leq\theta B_{D}\rho^{k}+\theta B_{D}\sum_{s=1}^{k}\rho^{k-s}\gamma^{s-1}$ ;

(iii) $\|\boldsymbol{y}^{k+1}-\mathbf{1}\otimes\bar{d}^{k}\|\leq\theta B_{Y}\rho^{k}+\sum_{s=1}^{k}\rho^{k-s}\phi^{s-1}+\phi^{k}$ .

Proof:

See Appendix -E. ∎

By exploiting the upper bounds in Lemma 5 and a result on the convergence of scalar sequences, which is recalled next, we can show that the estimates asymptotically converge to the corresponding true aggregate values.

Lemma 6 ([31, Lemma 3.1])

Let $(\delta^{k})_{k\in\mathbb{N}}$ be a sequence.

(a) If $\lim_{k\rightarrow\infty}\delta^{k}=\delta$ and $0<\tau<1$ , then $\lim_{k\rightarrow\infty}\sum_{\ell=0}^{k}\tau^{k-\ell}\delta^{\ell}=\delta/(1-\tau)$ .

(b) If $\delta^{k}\geq 0$ for all $k$ , $\sum_{k=0}^{\infty}\delta^{k}<\infty$ and $0<\tau<1$ , then $\sum_{k=0}^{\infty}\sum_{\ell=0}^{k}\tau^{k-\ell}\delta^{\ell}<\infty$ . $\square$

Proposition 3

Let Assumptions 1-3, 8 hold true. Then, the following statements hold:

(i) $\lim_{k\rightarrow\infty}\|\hat{\boldsymbol{\sigma}}^{k}-\mathbf{1}\otimes\bar{x}^{k}\|=0$ ;

(ii) $\lim_{k\rightarrow\infty}\|\hat{\boldsymbol{z}}^{k}-\mathbf{1}\otimes\bar{\lambda}^{k}\|=0$ ;

(iii) $\lim_{k\rightarrow\infty}\|\boldsymbol{y}^{k+1}-\mathbf{1}\otimes\bar{d}^{k}\|=0$ . $\square$

Proof:

(i) From the upper bound in Lemma 5 (i), we have

[TABLE]

where $\lim_{k\rightarrow\infty}\rho^{k}=0$ , since $0<\rho<1$ by Lemma 1, and $\lim_{k\rightarrow\infty}\sum_{s=1}^{k}\rho^{k-s}\gamma^{s-1}=0$ by Lemma 6 (a), since $0<\rho<1$ and $\lim_{k\rightarrow\infty}\gamma^{k}=0$ by Assumption 7. Hence, $\lim_{k\rightarrow\infty}\|\hat{\boldsymbol{\sigma}}^{k}-\mathbf{1}\otimes\bar{x}^{k}\|=0$ . The proofs of (ii) and (iii) are analogous. ∎
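The decay of the bound used in this proof can also be observed numerically. The sketch below evaluates $c_{k}=\rho^{k}+\sum_{s=1}^{k}\rho^{k-s}\gamma^{s-1}$ through the recursion $\sum_{s=1}^{k+1}\rho^{k+1-s}\gamma^{s-1}=\rho\sum_{s=1}^{k}\rho^{k-s}\gamma^{s-1}+\gamma^{k}$; the constant factor $\theta B_{\Omega}$ is omitted, and the values $\rho=0.9$ and $\gamma^{k}=(k+1)^{-1}$ are illustrative assumptions.

```python
def tracking_bound(rho, gammas):
    """Return c_k = rho^k + sum_{s=1}^k rho^(k-s) * gamma^(s-1), k = 0, ..., len(gammas)."""
    power, conv = 1.0, 0.0           # rho^k and the convolution term at k = 0
    bounds = [power + conv]
    for g in gammas:                 # step from k to k + 1, with g = gamma^k
        conv = rho * conv + g        # recursion for the convolution term
        power *= rho
        bounds.append(power + conv)
    return bounds


rho = 0.9
gammas = [1.0 / (k + 1) for k in range(10_000)]
bounds = tracking_bound(rho, gammas)
final_bound = bounds[-1]
```

For large $k$ the bound behaves roughly like $\gamma^{k}/(1-\rho)$, so it vanishes at the rate of the step sizes rather than geometrically.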

Next, we derive an upper bound for the error $e^{k}$ in (26) that directly depends on the estimation errors in Lemma 5.

Lemma 7

Let Assumptions 1-3, 8 hold true. Then, the following bound holds for all $k\in\mathbb{N}$ :

[TABLE]

Proof:

See Appendix -F. ∎

Finally, by combining the upper bounds in Lemmas 5 and 7 and exploiting a result on the convergence of scalar sequences, i.e., Lemma 6 (b), we show that condition (C.2) holds, namely, the relaxed error sequence $(\gamma^{k}\|e^{k}\|)_{k\in\mathbb{N}}$ is summable.

Lemma 8

Let Assumptions 1-3, 8 hold true. The sequence $(\gamma^{k}\|e^{k}\|)_{k\in\mathbb{N}}$ , with $e^{k}$ as in (26), is summable, i.e.,

$\sum_{k=0}^{\infty}\gamma^{k}\|e^{k}\|<\infty.$

Proof:

See Appendix -G. ∎

Now, we can prove the convergence of Algorithm 3.

Theorem 2

Let Assumptions 1-5, 8 hold true, the step sizes $\{\alpha_{i},\beta_{i}\}_{i\in\mathcal{I}}$ be set as in Assumption 6, and $(\gamma^{k})_{k\in\mathbb{N}}$ as in Assumption 7. Then, the sequence $(\operatorname{col}(\boldsymbol{x}^{k},\boldsymbol{\lambda}^{k}))_{k\in\mathbb{N}}$ generated by Algorithm 3 globally converges to some $\operatorname{col}(\boldsymbol{x}^{*},\boldsymbol{\lambda}^{*})\in\operatorname{zer}(T)$ , where $\boldsymbol{x}^{*}$ is a v-GNE of the game in (4). $\square$

Proof:

For all $k\in\mathbb{N}$ , the iterations of Algorithm 3 can be cast as the Krasnosel’skii–Mann process with errors $\boldsymbol{\omega}^{k+1}=\boldsymbol{\omega}^{k}+\gamma^{k}(R(\boldsymbol{\omega}^{k})+e^{k}-\boldsymbol{\omega}^{k})$ , where $\boldsymbol{\omega}^{k}=\operatorname{col}(\boldsymbol{x}^{k},\boldsymbol{\lambda}^{k})$ , $R$ as in (19) and $e^{k}$ as in (26). By [19, Th. 5.5], the sequence $(\boldsymbol{\omega}^{k})_{k\in\mathbb{N}}$ converges to some $\boldsymbol{\omega}^{*}\in\mathrm{fix}(R)$ , since $R$ is averaged, thus nonexpansive, by Lemma 11, and (C.1) $-$ (C.2) hold, by Assumption 7 and Lemma 8, respectively. To conclude, we note that $\boldsymbol{\omega}^{*}\in\mathrm{fix}(R)=\operatorname{zer}(\Phi^{-1}T_{1}+\Phi^{-1}T_{2})$ , by [11, Prop. 25.1 (iv)], and that $\operatorname{zer}(\Phi^{-1}T_{1}+\Phi^{-1}T_{2})=\operatorname{zer}(T)\neq\varnothing$ , with $T$ as in (14), since $\Phi\succ 0$ , by Lemma 9, and $T_{1}+T_{2}=T$ . Since $\boldsymbol{\omega}^{*}\in\operatorname{zer}(T)$ , then $\boldsymbol{x}^{*}$ is a v-GNE of the game in (4), by Proposition 2 (ii). ∎

V Numerical simulations

In this section, we study the performance of the proposed algorithm on a class of network Nash–Cournot games with market capacity constraints. Such games represent an instance of generalized aggregative Nash games. In Section V-A, we describe the player cost functions and strategy sets and verify that the necessary assumptions are satisfied. In Section V-B, we compare the performance of our algorithm against a standard semi-decentralized method (Algorithm 1).

V-A Generalized network Nash–Cournot game

We extend the network Nash–Cournot game model proposed in [14, §IV] with additional market capacity constraints. Specifically, consider $N$ firms that compete over $m$ markets. Let firm $i$ ’s production and sales at location $l$ be denoted by $g_{i,l}$ and $s_{i,l}$ , respectively, while its cost of production at location $l$ is denoted by $f_{i,l}(g_{i,l})$ and defined as follows:

[TABLE]

where $a_{i,l}$ and $b_{i,l}$ are scaling parameters for agent $i$ .

The goods sold by firm $i$ at location $l$ fetch a revenue $p(\bar{s}_{l})s_{i,l}$ , where $p(\bar{s}_{l})$ denotes the sales price at location $l$ and $\bar{s}_{l}=\sum_{i=1}^{N}s_{i,l}$ represents the aggregate sales at location $l$ . The market price is set according to an inverse demand function which depends on the aggregate of the network, i.e.,

[TABLE]

where $d_{l}$ is the overall demand for location $l$ . Each firm $i$ has a production limitation at location $l$ , described by $u_{i,l}$ . Moreover, the overall production in each market $l$ must meet the corresponding demand $d_{l}$ and must not exceed a maximum capacity $r_{l}$ . Hence, the coupling constraints $d_{l}\leq\sum_{i=1}^{N}g_{i,l}\leq r_{l}$ , for all $l=1,2,\ldots,m$ , have to be satisfied.

Overall, each firm $i$ , given the strategies of the other firms, aims at solving the following optimization problem:

[TABLE]

Effectively, the payoff function of firm $i$ is parametrized by nodal aggregate sales and its constraints depend on the other firms’ strategies, thus leading to a generalized aggregative game. In this example, we assume that the firms communicate over a dynamic network to cope with the lack of aggregate information, which is necessary to compute their optimal production and sale strategies.

Next, we show that the proposed network Nash–Cournot game does satisfy our technical setup. Let $x_{i}=\operatorname{col}(g_{i,1},\ldots,g_{i,m},s_{i,1},\ldots,s_{i,m})\in\mathbb{R}^{2m}$ denote the strategy vector of agent $i$ and $\boldsymbol{x}=\operatorname{col}(x_{1},\ldots,x_{N})$ denote the collective strategy profile. The cost function of agent $i$ is quadratic, convex in $x_{i}$ , continuously differentiable and can be cast in a compact form as

[TABLE]

where $A_{i}:=\operatorname{diag}(a_{i,1},\ldots,a_{i,m},0,\ldots,0)$ , $\Delta=\operatorname{diag}(\mathbf{0},I_{m})$ and $b_{i}:=\operatorname{col}(b_{i,1},\ldots,b_{i,m},-d_{1},\ldots,-d_{m})$ . The local feasible set of firm $i$ is non-empty (for an adequate choice of the $u_{i,l}$ ), convex and compact, and reads as $\Omega_{i}:=\{x_{i}\in\mathbb{R}^{2m}\,|\,\sum_{l=1}^{m}g_{i,l}\geq\sum_{l=1}^{m}s_{i,l},\;g_{i,l},s_{i,l}\geq 0,\;g_{i,l}\leq u_{i,l},\;l=1,\ldots,m\}$ .

The coupling constraints are affine and can be written in compact form as in (4), with $C_{i}=\left[\begin{smallmatrix}I_{m}&\mathbf{0}\\ -I_{m}&\mathbf{0}\end{smallmatrix}\right]$ , which selects the production components of $x_{i}=\operatorname{col}(g_{i,1},\ldots,g_{i,m},s_{i,1},\ldots,s_{i,m})$ , and $c_{i}=\frac{1}{N}\operatorname{col}(r_{1},\ldots,r_{m},-d_{1},\ldots,-d_{m})$ , for all $i\in\mathcal{I}$ . Thus, Assumption 1 is satisfied.

The pseudo gradient mapping $F$ is affine and reads as

[TABLE]

with

[TABLE]

$A=\textrm{blkdiag}(A_{1},\ldots,A_{N})$ and $b=\operatorname{col}(b_{1},\ldots,b_{N})$ . By a direct inspection of the eigenvalues of $P$ , we can show that $F$ is strongly monotone and Lipschitz continuous when the coefficients $a_{i,l}$ are positive. Hence, Assumption 4 is satisfied. In particular, it follows by [27, p.79] that $F$ is $\chi-$ cocoercive with $\chi:=\|P\|^{-1}$ . Moreover, since $F$ is strongly monotone and the sets $\Omega_{i}$ are compact, it follows by Remark 2 that there exists a unique v-GNE. The extended pseudo-gradient mapping $\boldsymbol{F}$ is affine and reads as

[TABLE]

Similarly, it can be shown that $\boldsymbol{F}$ is $L_{\boldsymbol{F}}-$ Lipschitz continuous with $L_{\boldsymbol{F}}:=\max_{ij}\{a_{i,j},1\}$ . Thus, Assumption 5 is satisfied.

V-B Simulation studies

In our numerical study, we consider a network Nash–Cournot game played by $20$ firms, i.e., $N=20$ , over $10$ markets, i.e., $m=10$ . All the parameters of the game are drawn from uniform distributions and kept fixed over the entire simulation. Specifically, for all $i\in\mathcal{I}$ and $l\in\{1,\ldots,m\}$ , we set the parameters of the production cost in (28) as $a_{i,l}\in\mathcal{U}(2,3)$ and $b_{i,l}\in\mathcal{U}(2,12)$ , where $\mathcal{U}(t,\tau)$ denotes the uniform distribution over an interval $[t,\tau]$ with $t<\tau$ . We set the production capacities of firm $i$ as $u_{i,l}\in\mathcal{U}(50,100)$ for all $l\in\{1,\ldots,m\}$ and for all $i\in\mathcal{I}$ . Moreover, the demand at market $l$ is set as $d_{l}\in\mathcal{U}(90,100)$ , and the market capacity as $r_{l}\in\mathcal{U}(d_{l},2d_{l})$ , for all $l\in\{1,\ldots,m\}$ .
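The parameter setup described above can be reproduced along the following lines. This is a sketch using Python's standard `random` module; the function name, the dictionary layout and the seed are hypothetical and do not reproduce the authors' actual draws.

```python
import random


def sample_cournot_parameters(N=20, m=10, seed=0):
    """Draw the game data of Section V-B from the stated uniform distributions."""
    rng = random.Random(seed)
    a = [[rng.uniform(2.0, 3.0) for _ in range(m)] for _ in range(N)]      # a_{i,l}
    b = [[rng.uniform(2.0, 12.0) for _ in range(m)] for _ in range(N)]     # b_{i,l}
    u = [[rng.uniform(50.0, 100.0) for _ in range(m)] for _ in range(N)]   # u_{i,l}
    d = [rng.uniform(90.0, 100.0) for _ in range(m)]                       # d_l
    r = [rng.uniform(d_l, 2.0 * d_l) for d_l in d]                         # r_l
    return {"a": a, "b": b, "u": u, "d": d, "r": r}


params = sample_cournot_parameters()
```

With these ranges the total production capacity in each market is at least $20\cdot 50=1000$, well above the largest possible demand of $100$, so the lower coupling constraint can always be met.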

At each iteration $k$ , the firms communicate over a randomly generated, connected small-world graph in which each node has 4 neighbors. To create a doubly stochastic mixing matrix $W(k)$ , we exploit the Metropolis weighting rules in (5). Thus, Assumptions 2 and 3 are satisfied. The agents update their decisions and their estimates as in Algorithm 3. The step sizes $\{\alpha_{i},\beta_{i}\}_{i\in\mathcal{I}}$ are set according to Assumption 6, where the global parameter $\tau$ is set $5\%$ larger than the theoretical lower bound $\frac{1}{2\delta}$ , with $\delta=\min\{1,\|P\|^{-1}\}$ and $P$ as in (31).

In Figure 1, we show the trajectories of the sequences of normalized residuals $\|\boldsymbol{x}^{k}-\boldsymbol{x}^{*}\|/\|\boldsymbol{x}^{0}-\boldsymbol{x}^{*}\|$ for different choices of the step-size sequence $(\gamma^{k})_{k\in\mathbb{N}}$ . Moreover, we compare the trajectories of Algorithm 3 with those obtained with Algorithm 1 [26, Alg. 1], which is a semi-decentralized algorithm and works under the assumption of full-decision information, i.e., the firms have access to the real aggregate information at each stage $k$ of the algorithm. As expected, the semi-decentralized algorithm converges faster than the fully-distributed counterpart. Interestingly, we notice that convergence is achieved also in the case of fixed relaxation step in the KM process, e.g. $\gamma^{k}=1$ for all $k\geq 0$ , which is not supported by our theoretical analysis.

In Figure 2, we compare the trajectories of the consensus disagreement of the dual variables $\|(\L\otimes I_{m})\boldsymbol{\lambda}^{k}\|$ for two choices of the step-size sequence $(\gamma^{k})_{k\in\mathbb{N}}$ .

VI Conclusion

For a general class of aggregative games with linear coupling constraints over time-varying communication networks, we have designed the first single-layer, fully-distributed algorithm to compute a variational generalized Nash equilibrium. Global convergence can be established via monotone-operator-theoretic and fixed-point arguments, integrated with a dynamic tracking methodology.

The analysis approach in this paper is genuinely novel, hence opens up a number of new research directions. Motivated by the numerical results of Section V, it would be valuable to explore the computational aspects of the proposed method, e.g. how the connectivity of the communication networks influences the convergence speed. Whether or not the proposed algorithm converges with fixed step sizes in the Krasnosel’skii-Mann process is currently an open question. Finally, it would be highly valuable to relax the assumption of double-stochasticity of the mixing matrices.

-A Proof of Lemma 3

(i) $T_{2}$ is the sum of two terms: $S$ in (20), which is a linear, skew-symmetric mapping, thus maximally monotone [11, Ex. 20.30]; and $\mathrm{N}_{\boldsymbol{\Omega}}\times\mathrm{N}_{\mathbb{R}^{mN}_{\geq 0}}$ , which is maximally monotone since it is the direct sum of maximally monotone operators [11, Prop. 20.23] (i.e., the normal cones of the closed convex sets $\boldsymbol{\Omega}$ and $\mathbb{R}^{mN}_{\geq 0}$ ). Hence, the maximal monotonicity of $S+\mathrm{N}_{\boldsymbol{\Omega}}\times\mathrm{N}_{\mathbb{R}^{mN}_{\geq 0}}=T_{2}$ follows by [11, Cor. 24.4 (i)] since $\operatorname{dom}(S)=\mathbb{R}^{(n+m)N}$ .

(ii) $F$ is $\chi-$ cocoercive, by Assumption 4, and $\L_{m}$ is $1-$ cocoercive by [24, p.79], since $\L_{m}$ is a linear, positive semi-definite mapping with $\|\L_{m}\|=1$ . It follows that the direct sum $T_{1}(\cdot)=F(\cdot)\times(\L_{m}\cdot+\frac{1}{N}c_{\text{f}})$ is $\delta-$ cocoercive, for all $\delta$ such that $0<\delta\leq\min\{1,\chi\}$ . Now, we show that $T_{1}$ is restricted-strictly monotone w.r.t. $\Theta^{\parallel}=\boldsymbol{\Omega}\times\boldsymbol{E}^{\parallel}$ . Let us recall that $\boldsymbol{E}^{\parallel}$ and $\boldsymbol{E}^{\perp}$ are the ( $m$ -dimensional) consensus and disagreement subspaces, respectively. Moreover, each vector $v\in\mathbb{R}^{m}$ , can be split as $v=v_{\parallel}+v_{\perp}$ , with $v_{\parallel}\in\boldsymbol{E}^{\parallel}$ and $v_{\perp}\in\boldsymbol{E}^{\perp}$ . Consider now $\boldsymbol{\omega}=\operatorname{col}(\boldsymbol{x},\boldsymbol{\lambda})\not\in\Theta^{\parallel}$ , hence $\boldsymbol{\lambda}=\boldsymbol{\lambda}_{\parallel}+\boldsymbol{\lambda}_{\perp}$ , with $\boldsymbol{\lambda}_{\parallel}\in\boldsymbol{E}^{\parallel}$ and $\mathbf{0}\neq\boldsymbol{\lambda}_{\perp}\in\boldsymbol{E}^{\perp}$ . Let $\boldsymbol{\omega}^{\prime}=\operatorname{col}(\boldsymbol{x}^{\prime},\boldsymbol{\lambda}^{\prime})\in\Theta^{\parallel}$ , hence $\boldsymbol{\lambda}^{\prime}=\boldsymbol{\lambda}^{\prime}_{\parallel}\in\boldsymbol{E}^{\parallel}$ and $\boldsymbol{\lambda}^{\prime}_{\perp}=\mathbf{0}$ . The following inequalities show that $T_{1}$ in (16) is restricted-strictly monotone w.r.t. $\Theta^{\parallel}$ :

[TABLE]

where $\L_{m}=(\L\otimes I_{m})$ , with $\L=I-\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}$ the projection onto the disagreement subspace and $\text{eig}_{2}(\L)=1$ its second smallest eigenvalue. The first inequality follows by the cocoercivity of $F$ (Assumption 4) and since $\L_{m}\boldsymbol{\lambda}_{\parallel}=\L_{m}\boldsymbol{\lambda}^{\prime}=0$ , namely, the projection of the consensual terms onto the disagreement subspace is zero.
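The two facts used here, that $I-\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}$ annihilates consensus vectors and acts as the identity on the disagreement component, can be checked numerically. The following is a minimal sketch for the scalar case $m=1$ with $N=4$; the helper names `disagreement_matrix` and `matvec` and the test vectors are illustrative assumptions, not part of the paper.

```python
# Minimal sketch (scalar case m = 1): the matrix I - (1/N) 1 1^T
# removes the mean of a vector, so it maps consensus vectors to zero
# and is idempotent (a projection).
N = 4  # number of agents, chosen only for illustration


def disagreement_matrix(n):
    """Return I - (1/n) * ones * ones^T as a list of rows."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]


def matvec(mat, vec):
    """Plain matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vec)) for row in mat]


L = disagreement_matrix(N)

consensus = [2.5] * N            # lies in the consensus subspace
mixed = [1.0, 3.0, -2.0, 6.0]    # generic vector with mean 2.0

proj_consensus = matvec(L, consensus)  # consensus part is annihilated
proj_mixed = matvec(L, mixed)          # equals mixed minus its mean
proj_twice = matvec(L, proj_mixed)     # idempotence: L(Lv) = Lv
```

Since the matrix is a projection, its eigenvalues are 0 (on the consensus subspace) and 1 (on the disagreement subspace), which agrees with $\text{eig}_{2}=1$ above.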

(iii): The maximal monotonicity of $T=T_{1}+T_{2}$ follows by [11, Cor. 24.4 (i)], since $T_{1}$ is cocoercive (thus maximally monotone [11, Example 20.31]), $T_{2}$ is maximally monotone and $\operatorname{dom}(T_{1})=\mathbb{R}^{(n+m)N}$ . Moreover, since $T_{1}$ is also restricted-strictly monotone with respect to $\Theta^{\parallel}$ , $T$ enjoys the same property. $\blacksquare$

-B Proof of Proposition 2

(i) By Proposition 1, there exists $\lambda^{*}\in\mathbb{R}^{m}_{\geq 0}$ such that $\operatorname{col}(\boldsymbol{x}^{*},\lambda^{*})\in\operatorname{zer}(U)$ , where $\boldsymbol{x}^{*}$ is a v-GNE. Define $\boldsymbol{\omega}^{*}=\operatorname{col}(\boldsymbol{x}^{*},\boldsymbol{\lambda}^{*})$ , with $\boldsymbol{\lambda}^{*}=\mathbf{1}_{N}\otimes\lambda^{*}$ ; then we have $T(\boldsymbol{\omega}^{*})\ni\mathbf{0}$ . In fact, each component of the first row block of $T(\boldsymbol{\omega}^{*})$ reads as $\mathrm{N}_{\Omega_{i}}({x}^{*}_{i})+\nabla_{x_{i}}J_{i}(x_{i}^{*},\bar{x}^{*})+C_{i}^{\top}\lambda^{*}\ni\mathbf{0}$ , while each component of the second row block of $T(\boldsymbol{\omega}^{*})$ reads as $\mathrm{N}_{\mathbb{R}^{m}_{\geq 0}}(\lambda^{*})-\frac{1}{N}(C\boldsymbol{x}^{*}-c)\ni\mathbf{0}$ , since $\mathrm{N}_{\mathbb{R}^{m}_{\geq 0}}(\lambda^{*})-(C\boldsymbol{x}^{*}-c)\ni\mathbf{0}$ and $\frac{1}{N}\mathrm{N}_{\mathbb{R}_{\geq 0}^{m}}=\mathrm{N}_{\mathbb{R}_{\geq 0}^{m}}$ . Hence, $\operatorname{zer}(T)\neq\varnothing$ .

(ii) From the first part of the proof, we know that there exists $\boldsymbol{\omega}^{*}\in\Theta^{\parallel}$ such that $\boldsymbol{\omega}^{*}\in\operatorname{zer}(T)$ . Now, we show that all the zeros of $T$ lie in $\Theta^{\parallel}$ . By contradiction, let $\boldsymbol{\omega}^{\prime}\in\operatorname{zer}(T)$ and assume $\boldsymbol{\omega}^{\prime}\notin\Theta^{\parallel}$ . Then, $\mathbf{0}\in T(\boldsymbol{\omega}^{*})$ , $\mathbf{0}\in T(\boldsymbol{\omega}^{\prime})$ and Lemma 3 (iii) yields $0=(\mathbf{0}-\mathbf{0})^{\top}(\boldsymbol{\omega}^{*}-\boldsymbol{\omega}^{\prime})>0$ , which is impossible. Therefore, $\boldsymbol{\omega}^{\prime}\in\Theta^{\parallel}$ , namely $\boldsymbol{\omega}^{\prime}=\operatorname{col}(\boldsymbol{x}^{\prime},\mathbf{1}\otimes\lambda^{\prime})$ . Now, by substituting $\boldsymbol{\omega}^{\prime}$ into $T$ (since $(\L\otimes I_{m})(\mathbf{1}\otimes\lambda^{\prime})=\mathbf{0}$ ) we recover that $\boldsymbol{\omega}^{\prime}\in\operatorname{zer}(T)\Rightarrow\operatorname{col}(\boldsymbol{x}^{\prime},\lambda^{\prime})\in\operatorname{zer}(U)$ , which, by Proposition 1, holds if and only if $\boldsymbol{x}^{\prime}$ is a v-GNE. $\blacksquare$

-C Proof of Theorem 1

To prove convergence of Algorithm 2, we follow the same technical reasoning as in the proof for [10, Alg. 1]. Specifically, the proof is divided into two parts, which show that:

(1) Algorithm 2 corresponds to the fixed-point iteration in (18), i.e., $\boldsymbol{\omega}^{k+1}=\boldsymbol{\omega}^{k}+\gamma^{k}(R(\boldsymbol{\omega}^{k})-\boldsymbol{\omega}^{k})$ , where $R:=(\mathrm{Id}+\Phi^{-1}T_{2})^{-1}\circ(\mathrm{Id}-\Phi^{-1}T_{1})$ is the so-called pFB operator.

(2) If the step sizes are set as in Assumption 6, then $R$ is an averaged operator. Hence, (18) globally converges to some $\boldsymbol{\omega}^{*}:=\operatorname{col}(\boldsymbol{x}^{*},\boldsymbol{\lambda}^{*})\in\mathrm{fix}(R)$ . Since $\mathrm{fix}(R)=\operatorname{zer}(T)$ , with $T$ as in (14), then $\boldsymbol{x}^{*}$ is a v-GNE, by Proposition 2.

(1): Let us recast Algorithm 2 in a compact form as

[TABLE]

Since $\mathrm{proj}_{\boldsymbol{\Omega}}=(\mathrm{Id}+\mathrm{N}_{\boldsymbol{\Omega}})^{-1}$ , $\boldsymbol{F}(\boldsymbol{x}^{k},\mathbf{1}\otimes\bar{x}^{k})=F(\boldsymbol{x}^{k})$ and $C^{\top}_{\text{d}}\bar{\boldsymbol{\lambda}}^{k}=C_{\text{d}}^{\top}(\mathbf{1}\otimes\bar{\lambda}^{k})=\frac{1}{N}C_{\text{f}}^{\top}\boldsymbol{\lambda}^{k}$ , it follows from (32) that $(\mathrm{Id}+\mathrm{N}_{\boldsymbol{\Omega}})(\tilde{\boldsymbol{x}}^{k})\ni\boldsymbol{x}^{k}-\alpha_{\text{d}}(F(\boldsymbol{x}^{k})+\frac{1}{N}C_{\text{f}}^{\top}\boldsymbol{\lambda}^{k})$ , which leads to

[TABLE]

where we used $\alpha_{\text{d}}^{-1}\mathrm{N}_{\boldsymbol{\Omega}}(\tilde{\boldsymbol{x}}^{k})=\mathrm{N}_{\boldsymbol{\Omega}}(\tilde{\boldsymbol{x}}^{k})$ . Similarly, since $\mathbf{1}\otimes\bar{d}^{k}=\frac{1}{N}(2C_{\text{f}}\tilde{\boldsymbol{x}}^{k}-C_{\text{f}}\boldsymbol{x}^{k}-c_{\text{f}})$ and $\boldsymbol{\lambda}^{k}-\bar{\boldsymbol{\lambda}}^{k}=((I-\frac{1}{N}\mathbf{1}\mathbf{1}^{\top})\otimes I_{m})\boldsymbol{\lambda}^{k}=(\L\otimes I_{m})\boldsymbol{\lambda}^{k}=\L_{m}\boldsymbol{\lambda}^{k}$ , it follows from (33) that $(\mathrm{Id}+\mathrm{N}_{\mathbb{R}^{mN}_{\geq 0}})(\tilde{\boldsymbol{\lambda}}^{k})\ni\boldsymbol{\lambda}^{k}+\beta_{\text{d}}(\frac{1}{N}(2C_{\text{f}}\tilde{\boldsymbol{x}}^{k}-C_{\text{f}}\boldsymbol{x}^{k}-c_{\text{f}})-\L_{m}\boldsymbol{\lambda}^{k})$ , which leads to

[TABLE]

Let $\boldsymbol{\omega}^{k}:=\operatorname{col}(\boldsymbol{x}^{k},\boldsymbol{\lambda}^{k})$ , then the inclusions in (36) $-$ (37) can be cast in compact form as

[TABLE]

where $T_{1}$ , $T_{2}$ and $\Phi$ are as in (15), (16) and (20), respectively. By making $\tilde{\boldsymbol{\omega}}^{k}$ explicit in (38), we obtain

[TABLE]

which corresponds to $\tilde{\boldsymbol{\omega}}^{k}=R(\boldsymbol{\omega}^{k})$ , where $R$ is the pFB operator in (19). Finally, it follows by (34) $-$ (35) that $\boldsymbol{\omega}^{k+1}=\boldsymbol{\omega}^{k}+\gamma^{k}(R(\boldsymbol{\omega}^{k})-\boldsymbol{\omega}^{k})$ , which concludes the proof.
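The relaxed fixed-point structure $\boldsymbol{\omega}^{k+1}=\boldsymbol{\omega}^{k}+\gamma^{k}(R(\boldsymbol{\omega}^{k})-\boldsymbol{\omega}^{k})$ can be illustrated on a toy problem. The sketch below is not Algorithm 2: it assumes an identity preconditioner, a quadratic cost with hypothetical minimizer `TARGET`, a box constraint, and a constant relaxation of 0.5, all chosen only to keep the example small. The operator `forward_backward` plays the role of $R$ (a gradient step followed by a projection).

```python
# Toy Krasnosel'skii-Mann iteration on a forward-backward operator.
# Assumptions (not from the paper): cost 0.5 * ||w - TARGET||^2,
# feasible set [0, 1]^2, step ALPHA, constant relaxation 0.5.
TARGET = [2.0, -0.5]
ALPHA = 0.5


def project_box(point, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n."""
    return [min(max(p, lo), hi) for p in point]


def forward_backward(point):
    """Gradient (forward) step followed by projection (backward) step."""
    forward = [p - ALPHA * (p - t) for p, t in zip(point, TARGET)]
    return project_box(forward)


def km_iterate(operator, start, relaxation, steps):
    """Run omega <- omega + gamma_k * (operator(omega) - omega)."""
    omega = list(start)
    for k in range(steps):
        image = operator(omega)
        gamma_k = relaxation(k)
        omega = [w + gamma_k * (r - w) for w, r in zip(omega, image)]
    return omega


omega_final = km_iterate(forward_backward, [0.0, 1.0], lambda k: 0.5, 100)

# Fixed-point residual ||R(omega) - omega||_inf at the last iterate.
residual = max(abs(r - w)
               for r, w in zip(forward_backward(omega_final), omega_final))
```

In this toy instance the iterates approach the projection of `TARGET` onto the box, which is the unique fixed point of `forward_backward`.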

(2): Next, we introduce some technical statements that we exploit later on in this proof.

Lemma 9

Let the step-sizes $\{\alpha_{i},\beta_{i}\}_{i\in\mathcal{I}}$ satisfy Assumption 6. Then the following statements hold:

(i) $\Phi-\tau I\succeq 0$ , with $\tau$ as in Assumption 6;

(ii) $\|\Phi^{-1}\|\leq\tau^{-1}$ . $\square$

Proof:

(i): By the generalized Gershgorin circle theorem [32, Th. 2], each eigenvalue $\mu$ of the matrix $\Phi$ in (20) satisfies at least one of the following inequalities:

[TABLE]

Hence, if we set the step-sizes $\alpha_{i},\beta_{i}$ as in Assumption 6, the inequalities (40)-(41) yield $\mu\geq\tau$ . It follows that the smallest eigenvalue of $\Phi$ , i.e., $\mu_{\min}(\Phi)$ , satisfies $\mu_{\min}(\Phi)\geq\tau>0$ . Thus, $\Phi-\tau I$ is positive semi-definite.

(ii): Let $\mu_{\max}(\Phi)$ be the largest eigenvalue of $\Phi$ . We have that $\mu_{\max}(\Phi)\geq\mu_{\min}(\Phi)\geq\tau$ . Moreover, $\|\Phi\|=\mu_{\max}(\Phi)\geq\mu_{\min}(\Phi)=\frac{1}{\|\Phi^{-1}\|}\geq\tau$ . Hence $\|\Phi^{-1}\|\leq\tau^{-1}$ . ∎
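The logic of Lemma 9 (a Gershgorin-type lower bound on the smallest eigenvalue, which in turn bounds $\|\Phi^{-1}\|$) can be sketched with the classical scalar Gershgorin theorem. The example below uses a hypothetical symmetric $2\times 2$ matrix `Phi_toy`, not the matrix $\Phi$ in (20), and the classical rather than the generalized (block) version of the theorem cited in the proof.

```python
import math


def gershgorin_lower_bound(mat):
    """Classical Gershgorin bound: every eigenvalue of a real symmetric
    matrix is at least min_i (a_ii - sum_{j != i} |a_ij|)."""
    return min(row[i] - sum(abs(v) for j, v in enumerate(row) if j != i)
               for i, row in enumerate(mat))


def min_eig_sym_2x2(mat):
    """Smallest eigenvalue of a symmetric 2x2 matrix in closed form."""
    (a, b), (_, d) = mat
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean - radius


# Hypothetical diagonally dominant matrix, for illustration only.
Phi_toy = [[4.0, -1.0],
           [-1.0, 3.0]]

tau_toy = gershgorin_lower_bound(Phi_toy)  # min(4 - 1, 3 - 1)
mu_min = min_eig_sym_2x2(Phi_toy)          # exact smallest eigenvalue
inv_norm = 1.0 / mu_min                    # ||Phi^{-1}|| for symmetric PD
```

For a symmetric positive definite matrix the spectral norm of the inverse equals $1/\mu_{\min}$, so a lower bound $\tau$ on $\mu_{\min}$ gives $\|\Phi^{-1}\|\leq\tau^{-1}$, as in item (ii).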

Lemma 10

Let Assumptions 1 and 4 hold and the step-sizes $\{\alpha_{i},\beta_{i}\}_{i\in\mathcal{I}}$ satisfy Assumption 6. The following properties hold in the $\Phi$ -induced norm (i.e., $\|\cdot\|_{\Phi}$ ):

(i) $\Phi^{-1}T_{1}$ is $\delta\tau$-cocoercive and $(\mathrm{Id}-\Phi^{-1}T_{1})$ is $\frac{1}{2\delta\tau}$-averaged;

(ii) $\Phi^{-1}T_{2}$ is maximally monotone and $(\mathrm{Id}+\Phi^{-1}T_{2})^{-1}$ is $\frac{1}{2}$-averaged. $\square$

Proof:

(i): Since $T_{1}$ is single-valued and $\Phi^{-1}$ nonsingular, by Lemma 9 (i), for each $\boldsymbol{\omega},\boldsymbol{\omega}^{\prime}\in\boldsymbol{\Omega}\times\mathbb{R}^{mN}_{\geq 0}$

[TABLE]

where the last inequality follows by Lemma 9 (ii). By (42) and the $\delta-$ cocoercivity of $T_{1}$ (Lemma 3 (ii))

[TABLE]

In other words, $\Phi^{-1}T_{1}$ is $\delta\tau-$ cocoercive in the $\Phi-$ induced norm. It follows from [11, Prop. 4.33] that $(\mathrm{Id}-\Phi^{-1}T_{1})$ is $\frac{1}{2\delta\tau}-$ averaged in the $\Phi-$ induced norm.

(ii): $\Phi^{-1}T_{2}$ is maximally monotone in the $\Phi$-induced norm, since $T_{2}$ is maximally monotone by Lemma 3 (i). By [11, Prop. 23.7], the resolvent mapping $(\mathrm{Id}+\Phi^{-1}T_{2})^{-1}$ is $\frac{1}{2}$-averaged (or firmly nonexpansive, see [11, Remark 4.24]) in the $\Phi$-induced norm, since $\Phi^{-1}T_{2}$ is maximally monotone in the same norm. ∎

Lemma 11

Let Assumptions 1, 4 hold and the step-sizes $\{\alpha_{i},\beta_{i}\}_{i\in\mathcal{I}}$ satisfy Assumption 6. Then, the pFB operator $R=(\mathrm{Id}+\Phi^{-1}T_{2})^{-1}\circ(\mathrm{Id}-\Phi^{-1}T_{1})$ is $\nu-$ averaged in the $\Phi-$ induced norm (i.e., $\|\cdot\|_{\Phi}$ ), with $\nu:=\frac{2\delta\tau}{4\delta\tau-1}\in(\frac{1}{2},1)$ . $\square$

Proof:

By [11, Proposition 4.4], the mapping $R$ is $\left(\frac{2\delta\tau}{4\delta\tau-1}\right)$-averaged with respect to $\|\cdot\|_{\Phi}$ , since it is the composition of $(\mathrm{Id}+\Phi^{-1}T_{2})^{-1}$ and $(\mathrm{Id}-\Phi^{-1}T_{1})$ , which are $\frac{1}{2}$- and $\frac{1}{2\delta\tau}$-averaged in $\|\cdot\|_{\Phi}$ , respectively, by Lemma 10. Moreover, $\frac{2\delta\tau}{4\delta\tau-1}\in(\frac{1}{2},1)$ , since $\tau>\frac{1}{2\delta}$ , by Assumption 6. ∎
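The constant $\nu$ can be cross-checked against the standard formula for the composition of an $a_{1}$-averaged and an $a_{2}$-averaged operator, $\frac{a_{1}+a_{2}-2a_{1}a_{2}}{1-a_{1}a_{2}}$. That this is the formula behind the cited proposition is an assumption here; the sketch only verifies, with exact rational arithmetic and a few sample values of the product $\delta\tau>\frac{1}{2}$, that it reproduces $\frac{2\delta\tau}{4\delta\tau-1}$ and that the result lies in $(\frac{1}{2},1)$.

```python
from fractions import Fraction


def composed_averagedness(a1, a2):
    """Averagedness constant of the composition of an a1-averaged and an
    a2-averaged operator (standard composition formula, assumed here)."""
    return (a1 + a2 - 2 * a1 * a2) / (1 - a1 * a2)


def nu(delta_tau):
    """Closed form stated in Lemma 11, as a function of delta * tau."""
    return 2 * delta_tau / (4 * delta_tau - 1)


# Sample values of delta * tau, all larger than 1/2 (illustrative only).
samples = [Fraction(3, 5), Fraction(1), Fraction(5, 2)]

results = []
for dt in samples:
    from_formula = composed_averagedness(Fraction(1, 2), 1 / (2 * dt))
    results.append((dt, from_formula, nu(dt)))
```

For instance, with $\delta\tau=1$ both expressions give $\frac{2}{3}$.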

The fixed-point iteration (18), that corresponds to Algorithm 2 by the first part of this proof, is the Krasnosel’skii-Mann iteration on the mapping $R$ , which is $\nu-$ averaged, with $\nu\in(\frac{1}{2},1)$ , by Lemma 11. The convergence of (18) to some $\boldsymbol{\omega}^{*}:=\operatorname{col}(\boldsymbol{x}^{*},\boldsymbol{\lambda}^{*})\in\mathrm{fix}(R)$ follows by [11, Prop. 5.15]. To conclude, we note that $\boldsymbol{\omega}^{*}\in\mathrm{fix}(R)=\operatorname{zer}(\Phi^{-1}T_{1}+\Phi^{-1}T_{2})$ , by [11, Prop. 25.1 (iv)], and that $\operatorname{zer}(\Phi^{-1}T_{1}+\Phi^{-1}T_{2})=\operatorname{zer}(T)$ , with $T$ as in (14), since $\Phi\succ 0$ , by Lemma 9 (i), and $T_{1}+T_{2}=T$ . Since the limit point $\boldsymbol{\omega}^{*}\in\operatorname{zer}(T)\neq\varnothing$ , by Proposition 2 (i), then $\boldsymbol{x}^{*}$ is a v-GNE of the game in (4), by Proposition 2 (ii), thus concluding the proof. $\blacksquare$

-D Proof of Lemma 4

We prove equation (i) by induction. At step zero, $\bar{\sigma}^{0}=\bar{x}^{0}$ holds if the estimates are initialized as $\sigma_{i}^{0}=x_{i}^{0}$ , for all $i\in\mathcal{I}$ . At step $k$ , we assume that $\bar{\sigma}^{k}=\bar{x}^{k}$ . To conclude the proof, we show that relation (i) holds at step $k+1$ :

[TABLE]

The first equality follows from the updating rule of the $\sigma_{i}$ ’s in Algorithm 3, the second follows by definition of $\bar{x}^{k}$ , i.e., $\bar{x}^{k}=\frac{1}{N}(\mathbf{1}^{\top}\otimes I_{n})\boldsymbol{x}^{k}$ , the third follows since the mixing matrix $W(k)$ is column stochastic, i.e., $\mathbf{1}^{\top}W(k)=\mathbf{1}^{\top}$ , by Assumption 3, while the last equality follows from the induction step $k$ , i.e., $\bar{\sigma}^{k}=\bar{x}^{k}$ . The proofs of equations (ii) and (iii) are analogous.
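The invariant $\bar{\sigma}^{k}=\bar{x}^{k}$ can be observed in a small simulation of the tracking update $\boldsymbol{\sigma}^{k+1}=W(k)\boldsymbol{\sigma}^{k}+\boldsymbol{x}^{k+1}-\boldsymbol{x}^{k}$ for the scalar case. In this sketch the decision sequence is random rather than produced by Algorithm 3, and the two doubly stochastic matrices `W_A` and `W_B` are hypothetical; the invariant relies only on column stochasticity and on the initialization $\boldsymbol{\sigma}^{0}=\boldsymbol{x}^{0}$.

```python
import random


def matvec(mat, vec):
    """Plain matrix-vector product."""
    return [sum(a * v for a, v in zip(row, vec)) for row in mat]


def mean(vec):
    return sum(vec) / len(vec)


# Two hypothetical doubly stochastic mixing matrices, used alternately
# to mimic a time-varying network with N = 3 agents.
W_A = [[0.5, 0.5, 0.0],
       [0.5, 0.5, 0.0],
       [0.0, 0.0, 1.0]]
W_B = [[1.0, 0.0, 0.0],
       [0.0, 0.5, 0.5],
       [0.0, 0.5, 0.5]]

rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
sigma = list(x)  # initialization sigma_i^0 = x_i^0

max_gap = 0.0
for k in range(50):
    W = W_A if k % 2 == 0 else W_B
    # Arbitrary next decisions (stand-in for the algorithm's update).
    x_next = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    # Dynamic tracking: sigma^{k+1} = W(k) sigma^k + x^{k+1} - x^k.
    sigma = [ws + xn - xo
             for ws, xn, xo in zip(matvec(W, sigma), x_next, x)]
    x = x_next
    max_gap = max(max_gap, abs(mean(sigma) - mean(x)))
```

Up to floating-point rounding, `max_gap` stays at zero, although the individual estimates `sigma` generally differ from the true average.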

-E Proof of Lemma 5

For ease of notation, this proof is developed for the scalar case, i.e., $n=m=1$ . In this case, we can write $\|\hat{\boldsymbol{\sigma}}^{k}-\mathbf{1}\otimes\bar{x}^{k}\|=\|(W(k)\otimes I_{n})\boldsymbol{\sigma}^{k}-\mathbf{1}\otimes\bar{x}^{k}\|=\|W(k)\boldsymbol{\sigma}^{k}-\bar{x}^{k}\mathbf{1}\|$ .

(i): The update of the estimates $\sigma_{i}$ ’s in Algorithm 3 can be written in a compact form as

[TABLE]

By telescoping (44), we obtain

[TABLE]

where the transition matrices $\Psi(\cdot,\cdot)$ ’s are defined in (6). By rearranging (44), we can write $W(k)\boldsymbol{\sigma}^{k}=\boldsymbol{\sigma}^{k+1}-\boldsymbol{x}^{k+1}+\boldsymbol{x}^{k}$ . Then, by exploiting the equivalence in (45), we have

[TABLE]

Now, consider $\bar{\sigma}^{k}$ , which may be written as follows:

[TABLE]

By Lemma 4, we have that $\bar{\sigma}^{s}=\bar{x}^{s}$ $\forall s\geq 0$ , which leads to

[TABLE]

From equations (46) and (47), we have the following:

[TABLE]

where (a) follows from the Cauchy–Schwarz inequality, while (b) follows since $\|\Psi(k,s)-\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}\|\leq\theta\rho^{k-s}$ for all $k\geq s\geq 0$ , by Lemma 1. Next, we find an upper bound for $\|\boldsymbol{x}^{s}-\boldsymbol{x}^{s-1}\|$ in (48). The update of the decisions $x_{i}$ ’s can be written in a compact form as $\boldsymbol{x}^{k+1}=\boldsymbol{x}^{k}+\gamma^{k}(\tilde{\boldsymbol{x}}^{k}-\boldsymbol{x}^{k})$ . We note that $\tilde{x}_{i}^{k},x_{i}^{k}\in\Omega_{i}$ , for all $k\geq 0$ , since $\tilde{x}_{i}^{k}$ is obtained by projecting onto $\Omega_{i}$ and $x_{i}^{k}=(1-\gamma^{k-1})x_{i}^{k-1}+\gamma^{k-1}\tilde{x}_{i}^{k-1}$ is a convex combination of elements of the convex set $\Omega_{i}$ . Since all the sets $\Omega_{i}$ ’s are compact, by Assumption 1, it follows that for some constant $B_{\Omega}$ , we have

[TABLE]

By combining (49) and (48), we obtain

[TABLE]

where we exploited the initialization step of Algorithm 3, i.e., $\boldsymbol{\sigma}^{0}=\boldsymbol{x}^{0}\in\boldsymbol{\Omega}$ , from which $\|\boldsymbol{\sigma}^{0}\|\leq B_{\Omega}$ .

(ii): The update of the estimates $z_{i}$ ’s in Algorithm 3 can be written in a compact form as

[TABLE]

By telescoping (50), we obtain

[TABLE]

To upper bound $\|\boldsymbol{\lambda}^{s}-\boldsymbol{\lambda}^{s-1}\|$ , we note that the dual update in Alg. 3 reads in compact form as $\boldsymbol{\lambda}^{s}=\boldsymbol{\lambda}^{s-1}+\gamma^{s-1}(\tilde{\boldsymbol{\lambda}}^{s-1}-\boldsymbol{\lambda}^{s-1})$ and that the dual sequence $(\boldsymbol{\lambda}^{s})_{s\in\mathbb{N}}$ is positive and $B_{D}$-norm bounded, by Assumption 8. Hence, we have $\|\boldsymbol{\lambda}^{s}-\boldsymbol{\lambda}^{s-1}\|\leq\gamma^{s-1}B_{D}$ , which, substituted into (51), gives

[TABLE]

(iii): The update of the estimates $y_{i}$ ’s in Algorithm 3 can be written in a compact form as

[TABLE]

By telescoping (52) (as explained in (45)), we obtain

[TABLE]

Now, consider $\bar{y}^{k}$ , which may be written as follows:

[TABLE]

By Lemma 4, we have that $\bar{y}^{s}=\bar{d}^{s}=\frac{1}{N}\mathbf{1}^{\top}C_{\text{d}}(2\tilde{\boldsymbol{x}}^{s}-\boldsymbol{x}^{s})-c$ , for all $s\geq 0$ , which leads to

[TABLE]

From the relations (53) and (54), we have the following:

[TABLE]

where the first equality, (a), follows by substituting (53) and (54) for $\boldsymbol{y}^{k+1}$ and $\bar{d}^{k}$ , respectively, (b) follows from the Cauchy–Schwarz inequality and (c) follows since $\|\Psi(k,s)-\frac{1}{N}\mathbf{1}\mathbf{1}^{\top}\|\leq\theta\rho^{k-s}$ for all $k\geq s\geq 0$ , by Lemma 1. Now we build an upper bound for $\|(2\tilde{\boldsymbol{x}}^{s}-\boldsymbol{x}^{s})-(2\tilde{\boldsymbol{x}}^{s-1}-\boldsymbol{x}^{s-1})\|$ in (55):

[TABLE]

where (a) follows from the triangle inequality and (b) follows from (49).

Next, we build an upper bound for the term $\|\tilde{\boldsymbol{x}}^{s}-\tilde{\boldsymbol{x}}^{s-1}\|$ in the right hand side of (56).

[TABLE]

where (a) follows by exploiting the compact update of $\tilde{\boldsymbol{x}}^{s}$ in (23), (b) follows by the nonexpansiveness of the projection operator, (c) follows by exploiting, in sequence, the triangle inequality, the Lipschitz continuity of $\boldsymbol{F}$ (Assumption 5), and the Cauchy–Schwarz inequality, and finally (d) follows from the relation $\|\left[\begin{smallmatrix}a\\ b\end{smallmatrix}\right]\|=\sqrt{\|a\|^{2}+\|b\|^{2}}\leq\|a\|+\|b\|$ .

Now, we find an upper bound for the last two terms in (57).

[TABLE]

where (a) follows since $W(s\!-\!1)\boldsymbol{\sigma}^{s-1}=\boldsymbol{\sigma}^{s}-\boldsymbol{x}^{s}+\boldsymbol{x}^{s-1}$ by (44), (b) from the triangle inequality, (c) by adding and subtracting $\bar{x}^{s}\mathbf{1}$ within the first term, (d) by the triangle inequality and substituting for $\|\boldsymbol{x}^{s}-\boldsymbol{x}^{s-1}\|$ the bound in (49), (e) by substituting for $\|W(s)\boldsymbol{\sigma}^{s}-\mathbf{1}\bar{x}^{s}\|$ the upper bound derived in Lemma 5 (i) and for $\|\boldsymbol{\sigma}^{s}-\mathbf{1}\bar{x}^{s}\|$ a bound similarly derived, and (f) follows by noticing that $\rho^{s}<\rho^{s-1}$ , since $0<\rho<1$ by Assumption 3.

Similarly, for the last addend in (57), we can derive the following bound:

[TABLE]

Finally, by combining (57) with (58) and (59), we obtain an upper bound for $\|\tilde{\boldsymbol{x}}^{s}-\tilde{\boldsymbol{x}}^{s-1}\|$ , i.e.,

[TABLE]

Now, by substituting (60) into (56), we obtain

[TABLE]

where $(\phi^{s})_{s\in\mathbb{N}}$ is the scalar (vanishing) sequence in (27), with $\rho$ as in (7) and $(\gamma^{k})_{k\in\mathbb{N}}$ as in Assumption 7. Finally, by combining (61) and (55), we obtain the upper bound in Lemma 5 (iii). $\blacksquare$

-F Proof of Lemma 7

From $\|\left[\begin{smallmatrix}a\\ b\end{smallmatrix}\right]\|=\sqrt{\|a\|^{2}+\|b\|^{2}}\leq\|a\|+\|b\|$ , it follows that

[TABLE]

Next, we upper bound $\|\tilde{\boldsymbol{x}}^{k}-\tilde{\boldsymbol{x}}^{k}_{\text{A2}}\|$ , where $\tilde{\boldsymbol{x}}^{k}$ and $\tilde{\boldsymbol{x}}^{k}_{\text{A2}}$ are defined in (23) and (21), respectively.

[TABLE]

where the first inequality (a) follows by the nonexpansiveness of the projection operator, and (b) follows by the triangle inequality and the Lipschitz continuity of $\boldsymbol{F}$ (Assumption 5).

Now, consider $\|\tilde{\boldsymbol{\lambda}}^{k}-\tilde{\boldsymbol{\lambda}}^{k}_{\text{A2}}\|$ , where $\tilde{\boldsymbol{\lambda}}^{k}$ and $\tilde{\boldsymbol{\lambda}}^{k}_{\text{A2}}$ are defined in (24) and (22), respectively. By exploiting the nonexpansiveness of the projection operator, we have

[TABLE]

Finally, by combining (64) and (63) with (62) we obtain the upper bound in Lemma 7. $\blacksquare$

-G Proof of Lemma 8

By substituting the bounds on the estimation errors of Lemma 5 into the error bound in Lemma 7, we obtain

[TABLE]

where $a_{1}$ , $a_{2}$ , $a_{3}$ and $a_{4}$ are positive constants defined as $a_{1}:=\theta B_{\Omega}(\|\alpha_{\text{d}}\|L_{\boldsymbol{F}}+\|\beta_{\text{d}}\|)+\theta B_{D}(\|\alpha_{\text{d}}\|\|C_{d}\|+\|\beta_{\text{d}}\|)$ , $a_{2}:=\theta B_{\Omega}\|\alpha_{\text{d}}\|L_{\boldsymbol{F}}+\theta B_{D}(\|\alpha_{\text{d}}\|\|C_{d}\|+\|\beta_{\text{d}}\|)$ , $a_{3}:=\|\beta_{\text{d}}\|\|C_{d}\|$ and $a_{4}:=\|\beta_{\text{d}}\|$ . Now, we show that each term on the right-hand side of (65) is summable, hence also the sequence $(\gamma^{k}\|e^{k}\|)_{k\in\mathbb{N}}$ is such, i.e., $\sum_{k=0}^{\infty}\gamma^{k}\|e^{k}\|<\infty$ .

Term 1: To establish the convergence of $\sum_{k=0}^{\infty}\gamma^{k}\rho^{k}$ , we note that $\gamma^{k}\leq\gamma^{0}$ , for all $k\in\mathbb{N}$ , by Assumption 7, implying that $\sum_{k=0}^{\infty}\gamma^{k}\rho^{k}\leq\gamma^{0}\sum_{k=0}^{\infty}\rho^{k}<\infty$ , since $0<\rho<1$ by Lemma 1.

Term 2: Since $\gamma^{k}\leq\gamma^{s-1}$ , for all $k\geq s\!-\!1$ (Assumption 7), the following relations hold for the second term in the right-hand side of (65):

[TABLE]

It follows by Lemma 6 (b) that $\sum_{k=0}^{\infty}\sum_{s=1}^{k}\rho^{k-s}(\gamma^{s-1})^{2}<\infty$ , since $\sum_{k=0}^{\infty}(\gamma^{k})^{2}<\infty$ , $\gamma^{k}\geq 0$ for all $k$ (Assumption 7) and $0<\rho<1$ .
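The summability claim for Term 2 can be illustrated numerically. Exchanging the order of summation gives $\sum_{k}\sum_{s=1}^{k}\rho^{k-s}(\gamma^{s-1})^{2}\leq\frac{1}{1-\rho}\sum_{s}(\gamma^{s-1})^{2}$, and the sketch below checks this bound on a truncated horizon. The choices $\gamma^{k}=1/(k+1)$ and $\rho=0.8$ are assumptions made only for illustration; any non-increasing, square-summable sequence would serve.

```python
# Truncated check of: sum_k sum_{s=1}^k rho^(k-s) * gamma[s-1]^2
#                     <= (1 / (1 - rho)) * sum_s gamma[s]^2.
rho = 0.8      # contraction factor, illustrative value in (0, 1)
K = 2000       # truncation horizon

# Illustrative step-size sequence: non-increasing and square-summable.
gamma = [1.0 / (k + 1) for k in range(K)]

# inner_k = sum_{s=1}^k rho^(k-s) * gamma[s-1]^2 satisfies the recursion
# inner_k = rho * inner_{k-1} + gamma[k-1]^2, with inner_0 = 0.
inner = 0.0
double_sum = 0.0
for k in range(1, K):
    inner = rho * inner + gamma[k - 1] ** 2
    double_sum += inner

# Upper bound obtained by swapping the order of summation.
bound = sum(g * g for g in gamma) / (1.0 - rho)
```

Because $\sum_{k}(k+1)^{-2}=\pi^{2}/6$, the bound stays below $\frac{\pi^{2}/6}{1-\rho}\approx 8.22$ for every horizon `K`, so the partial sums of the double series remain bounded.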

Term 3: By exploiting the definition of the sequence $(\phi^{k})_{k\in\mathbb{N}}$ in Lemma 5, we can write

[TABLE]

By the same technical reasoning used for Terms 1 and 2, we can show that each term on the right-hand side of the previous inequality is summable. Therefore, we conclude that $\sum_{k=0}^{\infty}\gamma^{k}\phi^{k}<\infty$ .

Term 4: Since $\gamma^{k}\leq\gamma^{s}$ , for all $k\geq s$ (Assumption 7), the following hold for the last term in the right-hand side of (65):

[TABLE]

It follows by Lemma 6 (b) that $\sum_{k=0}^{\infty}\sum_{s=1}^{k}\rho^{k-s}\gamma^{s-1}\phi^{s-1}<\infty$ , since $\sum_{k=0}^{\infty}\gamma^{k}\phi^{k}<\infty$ by Term 3, and $0<\rho<1$ .

To conclude, since all the terms in the right-hand side of (65) are summable, then we have $\sum_{k=0}^{\infty}\gamma^{k}\|e^{k}\|<\infty$ . $\blacksquare$

Bibliography

[1] M. Jensen, “Aggregative games and best-reply potentials,” Economic Theory, Springer, vol. 43, pp. 45–66, 2010.
[2] W. Saad, Z. Han, H. Poor, and T. Başar, “Game theoretic methods for the smart grid,” IEEE Signal Processing Magazine, pp. 86–105, 2012.
[3] Z. Ma, D. Callaway, and I. Hiskens, “Decentralized charging control of large populations of plug-in electric vehicles,” IEEE Trans. on Control Systems Technology, vol. 21, no. 1, pp. 67–78, 2013.
[4] N. Li, L. Chen, and M. A. Dahleh, “Demand response using linear supply function bidding,” IEEE Transactions on Smart Grid, vol. 6, no. 4, pp. 1827–1838, 2015.
[5] J. Barrera and A. Garcia, “Dynamic incentives for congestion control,” IEEE Trans. on Automatic Control, vol. 60, no. 2, pp. 299–310, 2015.
[6] S. Grammatico, F. Parise, M. Colombino, and J. Lygeros, “Decentralized convergence to Nash equilibria in constrained deterministic mean field control,” IEEE Trans. on Automatic Control, vol. 61, no. 11, pp. 3315–3329, 2016.
[7] S. Grammatico, “Dynamic control of agents playing aggregative games with coupling constraints,” IEEE Trans. on Automatic Control, vol. 62, no. 9, pp. 4537–4548, 2017.
[8] D. Paccagnan, B. Gentile, F. Parise, M. Kamgarpour, and J. Lygeros, “Nash and Wardrop equilibria in aggregative games with coupling constraints,” IEEE Transactions on Automatic Control, vol. 64, no. 4, pp. 1373–1388, 2018.
