Input-Feedforward-Passivity-Based Distributed Optimization Over Jointly Connected Balanced Digraphs

Mengmou Li; Graziano Chesi; Yiguang Hong

arXiv:1905.03468·math.OC·May 2, 2022


TL;DR

This paper introduces a novel passivity-based distributed optimization algorithm for directed graphs, ensuring exponential convergence without global information, and demonstrates its effectiveness through numerical examples.

Contribution

It proposes a new input feedforward passivity framework and a derivative feedback algorithm that work over directed, weight-balanced graphs without requiring eigenvalue knowledge.

Findings

01

The algorithm guarantees exponential convergence on strongly connected topologies.

02

It is robust to randomly changing weight-balanced digraphs.

03

Numerical examples validate the effectiveness of the proposed methods.


Equations

$$\begin{cases}\dot{x}=F(x,u)\\ y=H(x,u)\end{cases}\tag{1}$$

$$\dot{V}\leq u^{T}y,\quad\forall(x,u)\in\mathcal{X}\times\mathcal{U}.\tag{2}$$

$$\min_{\mathrm{x}}\ \sum_{i\in\mathcal{N}}f_{i}(\mathrm{x})\tag{3}$$

$$\sum_{i\in\mathcal{N}}\nabla f_{i}(\mathrm{x})=\mathbf{0}\tag{4}$$

$$\min_{x}\ \sum_{i\in\mathcal{N}}f_{i}(x_{i})\quad\text{s.t.}\quad x_{i}=x_{j},\ \forall i,j\in\mathcal{N}\tag{5}$$

$$\sum_{i\in\mathcal{N}}\nabla f_{i}(x_{i})=\mathbf{0},\quad x_{i}=x_{j},\ \forall i,j\in\mathcal{N}.\tag{6}$$

$$-(\mathbf{1}_{N}\otimes I_{m})^{T}\alpha\nabla f(x^{*})-(\mathbf{1}_{N}\otimes I_{m})^{T}\lambda^{*}=-\alpha\sum_{i\in\mathcal{N}}\nabla f_{i}(x_{i}^{*})=\mathbf{0}$$

$$\Sigma_{i}:\ \begin{cases}\Delta\dot{x}_{i}=-\alpha\left(\nabla f_{i}(x_{i})-\nabla f_{i}(x_{i}^{*})\right)-\Delta\lambda_{i}+\beta u_{i}\\ \Delta\dot{\lambda}_{i}=-\gamma u_{i}\\ y_{i}=\Delta x_{i}\end{cases}\quad\forall i\in\mathcal{N}\tag{11}$$

$$u_{i}=\sigma(t)\sum_{j\in\mathcal{N}_{i}}a_{ij}(t)(y_{j}-y_{i}),\quad\forall i\in\mathcal{N}\tag{12}$$

$$V_{i}=\frac{\eta_{i}}{2}\|z_{i}\|^{2}-\frac{1}{\gamma}\Delta x_{i}^{T}\Delta\lambda_{i}+\frac{\alpha}{\gamma}\left(f_{i}(x_{i}^{*})-f_{i}(x_{i})\right)+\frac{\alpha}{\gamma}\nabla f_{i}(x_{i}^{*})^{T}\Delta x_{i}$$

$$\nu_{i}=-\min_{\eta_{i}}\max_{x_{i}}\frac{\left\lVert\eta_{i}\left(\alpha\beta\nabla^{2}f_{i}(x_{i})-\gamma I\right)-\frac{\beta}{\gamma}I\right\rVert^{2}}{4\left(\mu_{i}\eta_{i}\alpha-\frac{1}{\gamma}\right)}.\tag{13}$$

$$0<\sigma(t)<\frac{s_{+}\left(L(t)+L^{T}(t)\right)}{-2\bar{\nu}\,s_{N}\left(L^{T}(t)L(t)\right)},\quad\forall t>0\tag{15}$$

$$\frac{1}{2}-\sigma(t)\lvert\nu_{i}\rvert d^{i}(t)>0,\quad\forall i\in\mathcal{N}\tag{16}$$

$$-\frac{\sigma(t)}{2}\sum_{i\in\mathcal{N}}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\left(y_{i}^{T}y_{i}-y_{j}^{T}y_{j}\right)=-\frac{\sigma(t)}{2}\left(\mathbf{1}_{N}^{T}\otimes I_{m}\right)\mathbf{L}(t)\left(Y^{T}Y\right)$$

$$-\sum_{i\in\mathcal{N}}\nu_{i}\Big\lVert\sigma(t)\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)(y_{j}-y_{i})\Big\rVert^{2}=-\sigma^{2}(t)\sum_{i\in\mathcal{N}}\nu_{i}\Big\lVert\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}^{\frac{1}{2}}(t)\cdot a_{ij}^{\frac{1}{2}}(t)(y_{j}-y_{i})\Big\rVert^{2}\leq-\sigma^{2}(t)\sum_{i\in\mathcal{N}}\nu_{i}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\lVert y_{j}-y_{i}\rVert^{2}$$
Full text

Input-Feedforward-Passivity-Based Distributed Optimization Over Jointly Connected Balanced Digraphs

Mengmou Li, Graziano Chesi, and Yiguang Hong

A preliminary version of this work was presented at the 58th IEEE Conference on Decision and Control, Nice, France [1]. The work of Y. Hong was supported by the National Natural Science Foundation of China under Grant 61733018. M. Li and G. Chesi are with the Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China (e-mail: [email protected]; [email protected]). Y. Hong is with the Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 100190, Beijing, China (e-mail: [email protected]).

Abstract

In this paper, a distributed optimization problem is investigated via input feedforward passivity. First, an input-feedforward-passivity-based continuous-time distributed algorithm is proposed. It is shown that the error system of the proposed algorithm can be decomposed into a group of individual input feedforward passive (IFP) systems that interact with each other using output feedback information. Based on this IFP framework, convergence conditions of a suitable coupling gain are derived over weight-balanced and uniformly jointly strongly connected (UJSC) topologies. It is also shown that the IFP-based algorithm converges exponentially when the topology is strongly connected. Second, a novel distributed derivative feedback algorithm is proposed based on the passivation of IFP systems. While most works on directed topologies require knowledge of eigenvalues of the graph Laplacian, the derivative feedback algorithm is fully distributed, namely, it is robust against randomly changing weight-balanced digraphs with any positive coupling gain and without knowing any global information. Finally, numerical examples are presented to illustrate the proposed distributed algorithms.

Index Terms:

Continuous-time algorithms, input feedforward passivity, weight-balanced digraphs, uniformly jointly strongly connected topologies, derivative feedback.

I Introduction

Distributed optimization over multi-agent systems has been widely investigated in recent years because of its broad applications, including wireless networks, smart grids, and machine learning. In addition to discrete-time algorithms (e.g., [2, 3, 4]), a variety of continuous-time distributed algorithms have been proposed to solve distributed optimization problems [5, 6, 7, 8]. Continuous-time algorithms can be implemented in hardware devices such as analog circuits [9], and can achieve tasks such as motion coordination of multi-agent systems [10]. Studying optimization in the continuous-time domain benefits from numerous control techniques for stability analysis and also opens up the possibility of addressing problems commonly encountered in large-scale networks, such as disturbance rejection [11], robustness to delays or uncertainties [12, 13], and channel constraints [14]. However, most of the proposed algorithms apply only to undirected topologies and not to directed ones [5, 6, 7, 8]. To deal with this difficulty, some parameters in the algorithms can be tuned to stabilize the dynamics [15, 16], and some variants of the standard proportional-integral algorithm have been proposed [14, 17]. However, most of these methods employ coordinate transformations along with complicated Lyapunov function candidates in the convergence analysis, which do not preserve network structures, and they require eigenvalues of the graph Laplacian to design some parameters [14, 15, 17, 18]. Compared with these methods, a systematic approach that focuses more on the distributed interconnection of agents in the network is needed.

It is well known that dissipativity (as well as its special case, passivity) is a useful tool for stability analysis and control design [19, 20, 21]. Recently, several passivity-based algorithms have emerged for distributed optimization under communication constraints [22, 23, 13, 24]. However, these passivity-based algorithms can only be applied over undirected graphs, whereas it has been shown that output consensus can be achieved over directed graphs through simple output feedback interconnections of passive systems [19, 20]. Motivated by these works, we aim to study distributed algorithms over directed graphs via passivity techniques. On the one hand, we conjecture that it is in general difficult to directly construct a distributed algorithm that can be interpreted as output feedback interconnections of passive systems. On the other hand, the works [25, 26, 27] point out that output consensus can be achieved over directed graphs even among IFP (or passivity-short) systems. Therefore, if a distributed algorithm inherits input feedforward passivity, it can be directly applied to weight-balanced digraphs through output feedback interconnections. As a byproduct of the IFP properties, the distributed algorithm is also applicable over uniformly jointly strongly connected (UJSC) topologies. This feature is remarkable since it greatly reduces communication costs and hence is more practical in large-scale networks. Although UJSC switching topologies have been considered for discrete-time algorithms [2, 3, 4], to the best of our knowledge, they have never been addressed in the continuous-time domain, because the time-varying nature and lack of connectedness of the topologies complicate the stability analysis.

In this paper, we investigate the distributed optimization problem via input feedforward passivity. First, we propose an IFP-based distributed algorithm whose error system is decomposed into a group of individual IFP systems that interact with each other using output feedback information. Based on this IFP framework, we study the distributed algorithm over directed and UJSC weight-balanced topologies and derive convergence conditions of a suitable coupling gain for the algorithm. We also show that this IFP-based algorithm converges exponentially when the graph is strongly connected. Second, we propose a novel distributed derivative feedback algorithm based on the passivation of IFP systems. While most works on directed topologies in the literature require knowledge of eigenvalues of the graph Laplacian [14, 15, 17, 18], we show that the derivative feedback algorithm is fully distributed, namely, it is robust against randomly changing weight-balanced digraphs with any positive coupling gain and without knowing any global information. In other words, the derivative feedback algorithm is applicable over gossip-like balanced digraphs [28], reducing communication costs. It is worth mentioning that [16] develops a fully distributed adaptive algorithm. However, it does not apply to switching or UJSC topologies. The challenges in our work lie in the construction of a group of verifiable nonlinear IFP systems that solves the distributed optimization problem, the design of the fully distributed algorithm, and the convergence analysis of the proposed algorithms.

Moreover, our analytical method differs from most works in the literature in that we first characterize passivity at the single-agent level and then address stability based on the output feedback interconnection model of these agents over networks. This method compares favorably with some other works [15, 14, 17] since it bypasses coordinate transformations and preserves network structures in the convergence analysis. Besides, it also allows potential applications of mature passivity-based techniques to the study of network issues arising in distributed optimization.

A preliminary version of this work appeared in [1], where only the IFP-based algorithm was proposed. In this work, we propose the IFP-based algorithm with a possibly time-varying coupling parameter, construct more practical conditions that are easier to verify in a distributed sense, and show exponential convergence of the IFP-based algorithm over strongly connected digraphs. Moreover, a fully distributed algorithm is proposed.

The rest of this paper is organized as follows. In Section II, some background knowledge of convex analysis, graph theory, and passivity is reviewed and the problem formulation is given. In Section III, an IFP-based distributed algorithm is proposed and studied over weight-balanced UJSC topologies. In Section IV, a fully distributed algorithm over weight-balanced UJSC digraphs is proposed. In Section V, numerical examples are presented to illustrate effects of the two algorithms. Finally, the paper is concluded in Section VI.

II Preliminaries and Problem Formulation

II-A Notation

Let $\mathbb{R}$ and $\mathbb{Z}$ be the sets of real and integer numbers, respectively. The Kronecker product is denoted as $\otimes$ . Let $\left\lVert\cdot\right\rVert$ denote the 2-norm of a vector and also the induced 2-norm of a matrix. The determinant of a square matrix $M$ is denoted as $\text{det}(M)$ . Given a symmetric matrix $M\in\mathbb{R}^{m\times m}$ , the notation $M>0$ $(M\geq 0)$ means that $M$ is positive definite (positive semi-definite). Denote the eigenvalues of $M$ in ascending order as $s_{1}(M)\leq s_{2}(M)\leq\ldots\leq s_{m}(M)$ . Let $I$ and $\mathbf{0}$ denote the identity matrix and zero matrix (or vector) of proper sizes, respectively. $\mathbf{1}_{m}:=(1,\ldots,1)^{T}\in\mathbb{R}^{m}$ denotes the vector of $m$ ones. $\text{col}(v_{1},\ldots,v_{m}):=(v_{1}^{T},\ldots,v_{m}^{T})^{T}$ denotes the column vector stacked with vectors $v_{1},\ldots,v_{m}$ . The notation $\text{diag}\{\alpha_{1},\ldots,\alpha_{m}\}$ denotes a (block) diagonal matrix with its $i$ th diagonal element (block) being $\alpha_{i}$ . The notation $\mathcal{C}^{k}$ is used to denote a $k\in\mathbb{Z}_{\geq 1}$ times continuously differentiable function.

II-B Convex Analysis

A differentiable function $f:\mathbb{R}^{m}\rightarrow\mathbb{R}$ is convex over a convex set $\mathcal{X}\subset\mathbb{R}^{m}$ if and only if $\left(\nabla f(x)-\nabla f(y)\right)^{T}(x-y)\geq 0$, $\forall x,y\in\mathcal{X}$, and is strictly convex if and only if the strict inequality holds for any $x\neq y$. It is $\mu$-strongly convex if and only if $\left(\nabla f(x)-\nabla f(y)\right)^{T}(x-y)\geq\mu\|x-y\|^{2}$, $\forall x,y\in\mathcal{X}$. An equivalent condition for strong convexity is the following: $f(y)\geq f(x)+\nabla f(x)^{T}(y-x)+\frac{\mu}{2}\|y-x\|^{2}$, $\forall x,y\in\mathcal{X}$. An operator $\mathbf{f}:\mathbb{R}^{m}\rightarrow\mathbb{R}^{m}$ is $l$-Lipschitz continuous over a set $\mathcal{X}\subset\mathbb{R}^{m}$ if $\|\mathbf{f}(x)-\mathbf{f}(y)\|\leq l\|x-y\|$, $\forall x,y\in\mathcal{X}$.

II-C Graph Theory

The information exchange network is represented by a graph $\mathcal{G}=(\mathcal{N},\mathcal{E})$, where $\mathcal{N}=\{1,\ldots,N\}$ is the node set of all agents and $\mathcal{E}\subset\mathcal{N}\times\mathcal{N}$ is the edge set. The edge $(j,i)\in\mathcal{E}$ means that agent $i$ can obtain information from agent $j$, i.e., $j\in\mathcal{N}_{i}$, where $\mathcal{N}_{i}=\{j\in\mathcal{N}~|~(j,i)\in\mathcal{E}\}$ is agent $i$'s neighbor set. We assume in this work that there are no self-loops in $\mathcal{G}$, i.e., $(i,i)\notin\mathcal{E}$ and $i\notin\mathcal{N}_{i}$. The graph $\mathcal{G}$ is said to be undirected if $(i,j)\in\mathcal{E}\Leftrightarrow(j,i)\in\mathcal{E}$ and directed otherwise. A sequence of successive edges $\{(i,p),(p,q),\ldots,(v,j)\}$ is a directed path from agent $i$ to agent $j$. $\mathcal{G}$ is said to be strongly connected if there exists a directed path from any agent to any other agent. The adjacency matrix is defined as $\mathcal{A}=[a_{ij}]$, where $a_{ii}=0$, $a_{ij}>0$ if $(j,i)\in\mathcal{E}$, and $a_{ij}=0$ otherwise. The in-degree and out-degree of the $i$th agent are $d_{in}^{i}=\sum_{j=1}^{N}a_{ij}$ and $d_{out}^{i}=\sum_{j=1}^{N}a_{ji}$, respectively. The graph $\mathcal{G}$ is said to be weight-balanced if $d_{in}^{i}=d_{out}^{i}$, $\forall i\in\mathcal{N}$. The in-degree matrix is $W_{in}=\text{diag}\{d_{in}^{1},\ldots,d_{in}^{N}\}$. The Laplacian matrix of $\mathcal{G}$ is defined as $L=W_{in}-\mathcal{A}$. By construction, $L\mathbf{1}_{N}=\mathbf{0}$; when $\mathcal{G}$ is weight-balanced, $\mathbf{1}_{N}^{T}L=\mathbf{0}$ holds as well. A time-varying graph $\mathcal{G}(t)$ is said to be uniformly jointly strongly connected (UJSC) if there exists a $T>0$ such that for any $t_{k}$, the union $\cup_{t\in[t_{k},t_{k}+T]}\mathcal{G}(t)$ is strongly connected.
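As a minimal sketch of these definitions, the following Python snippet builds $L=W_{in}-\mathcal{A}$ and checks weight balance. The helper names `laplacian` and `is_weight_balanced` and the unit-weight directed 3-cycle are illustrative assumptions, not taken from the paper.

```python
def laplacian(adj):
    """Return L = W_in - A, where adj[i][j] > 0 means agent i receives from agent j."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def is_weight_balanced(adj, tol=1e-12):
    """Check that in-degree equals out-degree for every agent."""
    n = len(adj)
    return all(abs(sum(adj[i]) - sum(adj[j][i] for j in range(n))) <= tol
               for i in range(n))


# Directed cycle 0 -> 1 -> 2 -> 0 with unit weights (illustrative choice).
A = [[0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
L = laplacian(A)
row_sums = [sum(row) for row in L]                              # entries of L 1_N
col_sums = [sum(L[i][j] for i in range(3)) for j in range(3)]   # entries of 1_N^T L

# A star-like digraph where agent 0 listens to agents 1 and 2 but nobody listens back.
B = [[0.0, 1.0, 1.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
```

For the cycle, both `row_sums` and `col_sums` vanish, while `B` fails the balance check because agent 0 has in-degree 2 and out-degree 0.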

II-D Passivity

Consider a nonlinear system described by

$$\begin{cases}\dot{x}=F(x,u)\\ y=H(x,u)\end{cases}\tag{1}$$

where $x\in\mathcal{X}\subset\mathbb{R}^{n}$ , $u\in\mathcal{U}\subset\mathbb{R}^{m}$ and $y\in\mathcal{Y}\subset\mathbb{R}^{m}$ are the state, input and output, respectively, and $\mathcal{X}$ , $\mathcal{U}$ and $\mathcal{Y}$ are the state, input and output spaces, respectively. The functions $F:\mathcal{X}\times\mathcal{U}\rightarrow\mathbb{R}^{n}$ , $H:\mathcal{X}\times\mathcal{U}\rightarrow\mathcal{Y}$ represent system and output dynamics, respectively, and are assumed to be sufficiently smooth, i.e., $\mathcal{C}^{n}$ for large enough integer $n$ .

Let us give the definition of passivity and input feedforward passivity for a nonlinear system based on [29, Definition 6.3], [30, Definition 2.12].

Definition 1.

System (1) is said to be passive from $u$ to $y$ if there exists a continuously differentiable positive semi-definite function $V(x)$ , called the storage function, such that

$$\dot{V}\leq u^{T}y,\quad\forall(x,u)\in\mathcal{X}\times\mathcal{U}.\tag{2}$$

Moreover, it is said to be input feedforward passive (IFP) if $\dot{V}\leq u^{T}y-\nu u^{T}u$ , for some $\nu\in\mathbb{R}$ , denoted as IFP( $\nu$ ).

The sign of the IFP index $\nu$ denotes an excess or shortage of passivity. Specifically, when $\nu>0$ , the system is said to be input strictly passive (ISP). When $\nu<0$ , the system is said to be input feedforward passivity-short (IFPS). If we define a new output as $\tilde{y}=y-\nu u$ , then the IFP system becomes passive from $u$ to $\tilde{y}$ . Throughout this paper, we consider the storage function to be positive definite and radially unbounded.

II-E Problem Formulation

Let us formulate the problem and give some necessary assumptions in this subsection. Consider the distributed convex optimization problem among a group of agents in the node set $\mathcal{N}=\{1,\ldots,N\}$ ,

$$\min_{\mathrm{x}}\ \sum_{i\in\mathcal{N}}f_{i}(\mathrm{x})\tag{3}$$

where $\mathrm{x}\in\mathbb{R}^{m}$ and each local objective function $f_{i}:\mathbb{R}^{m}\rightarrow\mathbb{R}$ satisfies the following assumption.

Assumption 1.

Each $f_{i}(\mathrm{x})$ is $\mathcal{C}^{2}$ and $\mu_{i}$ -strongly convex, with its gradient $\nabla f_{i}(\mathrm{x})$ being $l_{i}$ -Lipschitz continuous.

This assumption also implies that $\|\nabla f_{i}(\mathrm{x})-\nabla f_{i}(\mathrm{x}^{\prime})\|\leq l_{i}\|\mathrm{x}-\mathrm{x}^{\prime}\|$ and $\mu_{i}I\leq\nabla^{2}f_{i}(\mathrm{x})\leq l_{i}I$, $\forall\mathrm{x},\mathrm{x}^{\prime}\in\mathbb{R}^{m}$. Note that Assumption 1 is widely adopted in the literature; see, e.g., [14, 31]. It is required in this paper to ensure IFP properties and to estimate the IFP indices of agents. In addition, it is shown later that the Lipschitz requirement can be relaxed by selecting proper parameters in the algorithms.

Under Assumption 1, the necessary and sufficient condition of optimality for problem (3) is [32, Section 5.5.3]

$$\sum_{i\in\mathcal{N}}\nabla f_{i}(\mathrm{x})=\mathbf{0}\tag{4}$$

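Denote $x_{i}\in\mathbb{R}^{m}$ as agent $i$'s local estimate of the global optimal solution and let $x=\text{col}(x_{1},\ldots,x_{N})$. Then problem (3) is equivalent to [15]

$$\min_{x}\ \sum_{i\in\mathcal{N}}f_{i}(x_{i})\quad\text{s.t.}\quad x_{i}=x_{j},\ \forall i,j\in\mathcal{N}\tag{5}$$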

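where the constraints are consensus constraints for agents to reach a common value. Under Assumption 1 and due to (4), the optimal solution to problem (5) should satisfy

$$\sum_{i\in\mathcal{N}}\nabla f_{i}(x_{i})=\mathbf{0},\quad x_{i}=x_{j},\ \forall i,j\in\mathcal{N}.\tag{6}$$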
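The optimality condition can be illustrated with a small numerical sketch. It assumes scalar quadratic objectives $f_{i}(x)=\frac{a_{i}}{2}(x-b_{i})^{2}$, which are $a_{i}$-strongly convex; the values of `a` and `b` are arbitrary illustrative choices, not from the paper.

```python
# Illustrative local objectives f_i(x) = a_i / 2 * (x - b_i)^2 with m = 1.
a = [1.0, 2.0, 3.0]    # strong-convexity constants mu_i = a_i
b = [1.0, -2.0, 4.0]   # minimizers of the individual f_i


def grad(i, x):
    """Gradient of the i-th local objective at x."""
    return a[i] * (x - b[i])


# Setting the sum of gradients to zero gives the weighted average of the b_i.
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
local_grads = [grad(i, x_star) for i in range(len(a))]
```

At the common optimum the local gradients sum to zero although none of them vanishes individually, which is the mismatch that a per-agent compensation variable has to absorb.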

Consider the distributed optimization over UJSC weight-balanced digraphs. To the best of our knowledge, this problem has never been addressed in the continuous-time domain.

Assumption 2.

The agents interact with each other through a sequence of UJSC digraphs $\{\mathcal{G}(t)\}$ , where $\mathcal{G}(t)$ is weight-balanced pointwise in time and $L(t)\neq\mathbf{0}$ , $\forall t\geq 0$ .

This assumption does not restrict the switching logic of $\mathcal{G}(t)$ provided it is UJSC for a finite $T$ . Note that the time interval $T$ is only imposed to ensure convergence performance, and our results in this work hold as long as $\mathcal{G}(t)$ is strongly connected in a probabilistic sense [28]. We will propose two algorithms in the following sections. The information of $L(t)$ is required for the first algorithm, while it is not used at all for the second algorithm. Here the trivial case of $L(t)=\mathbf{0}$ is omitted.
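To make the UJSC requirement concrete, here is a hedged Python sketch with two hypothetical unit-weight digraphs on four agents. Each graph is weight-balanced and has a nonzero Laplacian, neither is strongly connected, yet their union is. The function name `strongly_connected` and the edge sets `G1`, `G2` are assumptions made for illustration.

```python
def strongly_connected(n, edges):
    """True if every node reaches every other node; edges are pairs (j, i) meaning j -> i."""
    def reach(start, succ):
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in succ[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    fwd = {v: set() for v in range(n)}
    bwd = {v: set() for v in range(n)}
    for j, i in edges:
        fwd[j].add(i)
        bwd[i].add(j)
    # Node 0 must reach all nodes, and all nodes must reach node 0.
    return len(reach(0, fwd)) == n and len(reach(0, bwd)) == n


# Two switching modes; every node has in-degree 1 and out-degree 1 in each mode.
G1 = {(0, 1), (1, 0), (2, 3), (3, 2)}
G2 = {(1, 2), (2, 1), (3, 0), (0, 3)}
union_connected = strongly_connected(4, G1 | G2)
```

If the topology alternates between `G1` and `G2` with a bounded dwell time, the union over any window covering both modes is strongly connected, which is the situation described by the UJSC definition.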

III IFP-Based Distributed Algorithm

In this section, we propose a distributed algorithm based on input feedforward passivity and study its stability over UJSC balanced topologies.

III-A IFP-Based Distributed Algorithm

We propose an IFP-based distributed algorithm as follows.

Algorithm 1:

$$\dot{x}_{i}=-\alpha\nabla f_{i}(x_{i})-\lambda_{i}+\beta u_{i}\tag{7a}$$

$$\dot{\lambda}_{i}=-\gamma u_{i}\tag{7b}$$

$$u_{i}=\sigma(t)\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)(x_{j}-x_{i})\tag{7c}$$

Here $x_{i},\lambda_{i}\in\mathbb{R}^{m}$ and $u_{i}\in\mathbb{R}^{m}$ are local variables and input for the $i$th agent, respectively; $\alpha>0$, $\beta\in\mathbb{R}$ and $\gamma>0$ are constant parameters, and $\sigma(t)>0$ is the coupling gain for the diffusive couplings (7c). To ease the discussion on parameters, we assume that $\alpha,\beta,\gamma$ are arbitrary parameters, while $\sigma(t)$ is a finite and possibly time-varying coupling gain to be designed. The initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ is required to ensure the optimality of the equilibrium point, which will be specified in the subsequent analysis. A simple initial choice is $\lambda_{i}(0)=\mathbf{0}$, $\forall i\in\mathcal{N}$.

Initially, each agent in (7a) estimates the optimal value by local gradient descent. Since $f_{i}$ , $\forall i\in\mathcal{N}$ may not be the same, an auxiliary variable $\lambda_{i}$ is introduced to compensate for the difference of local gradients and ensure the existence of an equilibrium. Then, a diffusive coupling protocol (7c) is added to (7a), (7b) in order to drive the dynamics to reach a consensus on the final optimal value. Algorithm 1 is a distributed algorithm since each agent only exchanges information with neighboring agents.
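The behavior described above can be sketched numerically. The snippet below is a forward-Euler simulation, assuming Algorithm 1 takes the form $\dot{x}_{i}=-\alpha\nabla f_{i}(x_{i})-\lambda_{i}+\beta u_{i}$, $\dot{\lambda}_{i}=-\gamma u_{i}$, $u_{i}=\sigma\sum_{j}a_{ij}(x_{j}-x_{i})$, which is consistent with the error subsystem (11) and the coupling (12). The scalar quadratic objectives, the unit-weight directed cycle, the step size, and the constant gain `sigma=0.4` are all illustrative assumptions.

```python
def simulate(a, b, adj, alpha=1.0, beta=0.0, gamma=1.0, sigma=0.4,
             dt=0.01, steps=6000):
    """Forward-Euler integration for f_i(x) = a_i / 2 * (x - b_i)^2 with m = 1."""
    n = len(a)
    x = [0.0] * n
    lam = [0.0] * n   # lambda_i(0) = 0, so the initial sum is zero
    for _ in range(steps):
        # Diffusive coupling: each agent only uses its in-neighbors' states.
        u = [sigma * sum(adj[i][j] * (x[j] - x[i]) for j in range(n))
             for i in range(n)]
        # Local gradient descent with compensation lambda_i and feedforward beta * u_i.
        x_new = [x[i] + dt * (-alpha * a[i] * (x[i] - b[i]) - lam[i] + beta * u[i])
                 for i in range(n)]
        # Integral action on the disagreement.
        lam = [lam[i] - dt * gamma * u[i] for i in range(n)]
        x = x_new
    return x, lam


a = [1.0, 2.0, 3.0]
b = [1.0, -2.0, 4.0]
adj = [[0.0, 0.0, 1.0],   # directed cycle 0 -> 1 -> 2 -> 0, unit weights
       [1.0, 0.0, 0.0],
       [0.0, 1.0, 0.0]]
x_final, lam_final = simulate(a, b, adj)
x_opt = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)   # minimizer of the sum
```

In this example all three local estimates settle at the minimizer of $\sum_{i}f_{i}$, and the sum of the $\lambda_{i}$ stays at zero because the cycle is weight-balanced.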

Denote $x=\text{col}(x_{1},\ldots,x_{N})$ , $\lambda=\text{col}(\lambda_{1},\ldots,\lambda_{N})$ . Agents in Algorithm 1 are interconnected through diffusive couplings $u_{i}$ , $\forall i\in\mathcal{N}$ . By eliminating $u_{i}$ , the compact form of the overall closed-loop system is written as

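$$\dot{x}=-\alpha\nabla f(x)-\lambda-\sigma(t)\beta\mathbf{L}(t)x\tag{8a}$$

$$\dot{\lambda}=\sigma(t)\gamma\mathbf{L}(t)x\tag{8b}$$

where $\nabla f(x)=\text{col}(\nabla f_{1}(x_{1}),\ldots,\nabla f_{N}(x_{N}))$, $\mathbf{L}(t)=L(t)\otimes I_{m}$, and $L(t)$ is the graph Laplacian of $\mathcal{G}(t)$.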

Remark 1.

Algorithm 1 in the form of (8) is a generalization of algorithms developed in [14]. Specifically, let $\sigma(t)=1$ and $\gamma=\alpha\beta$, then Algorithm 1 reduces to the distributed algorithm in [14]. When $\alpha=\gamma=\sigma(t)=1$ and $\beta=0$, Algorithm 1 reduces to the simplified algorithm in [14]. Compared with [14], Algorithm 1 includes more general cases whose convergence cannot be proved by the methods in [14], e.g., when $\sigma(t)$ is time-varying, when $\beta$ is negative, and when $\gamma$ is independent of $\alpha,\beta$. Moreover, it is shown later that this generalized algorithm is valid over UJSC topologies in addition to directed and strongly connected switching topologies [14], and has an exponential convergence rate when the graph is strongly connected.

Lemma 1.

Under Assumptions 1 and 2, if there exists an equilibrium point $(x^{*},\lambda^{*})$ of system (8) that satisfies $\sum_{i\in\mathcal{N}}\lambda_{i}^{*}=\mathbf{0}$, where $x^{*}=\text{col}(x_{1}^{*},\ldots,x_{N}^{*})$ and $\lambda^{*}=\text{col}(\lambda_{1}^{*},\ldots,\lambda_{N}^{*})$, then $(x^{*},\lambda^{*})$ is unique, with $x_{i}^{*}$ being the optimal solution to problem (3).

Proof.

The equilibrium point $(x^{*},\lambda^{*})$ satisfies

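$$\mathbf{0}=-\alpha\nabla f(x^{*})-\lambda^{*}\tag{9a}$$

$$\mathbf{0}=\sigma(t)\gamma\mathbf{L}(t)x^{*}\tag{9b}$$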

where the term $-\sigma(t)\beta\mathbf{L}(t)x^{*}$ in (9a) is zero and omitted since (9b) implies $\mathbf{L}(t)x^{*}\equiv\mathbf{0}$ . Since the graph is UJSC, $\mathbf{L}(t)x^{*}\equiv\mathbf{0}$ for all $t$ implies that $x_{i}^{*}=x_{j}^{*},~{}\forall i,j\in\mathcal{N}$ . Next, multiplying (9a) by $(\mathbf{1}_{N}\otimes I_{m})^{T}$ from the left, one has,

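$$-(\mathbf{1}_{N}\otimes I_{m})^{T}\alpha\nabla f(x^{*})-(\mathbf{1}_{N}\otimes I_{m})^{T}\lambda^{*}=-\alpha\sum_{i\in\mathcal{N}}\nabla f_{i}(x_{i}^{*})=\mathbf{0}$$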

which satisfies (6). Therefore, $x_{i}^{*}$ is the optimal solution to problem (3). Besides, the strong convexity in Assumption 1 implies that $x^{*}$ is unique [32, Section 9.1.2]. Thus, by (9a), $\lambda^{*}$ is unique as well. ∎

Hereafter, we call $(x^{*},\lambda^{*})$ the optimal point. The convergence of Algorithm 1 will be addressed in Section III-C.

III-B Input Feedforward Passivity of the Error System

In this subsection, we show that the error subsystem of each agent inherits the input feedforward passivity, which is a crucial step before the convergence analysis over UJSC balanced digraphs, and the design of a passivated algorithm in Section IV.

By Lemma 1, for agent $i$ , one has

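$$\dot{x}_{i}^{*}=-\alpha\nabla f_{i}(x_{i}^{*})-\lambda_{i}^{*}=\mathbf{0},\qquad\dot{\lambda}_{i}^{*}=\mathbf{0}.\tag{10}$$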

Denote $\Delta x_{i}=x_{i}-x_{i}^{*}$ , $\Delta\lambda_{i}=\lambda_{i}-\lambda_{i}^{*}$ . Then, the group of error subsystems between (7) and (10) is

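$$\Sigma_{i}:\ \begin{cases}\Delta\dot{x}_{i}=-\alpha\left(\nabla f_{i}(x_{i})-\nabla f_{i}(x_{i}^{*})\right)-\Delta\lambda_{i}+\beta u_{i}\\ \Delta\dot{\lambda}_{i}=-\gamma u_{i}\\ y_{i}=\Delta x_{i}\end{cases}\quad\forall i\in\mathcal{N}\tag{11}$$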

where $y_{i}$ is defined as the output of the $i$ th error subsystem. Then the input $u_{i}$ , $\forall i\in\mathcal{N}$ can be rewritten as

$$u_{i}=\sigma(t)\sum_{j\in\mathcal{N}_{i}}a_{ij}(t)(y_{j}-y_{i}),\quad\forall i\in\mathcal{N}\tag{12}$$

or $u=-\sigma(t)\mathbf{L}(t)y$ in a compact form, where $u=\text{col}(u_{1},\ldots,u_{N})$ , $y=\text{col}(y_{1},\ldots,y_{N})$ . Assume that, corresponding to the real agents, there exists a group of virtual agents such that the $i$ th virtual agent possesses the subsystem $\Sigma_{i}$ . Then, Algorithm 1 can be seen as output feedback interconnections of these virtual agents. In fact, no information of $(x_{i}^{*},\lambda_{i}^{*})$ is needed for communication since $y_{i}-y_{j}=\Delta x_{i}-\Delta x_{j}=x_{i}-x_{j}$ . Then, each agent possesses the same information as its corresponding virtual agent.

Next, we show that each error subsystem $\Sigma_{i}$ in (11) is IFP( $\nu_{i}$ ) with index $\nu_{i}\leq 0$ .

Lemma 2.

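Under Assumption 1, each error subsystem $\Sigma_{i}$ in (11) is IFP( $\nu_{i}\leq 0$ ) from input $u_{i}$ to output $y_{i}$ with respect to the storage function

$$V_{i}=\frac{\eta_{i}}{2}\|z_{i}\|^{2}-\frac{1}{\gamma}\Delta x_{i}^{T}\Delta\lambda_{i}+\frac{\alpha}{\gamma}\left(f_{i}(x_{i}^{*})-f_{i}(x_{i})\right)+\frac{\alpha}{\gamma}\nabla f_{i}(x_{i}^{*})^{T}\Delta x_{i}$$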

where $\eta_{i}>\frac{1}{\mu_{i}\alpha\gamma}$ and $z_{i}=\alpha\left(\nabla f_{i}(x_{i})-\nabla f_{i}(x_{i}^{*})\right)+\Delta\lambda_{i}$ .

Proof.

See the Appendix. ∎

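As pointed out by [27], it is in general difficult to derive the exact IFP index for a nonlinear system, and only a lower bound can be obtained by specifying the storage function. With the storage function in Lemma 2, a lower bound of the exact IFP index can be obtained locally by solving the minimax problem

$$\nu_{i}=-\min_{\eta_{i}}\max_{x_{i}}\frac{\left\lVert\eta_{i}\left(\alpha\beta\nabla^{2}f_{i}(x_{i})-\gamma I\right)-\frac{\beta}{\gamma}I\right\rVert^{2}}{4\left(\mu_{i}\eta_{i}\alpha-\frac{1}{\gamma}\right)}.\tag{13}$$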

When each $f_{i}$, $\forall i\in\mathcal{N}$, is quadratic, the error system (11) becomes a linear system. The exact IFP index for a linear system can be easily obtained by solving an LMI related to the positive real lemma [33, Lemma 2]. The problem of reducing the gap between the lower bound and the exact IFP index remains open and is left for future work.

Remark 2.

It is in general not difficult to obtain $\nu_{i}$ by solving (13) since local objective functions are usually of simple forms. Even when the local objective functions are complicated, problem (13) can be relaxed to

[TABLE]

where $l_{i}$ is the Lipschitz index defined in Assumption 1. Here (14) can be easily solved, providing a lower bound of the exact IFP index, which we can denote as the new $\nu_{i}$. It can also be observed that when $\beta=0$, (13) reduces to $\nu_{i}=-\min_{\eta_{i}}\frac{\eta_{i}^{2}\gamma^{2}}{4\left(\mu_{i}\eta_{i}\alpha-\frac{1}{\gamma}\right)}=-\frac{\gamma}{\mu_{i}^{2}\alpha^{2}}$, where the minimum is attained at $\eta_{i}=\frac{2}{\mu_{i}\alpha\gamma}$. The IFP index of agent $i$ is then only related to the strong convexity index $\mu_{i}$. In this case, the Lipschitz continuity of the gradients is not required.
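The closed form for $\beta=0$ can be cross-checked with a short Python sketch. The helper `ifp_index_beta0` is hypothetical: it grid-searches $\eta_{i}$ over the feasible range $\eta_{i}>\frac{1}{\mu_{i}\alpha\gamma}$ and compares the result with $-\frac{\gamma}{\mu_{i}^{2}\alpha^{2}}$; the search interval and grid size are arbitrary choices.

```python
def ifp_index_beta0(mu, alpha, gamma, grid=4000):
    """Grid-search estimate of nu_i from the beta = 0 case of the minimax problem."""
    eta_min = 1.0 / (mu * alpha * gamma)   # eta_i must exceed this bound
    best = float("inf")
    for k in range(1, grid + 1):
        eta = eta_min * (1.0 + 4.0 * k / grid)   # eta in (eta_min, 5 * eta_min]
        val = (eta * gamma) ** 2 / (4.0 * (mu * eta * alpha - 1.0 / gamma))
        best = min(best, val)
    return -best


def ifp_index_closed_form(mu, alpha, gamma):
    """Closed-form index -gamma / (mu^2 alpha^2) stated for beta = 0."""
    return -gamma / (mu ** 2 * alpha ** 2)


nu_grid = ifp_index_beta0(2.0, 1.0, 1.0)
nu_exact = ifp_index_closed_form(2.0, 1.0, 1.0)
```

A larger strong-convexity constant $\mu_{i}$ or a larger $\alpha$ gives an index closer to zero, that is, a smaller shortage of passivity.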

III-C Algorithm Over UJSC Balanced Topologies

In this subsection, we analyze the convergence of Algorithm 1 over weight-balanced and UJSC switching topologies based on output feedback interconnections of subsystems $\Sigma_{i}$ in (11). Meanwhile, the effort in constructing candidate Lyapunov functions in convergence analysis is greatly reduced.

Definition 2.

The group of agents $\Sigma_{i}$ , $\forall i\in\mathcal{N}$ is said to achieve output consensus if their outputs satisfy $\lim_{t\rightarrow\infty}\left\lVert y_{i}(t)-y_{j}(t)\right\rVert=0,~{}\forall i,j\in\mathcal{N}$ .

Theorem 1 ([1]).

Under Assumptions 1 and 2, the states of Algorithm 1 will converge to the optimal point and solve problem (3) if $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ and the coupling gain $\sigma(t)$ satisfies

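$$0<\sigma(t)<\frac{s_{+}\left(L(t)+L^{T}(t)\right)}{-2\bar{\nu}\,s_{N}\left(L^{T}(t)L(t)\right)},\quad\forall t>0\tag{15}$$

where $\bar{\nu}<0$ is the smallest value of the IFP indices $\nu_{i}$, $i\in\mathcal{N}$, $s_{+}(\cdot)$ denotes the smallest nonzero eigenvalue, and $s_{N}(\cdot)$ is the largest eigenvalue, as defined in Section II-A.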

Theorem 1 can be proved through the Lyapunov function $V=\sum_{i\in\mathcal{N}}V_{i}$, where $V_{i}$ was defined in Lemma 2, and by the fact that $L(t)+L^{T}(t)$ and $L^{T}(t)L(t)$ have the same null space. The details of the proof with constant $\sigma$ can be found in the conference paper [1]. Condition (15) requires the calculation of eigenvalues, which may be difficult to verify in a large-scale network. Thus, a more practical condition is derived in a different manner as follows, which is easier to verify or estimate for the design of the coupling gain in a distributed sense.

Theorem 2.

Under Assumptions 1 and 2, the states of Algorithm 1 with initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ will converge to the optimal point and solve problem (3) if the positive coupling gain $\sigma(t)$ satisfies

$$\frac{1}{2}-\sigma(t)\lvert\nu_{i}\rvert d^{i}(t)>0,\quad\forall i\in\mathcal{N}\tag{16}$$

where $d^{i}(t)$ is the in/out-degree of the $i$ th agent.
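Reading the condition of Theorem 2 as $\sigma(t)<\frac{1}{2\lvert\nu_{i}\rvert d^{i}(t)}$ for every agent, an admissible gain can be sketched from purely local quantities. The function name `max_coupling_gain` and the numerical values below are illustrative assumptions; the indices correspond to $\nu_{i}=-\gamma/(\mu_{i}^{2}\alpha^{2})$ with $\alpha=\gamma=1$ and $\mu=(1,2,3)$.

```python
def max_coupling_gain(nu, degree):
    """Supremum of sigma with 1/2 - sigma * |nu_i| * d_i > 0 for all agents i."""
    bounds = [1.0 / (2.0 * abs(v) * d) for v, d in zip(nu, degree)
              if v != 0.0 and d > 0.0]
    # Agents with nu_i = 0 or zero degree impose no restriction.
    return min(bounds, default=float("inf"))


nu = [-1.0, -0.25, -1.0 / 9.0]   # IFP indices of three agents (illustrative)
degree = [1.0, 1.0, 1.0]         # in/out-degrees on a unit-weight directed cycle
sigma_sup = max_coupling_gain(nu, degree)
```

Each agent can evaluate its own bound from $\nu_{i}$ and $d^{i}(t)$; any gain strictly below the smallest bound, such as 0.4 in this example, satisfies the condition.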

Proof.

Let $V=\sum_{i\in\mathcal{N}}V_{i}$, where $V_{i}$ was defined in Lemma 2. By the proof of Lemma 2, $\left\lVert\begin{smallmatrix}\Delta x\\ \Delta\lambda\end{smallmatrix}\right\rVert\rightarrow\infty\Rightarrow V\rightarrow\infty$; thus, $V$ is radially unbounded. Suppose (16) holds. Then, following Lemma 2, the derivative of $V$ gives

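$$\begin{aligned}\dot{V}\leq{}&\sum_{i\in\mathcal{N}}\left(u_{i}^{T}y_{i}-\nu_{i}u_{i}^{T}u_{i}\right)\\ ={}&\sum_{i\in\mathcal{N}}\Big(\sigma(t)\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)(y_{j}-y_{i})^{T}y_{i}-\nu_{i}u_{i}^{T}u_{i}\Big)\\ ={}&\sum_{i\in\mathcal{N}}\Big(-\frac{\sigma(t)}{2}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\left(y_{i}^{T}y_{i}-y_{j}^{T}y_{j}\right)-\frac{\sigma(t)}{2}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\lVert y_{j}-y_{i}\rVert^{2}-\nu_{i}u_{i}^{T}u_{i}\Big)\\ ={}&-\frac{\sigma(t)}{2}\left(\mathbf{1}_{N}^{T}\otimes I_{m}\right)\mathbf{L}(t)\left(Y^{T}Y\right)-\frac{\sigma(t)}{2}\sum_{i\in\mathcal{N}}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\lVert y_{j}-y_{i}\rVert^{2}-\sum_{i\in\mathcal{N}}\nu_{i}\Big\lVert\sigma(t)\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)(y_{j}-y_{i})\Big\rVert^{2}\\ ={}&-\frac{\sigma(t)}{2}\sum_{i\in\mathcal{N}}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\lVert y_{j}-y_{i}\rVert^{2}-\sigma^{2}(t)\sum_{i\in\mathcal{N}}\nu_{i}\Big\lVert\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}^{\frac{1}{2}}(t)\cdot a_{ij}^{\frac{1}{2}}(t)(y_{j}-y_{i})\Big\rVert^{2}\\ \leq{}&-\frac{\sigma(t)}{2}\sum_{i\in\mathcal{N}}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\lVert y_{j}-y_{i}\rVert^{2}-\sigma^{2}(t)\sum_{i\in\mathcal{N}}\nu_{i}\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\lVert y_{j}-y_{i}\rVert^{2}\\ ={}&-\sigma(t)\sum_{i\in\mathcal{N}}\Big(\frac{1}{2}-\sigma(t)\lvert\nu_{i}\rvert d^{i}(t)\Big)\sum_{j\in\mathcal{N}_{i}(t)}a_{ij}(t)\lVert y_{j}-y_{i}\rVert^{2}\\ \leq{}&0,\end{aligned}$$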

where $Y^{T}Y:=col\left(y_{1}^{T}y_{1},\ldots,y_{N}^{T}y_{N}\right)$ , the third equality follows from (12), the fourth equality follows from $\left(\mathbf{1}^{T}_{N}\otimes I_{m}\right)\mathbf{L}(t)=\left(\mathbf{1}^{T}_{N}L(t)\right)\otimes I_{m}=\mathbf{0}$ , the second inequality follows from the Cauchy-Schwarz inequality, and the last inequality follows from (16).

Then $\lim_{t\rightarrow+\infty}V(t)$ exists and is finite. $\dot{V}\leq 0$ implies that the states $\Delta x$, $\Delta\lambda$ are bounded. The system trajectories are bounded within the domain $\mathcal{S}_{0}=\{\left(\Delta x,\Delta\lambda\right)~|~V(t)\leq V(0)\}$. By the first term in $\dot{V}$ (see (33) in the appendix) and the joint connectedness of $\mathcal{G}(t)$, $\dot{V}\equiv 0$ only if $z_{i}=\mathbf{0}$ and $y_{i}=y_{j}$, $\forall i,j\in\mathcal{N}$, where $z_{i}$ was defined in Lemma 2. Define the domain $\mathcal{S}_{z}:=\left\{\left(\Delta x,\Delta\lambda\right)~|~z_{i}=\mathbf{0},y_{i}=y_{j},\forall i,j\in\mathcal{N}\right\}$. Clearly, $\left\|\begin{smallmatrix}\Delta\dot{x}\\ \Delta\dot{\lambda}\end{smallmatrix}\right\|$ is bounded for any bounded $\Delta x$, $\Delta\lambda$. Invoking LaSalle's Invariance Principle for nonautonomous systems [34], we conclude that the system states ultimately reach the domain $\mathcal{S}_{0}\cap\mathcal{S}_{z}$. Then output consensus is achieved by Definition 2. Recalling (12), one has $u=\mathbf{0}$ when output consensus is achieved. Therefore, $\Delta\dot{x}\rightarrow\mathbf{0}$, $\Delta\dot{\lambda}\rightarrow\mathbf{0}$, or equivalently, $\dot{x}\rightarrow\mathbf{0}$, $\dot{\lambda}\rightarrow\mathbf{0}$ as $t\rightarrow\infty$, i.e., the states of (8) asymptotically converge to an equilibrium point.

Since $\lambda-\lambda(0)=\int_{0}^{t}\dot{\lambda}(\tau)d\tau$ , given the initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ ,

[TABLE]

where the third equality follows from the Kronecker product and the last follows from $\mathbf{1}_{N}^{T}L(\tau)=\mathbf{0}$ . Then Lemma 1 holds, implying that the equilibrium point is the unique optimal point. Consequently, the states of Algorithm 1 will asymptotically converge to the optimal point. ∎

Fixed directed topologies are a special case of switching topologies, so convergence is also guaranteed in that case. Readers can refer to our conference paper [1] for more technical details. In addition, an exponential convergence rate can be obtained when the graph is strongly connected, as stated in the following theorem.

Theorem 3.

Suppose that the conditions in Theorem 2 hold. In addition, if the communication digraph $\mathcal{G}$ is fixed, strongly connected and weight-balanced, and the coupling gain $\sigma(t)\geq\underline{\sigma}>0$ for a constant $\underline{\sigma}$ , then the states of Algorithm 1 with initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ will exponentially converge to the optimal point.

Proof.

See the Appendix. ∎

It can be observed from the proof that the exponential convergence also holds over time-varying strongly connected weight-balanced graphs provided $0<\underline{d}\leq d^{i}(t)\leq\bar{d}$ , $\forall i\in\mathcal{N}$ for some constants $\underline{d}$ , $\bar{d}$ .

Remark 3.

Note that only weight-balanced graphs are considered here. The consensus over unbalanced graphs can be guaranteed similarly [25, 27] with $V=\sum_{i\in\mathcal{N}}\xi_{i}V_{i}$ , where $\xi_{i}>0$ is the $i$ th element of the left eigenvector of $L$ associated with the zero eigenvalue. However, the sum of local objective functions will have a shift from the global optimum [4]. Thus, some modification is needed. This problem may be solved by adding a state to estimate the left eigenvector of $L$ (e.g., [18]), which we will leave to future work.

III-D Discussion on the coupling gain

In this subsection, we proceed to discuss the parameters and the design of the coupling gain $\sigma(t)$ for Algorithm 1.

By Lemma 2, the subsystem $\Sigma_{i}$ is IFP regardless of the values of $\alpha,\beta,\gamma$ . Let $\sigma_{e}=\frac{1}{2\max_{i}\{d^{i}(t)|\nu_{i}|\}}$ be the threshold of $\sigma(t)$ when $\max_{i}\{d^{i}(t)|\nu_{i}|\}\neq 0$ . Theorem 2 requires $0<\sigma(t)<\sigma_{e}$ , so there always exists a small enough $\sigma(t)$ that synchronizes the outputs. Thus, $\alpha,\beta,\gamma$ can be arbitrarily chosen within the range specified in Algorithm 1. Intuitively, the larger $\alpha$ , $\beta$ , $\gamma$ are, the faster the convergence rate is. However, the choices of these parameters will affect the IFP index by (13), and hence affect the feasible range of $\sigma(t)$ .

In fact, for proper parameters, there is usually a wide feasible range for the coupling gain. Take, for instance, quadratic objective functions (i.e., linear time-invariant systems in (11)) with $\alpha=\beta=\gamma=1$ and consider them from the perspective of passivity. When the strong convexity parameter $\mu_{i}\gg 1$ , it can be shown by solving the LMI in [33, Lemma 2] that the IFP index $\nu_{i}$ is very close to zero for each agent. Therefore, $\sigma_{e}$ can be arbitrarily large based on the above theorems, which agrees with the observation in [14, Remark 2] that, in numerical examples, $\sigma$ can be chosen to be any positive value for the algorithm to converge, rendering the algorithm fully distributed in practice. However, this is not true in general. When $\mu_{i}\ll 1$ , each agent is IFPS with a large-magnitude index, which indicates that the coupling gain cannot be arbitrarily large. The trajectories of the systems are not guaranteed to converge if $\sigma$ is not within the feasible range. A numerical example is shown in Section V for this discussion. Consequently, the design of the coupling gain is not fully distributed and requires global information such as Laplacian eigenvalues or in/out-degrees.

Theorems 1 and 2 provide sufficient conditions for convergence for Algorithm 1, where the former requires eigenvalues related to the Laplacian and the latter only uses in/out-degrees. The calculation of eigenvalues could be time-consuming, especially in a large-scale network. There are many distributed algorithms to estimate Laplacian eigenvalues, e.g., [35, 36]. However, these algorithms are obviously not as simple as obtaining the maximum value of $d^{i}|\nu_{i}|$ needed in Theorem 2 by locally comparing them among neighboring agents. For example, when the graph is fixed and strongly connected, let $D_{i}(0)=d^{i}|\nu_{i}|$ for agent $i$ , $\forall i\in\mathcal{N}$ , and consider the following iteration,

[TABLE]

Obviously, the result can be obtained in a finite number of iterations since it is simply broadcasting the maximum value along the directed path. The number of iterations needed is no greater than the longest path in the graph, while the longest path is no greater than the total number of nodes. Thus, the condition in Theorem 2 should be easier to verify than the one in Theorem 1 in a distributed sense.
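The broadcast described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the update rule is taken to be "keep the maximum of your own value and your in-neighbors' values", the 4-agent directed ring with unit in/out-degrees is hypothetical, and the names `max_consensus` and `in_neighbors` are not from the paper. The IFP indices are the ones reported for Example 1 in Section V.

```python
# Max-consensus sketch: every agent repeatedly keeps the maximum of its own
# value and the values received from its in-neighbors.
# Hypothetical setup: a 4-agent directed ring with unit edge weights.

def max_consensus(values, in_neighbors, num_iters):
    """Run num_iters synchronous max-consensus steps and return the result."""
    current = list(values)
    for _ in range(num_iters):
        current = [
            max([current[i]] + [current[j] for j in in_neighbors[i]])
            for i in range(len(current))
        ]
    return current

# IFP indices from Example 1; unit in/out-degrees are an assumption.
nu = [-0.31, -0.49, -1.0, -0.68]
degree = [1.0, 1.0, 1.0, 1.0]
D0 = [d * abs(v) for d, v in zip(degree, nu)]

# Directed ring 0 -> 1 -> 2 -> 3 -> 0: agent i listens to agent i - 1.
in_neighbors = {0: [3], 1: [0], 2: [1], 3: [2]}

# N - 1 steps are enough for the maximum to travel around the ring.
D_final = max_consensus(D0, in_neighbors, num_iters=len(D0) - 1)
sigma_e = 1.0 / (2.0 * max(D_final))
print(D_final, sigma_e)
```

With these assumed unit weights, every agent ends up holding the value 1.0 after three steps, and the resulting threshold is 0.5, which matches the value $\sigma_{e}=0.50$ reported for Example 1.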

Note that applying a time-varying $\sigma(t)$ can be beneficial. When the graph is strongly connected, all agents should select an identical coupling gain $\sigma$ to ensure optimality. However, when the graph $\mathcal{G}(t)$ is not strongly connected at some time $t$ , it is hard to communicate and obtain an identical $\sigma$ for all agents. In this case, we can still use (19) to choose different coupling gains for agents in different disjoint subgraphs without affecting convergence to the optimal point. Suppose that the node set $\mathcal{N}$ consists of $q(t)\geq 1$ isolated subsets and let $\mathcal{N}^{k}(t)$ denote the $k$ th subset at time $t$ . At this time, by the weight-balanced property, all the disjoint subgraphs of $\mathcal{G}(t)$ are strongly connected respectively [19, 1]. Then each subgroup of agents at time $t$ is considered as an isolated system, and thus convergence is guaranteed by Theorem 2. Following similar lines of the proof of Theorem 2, we have $\sum_{i\in\mathcal{N}^{k}(t)}\lambda_{i}(t)=\sum_{i\in\mathcal{N}^{k}(t)}\lambda_{i}(0)$ and hence

[TABLE]

By Lemma 1 the optimality is preserved.

The above discussion is summarized as the following corollary.

Corollary 1.

Under Assumptions 1 and 2, the states of Algorithm 1 with initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ will converge to the optimal point and solve problem (3) if the coupling gain $\sigma_{i}(t)$ for agent $i$ satisfies

[TABLE]

and $\sigma_{i}(t)=\sigma_{j}(t)$ , $\forall i,j\in\mathcal{N}^{k}(t)$ , for all $k=1,\ldots,q(t)$ .
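As a rough illustration of the corollary, the sketch below assigns one coupling gain per disjoint subgraph. It assumes that the bound for subgraph $k$ has the same form as the threshold $\sigma_{e}$ , with the maximum taken only over the agents of that subgraph. The helper names, the `safety` factor, and the two-pair graph are hypothetical choices for illustration.

```python
# Sketch: choose one coupling gain per connected component of the current graph.
# The bound sigma_k < 1 / (2 * max_{i in component k} d_i * |nu_i|) is an
# assumption modeled on the threshold sigma_e; all names are hypothetical.

def components(num_nodes, edges):
    """Group nodes into weakly connected components with union-find."""
    parent = list(range(num_nodes))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, _weight in edges:
        parent[find(i)] = find(j)

    groups = {}
    for i in range(num_nodes):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def component_gains(num_nodes, edges, nu, safety=0.9):
    """Return a per-agent gain that is identical inside each component."""
    out_degree = [0.0] * num_nodes
    for i, _j, weight in edges:
        out_degree[i] += weight
    gains = [None] * num_nodes
    for group in components(num_nodes, edges):
        worst = max(out_degree[i] * abs(nu[i]) for i in group)
        # An isolated agent (or a zero index) imposes no bound; use a default.
        gain = safety / (2.0 * worst) if worst > 0 else 1.0
        for i in group:
            gains[i] = gain
    return gains

# Hypothetical mode: agents {0, 1} and {2, 3} form two separate balanced pairs.
edges = [(0, 1, 1.0), (1, 0, 1.0), (2, 3, 1.0), (3, 2, 1.0)]
nu = [-0.31, -0.49, -1.0, -0.68]
gains = component_gains(4, edges, nu)
print(gains)
```

Since the disjoint subgraphs of a weight-balanced digraph are strongly connected, grouping by weak connectivity is sufficient here. Each group could obtain its own maximum by the local max-broadcast discussed earlier.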

IV Distributed Derivative Feedback Algorithm

Note that Algorithm 1 still depends on the global information $\max_{i}\{d^{i}(t)|\nu_{i}|\}$ . In this section, we propose a fully distributed derivative feedback algorithm based on the passivation of Algorithm 1.

IV-A Passivation And Derivative Feedback

The derivative feedback is widely used in distributed algorithms to ensure convergence or to modify algorithms for directed graphs [37, 17, 38, 8]. In this subsection, we design a new distributed algorithm and reveal that the input-feedforward passivation of IFPS agents through an internal feedforward loop is a form of derivative feedback.

Let us consider again each error subsystem $\Sigma_{i}$ in (11). Suppose $\Sigma_{i}$ is IFP( $\nu_{i}$ ), $\forall i\in\mathcal{N}$ , then we apply a passivation through feedforward of input. Define a new output as $\tilde{y}_{i}$ for the $i$ th subsystem

[TABLE]

where $\nu_{i}\leq 0$ is the IFP index of agent $i$ . The transformation is shown in Figure 1. Obviously, the transformed system $\tilde{\Sigma}_{i}$ is passive.

Lemma 3.

Under Assumption 1, each subsystem $\tilde{\Sigma}_{i}$ defined by (11) and (21) is passive from input $u_{i}$ to output $\tilde{y}_{i}$ with respect to the storage function (2).

Proof.

Adopt the same storage function (2), then following similar lines of the proof of Lemma 2, one has $V_{i}\geq 0$ and

[TABLE]

∎
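The mechanism behind this passivation can be summarized in one line. The sketch below assumes the standard form of the IFP inequality, $\dot{V}_{i}\leq u_{i}^{T}y_{i}-\nu_{i}u_{i}^{T}u_{i}$ , and assumes that the new output is $\tilde{y}_{i}=y_{i}-\nu_{i}u_{i}$ , which is consistent with the matrix $\left(I-\sigma(t)\mathbf{L}(t)\nu\right)$ that appears in the compact form of Algorithm 2.

```latex
\dot{V}_{i}
  \;\leq\; u_{i}^{T}y_{i}-\nu_{i}u_{i}^{T}u_{i}
  \;=\; u_{i}^{T}\left(y_{i}-\nu_{i}u_{i}\right)
  \;=\; u_{i}^{T}\tilde{y}_{i}.
```

Under these assumptions, the shortage of passivity measured by $\nu_{i}\leq 0$ is absorbed into the output, so the pair $(u_{i},\tilde{y}_{i})$ satisfies the usual passivity inequality.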

Adopt the diffusive couplings of $\tilde{y}_{i}$ , $i\in\mathcal{N}$ as new inputs,

[TABLE]

then a novel distributed algorithm is constructed as follows.

By eliminating $\tilde{y}_{i}$ and $u_{i}$ with (24c) and (24b), respectively, Algorithm 2 can be rewritten in a compact form

[TABLE]

where $\nu=\text{diag}\{\nu_{1},\ldots,\nu_{N}\}\otimes I_{m}$ . We can observe that there exist derivative feedback terms in (25). Since each agent only requires information from neighboring agents, Algorithm 2 is a distributed algorithm.

Before proceeding to the next step, note that the diffusive couplings of the new outputs bring algebraic loops [39, Section 8.3] into the overall closed-loop system. Thus, we have to check whether the feedback interconnection is well-posed.

The equation (25b) can be rewritten as

[TABLE]

Notice that $\left(I-\sigma(t)\mathbf{L}(t)\nu\right)$ should be nonsingular such that system (25) can be rewritten in the following explicit form, ensuring the well-posedness of the feedback interconnection [40].

[TABLE]

When the IFP indices are the same, e.g., $\nu_{i}=\bar{\nu}$ , $i\in\mathcal{N}$ , where $\bar{\nu}$ was defined in Theorem 1, then the nonsingularity of $\left(I-\sigma(t)\bar{\nu}\mathbf{L}(t)\right)$ is obvious following a matrix decomposition. However, when the IFP indices take different values, more analysis of this term is needed. We propose the following lemma.

Lemma 4.

The matrix $\left(I-\sigma(t)\mathbf{L}(t)\nu\right)$ is nonsingular.

Proof.

See the appendix. ∎
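To make the well-posedness concrete, the following sketch resolves the algebraic loop numerically for a small case. It is an illustration under assumptions: scalar agents ( $m=1$ ), a hypothetical 4-agent directed ring with unit weights, the passivated output taken as $\tilde{y}=y-\nu u$ , and arbitrary output values `y`. The IFP indices are those reported for Example 2 in Section V, and the plain Gaussian elimination routine `solve` is a helper written for this sketch.

```python
# Sketch: resolve the algebraic loop u = -sigma * L * (y - nu * u) numerically.
# Assumptions: scalar agents (m = 1), a 4-agent directed ring with unit weights,
# and the passivated output y_tilde = y - nu * u.

def mat_vec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

# Laplacian of the ring 0 -> 1 -> 2 -> 3 -> 0 (row and column sums are zero).
L = [[1, 0, 0, -1], [-1, 1, 0, 0], [0, -1, 1, 0], [0, 0, -1, 1]]
nu = [-89.96, -37.77, -20.0, -12.0]   # IFP indices reported in Example 2
sigma = 0.1                            # any positive gain
y = [1.0, -2.0, 0.5, 3.0]              # arbitrary illustrative outputs

n = len(y)
# A = I - sigma * L * diag(nu); column j of L is scaled by nu[j].
A = [[(1.0 if i == j else 0.0) - sigma * L[i][j] * nu[j] for j in range(n)]
     for i in range(n)]
rhs = [-sigma * v for v in mat_vec(L, y)]
u = solve(A, rhs)

# Check that u satisfies the original implicit relation.
y_tilde = [yi - ni * ui for yi, ni, ui in zip(y, nu, u)]
residual = [ui + sigma * v for ui, v in zip(u, mat_vec(L, y_tilde))]
print(max(abs(r) for r in residual))
```

The solve step succeeds because the matrix is nonsingular, and the residual of the implicit relation is at the level of floating-point round-off, so the input $u$ is uniquely determined by the outputs $y$ in this instance.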

IV-B Algorithm Over UJSC Balanced Topologies

Next, we derive the following theorem stating that Algorithm 2 is fully distributed without global coordination.

Theorem 4.

Under Assumptions 1 and 2, the states of Algorithm 2 with initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ will converge to the optimal point and solve problem (3) given any coupling gain $\sigma(t)>0$ .

Proof.

When $\dot{\lambda}=\mathbf{0}$ , system (25) reduces to system (8), meaning that the derivative term does not affect the equilibrium set of system (7). Besides, given the initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ ,

[TABLE]

where the third equality follows from rules of the Kronecker product and the initial condition, the last follows from $\mathbf{1}_{N}^{T}L(\tau)=\mathbf{0}$ . It can also be shown by using the explicit expression (27b) of $\dot{\lambda}$ that $(\mathbf{1}_{N}\otimes I_{m})^{T}\lambda=\mathbf{0}$ , satisfying Lemma 1. Thus, the equilibrium point of Algorithm 2 with initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ is still the optimal point to the distributed optimization problem (3).

The information of $(x_{i}^{*},\lambda_{i}^{*})$ is not required for exchange. Then Algorithm 2 can be implemented by output feedback interconnections of virtual agents $\tilde{\Sigma}_{i},~{}\forall i\in\mathcal{N}$ . Since $\tilde{\Sigma}_{i}$ is passive from input $u_{i}$ to output $\tilde{y}_{i}$ by Lemma 3, the consensus analysis among passive agents is similar to that among IFP agents with IFP indices being zero. Specifically, let $V=\sum_{i\in\mathcal{N}}V_{i}$ , where $V_{i}$ was defined in (2). Substituting (23) into (22), we obtain

[TABLE]

where $\tilde{y}=\text{col}(\tilde{y}_{1},\ldots,\tilde{y}_{N})$ and $u=\text{col}(u_{1},\ldots,u_{N})=-\sigma(t)\mathbf{L}(t)\tilde{y}$ and the last inequality follows from the fact that $\mathbf{L}(t)\geq 0$ . Following similar lines of the proof of Theorem 2, the states of Algorithm 2 with initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ will asymptotically converge to the optimal point. ∎

Similarly, Theorem 4 can be directly applied to fixed weight-balanced strongly connected digraphs as a special case of UJSC topologies, as stated in the following corollary.

Corollary 2.

Suppose the communication digraph $\mathcal{G}$ is fixed, strongly connected and weight-balanced. Then, under Assumption 1, the states of Algorithm 2 with initial condition $\sum_{i\in\mathcal{N}}\lambda_{i}(0)=\mathbf{0}$ will converge to the optimal point for any $\sigma(t)>0$ .

IV-C Discussion on the Derivative Feedback Algorithm

Although Algorithm 2 requires agents to exchange additional information, such as derivatives of states, it has significant advantages.

Compared with most works on directed topologies, Algorithm 2 is robust against randomly changing weight-balanced digraphs with any positive coupling gain and is independent of any global information. Since the time interval $T$ for UJSC graphs is not used in the proofs, $\mathcal{G}(t)$ can be relaxed to being strongly connected in a probabilistic sense, namely, the algorithm is applicable over gossip-like balanced digraphs [28]. It can be observed from Figure 1 that this modified algorithm can be easily realized by adding a local input feedforward loop to each subsystem $\Sigma_{i}$ . Since the input $u_{i}$ of the $i$ th virtual agent is the same as the input of the real agent $i$ , the input feedforward of virtual agents is actually the same as the input feedforward of real agents. Also, note that the passivation is achieved by each agent locally, so no global information is needed beforehand. Thus, Algorithm 2 is a fully distributed algorithm.

It might be difficult to characterize the exact convergence rate of Algorithm 2 due to the existence of the derivative terms. Nevertheless, we will show in a numerical example in Section V that its empirical convergence rate is similar to the one of Algorithm 1.

V Numerical Examples

Example 1

We present a numerical example to show effects of the proposed algorithms over directed and switching topologies. Consider a network of $4$ agents possessing the following local objective functions $f_{i}:\mathbb{R}\rightarrow\mathbb{R},~{}i=1,2,3,4$ , respectively.

[TABLE]

By calculation, we obtain $\mu_{1}=l_{1}=0.8$ ; $\mu_{2}=1.20$ , $l_{2}=1.36$ ; $\mu_{3}=1$ , $l_{3}=3$ ; $\mu_{4}=1.76$ , $l_{4}=3.8$ . Let $\alpha=\beta=\gamma=1$ . Then, we obtain that each subsystem in (11) is IFPS with $\nu_{1}=-0.31$ , $\nu_{2}=-0.49$ , $\nu_{3}=-1$ , and $\nu_{4}=-0.68$ . Next, we consider two cases of topologies.

Case 1: the agents are connected through a ring graph that is strongly connected and weight-balanced, as shown in Figure 2.

Case 2: for every $0.1$ second, the graph $\mathcal{G}(t)$ switches randomly among three modes as shown in Figure 3.

The threshold coupling gains are obtained as $\sigma_{e}=0.50$ in (16) for both cases. We implement Algorithms 1 and 2 using the solver ode45 with auto-adjust variable stepsize in MATLAB over these two cases, and with $x_{i}(0)\in[0,1]$ , $\lambda(0)=\mathbf{0}$ satisfying the initial condition. Let $\sigma(t)=0.35+0.1\cos(t)<\sigma_{e}$ in Case 1. To illustrate Corollary 1, we adopt different coupling gains for different disjoint subgraphs in Case 2. Specifically, let $\sigma_{1}(t)=\sigma_{2}(t)=0.3+\sin(t)$ and $\sigma_{3}(t)=\sigma_{4}(t)=0.35+\cos(t)$ for Mode 1; $\sigma_{2}(t)=\sigma_{3}(t)=0.3+\sin(t)$ and $\sigma_{1}(t)=\sigma_{4}(t)=0.35+\cos(t)$ for Mode 2; $\sigma_{2}(t)=\sigma_{4}(t)=0.3+\sin(t)$ and $\sigma_{1}(t)=\sigma_{3}(t)=0.35+\cos(t)$ for Mode 3.

The state trajectories are shown in Figures 4, 5 and 6. It can be observed that the trajectories of $x_{i}$ , $i=1,2,3,4$ asymptotically converge to the optimal solution $x_{i}^{*}=0.1601,~{}i=1,2,3,4$ , in both cases. The residuals $\sum_{i=1}^{4}\|x_{i}-x_{i}^{*}\|$ of both algorithms over the time-invariant graph in Case 1 are shown in Figure 6. We can observe that Algorithm 2 has a convergence rate very similar to Algorithm 1 despite the existence of derivative terms, and the residuals of both algorithms decrease exponentially.

Example 2

We present another example to compare the two algorithms. Consider a network of $4$ agents interconnected through the same graph as Figure 2. The local objective functions are

[TABLE]

Let $\alpha=\beta=\gamma=1$ . By solving the LMI in [33, Lemma 2] with the YALMIP Toolbox [41], we obtain that the agents are IFPS with $\nu_{1}=-89.96$ , $\nu_{2}=-37.77$ , $\nu_{3}=-20.00$ and $\nu_{4}=-12.00$ . Then by (16), the coupling gain threshold is obtained as $\sigma_{e}=0.0056$ . According to Theorem 2, when $\sigma<\sigma_{e}$ , the trajectories of Algorithm 1 will converge to the optimal point. Then, we implement the two distributed algorithms using the solver ode45 with auto-adjust variable stepsize in MATLAB, and with $x_{i}(0)\in[2,3]$ , $\lambda(0)=\mathbf{0}$ satisfying the initial condition. The trajectories of the two algorithms asymptotically converge to the optimal solution $x_{i}^{*}=2.857,~i=1,2,3,4$ when $\sigma=0.005\in(0,\sigma_{e})$ , as shown in Figure 7.
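The threshold quoted above can be checked with a short calculation. The sketch assumes that every agent on the ring has in/out-degree 1, which is not stated explicitly here, and uses the expression $\sigma_{e}=\frac{1}{2\max_{i}\{d^{i}|\nu_{i}|\}}$ from Section III-D.

```python
# Worked check of the threshold sigma_e = 1 / (2 * max_i d_i * |nu_i|).
# Assumption: every agent on the ring has in/out-degree 1.
nu = [-89.96, -37.77, -20.00, -12.00]   # IFP indices reported for Example 2
degree = [1.0, 1.0, 1.0, 1.0]           # assumed unit in/out-degrees

sigma_e = 1.0 / (2.0 * max(d * abs(v) for d, v in zip(degree, nu)))
print(round(sigma_e, 4))  # prints 0.0056
```

Under this assumption the bound is set entirely by agent 1, whose index has the largest magnitude, so the chosen gain $\sigma=0.005$ lies just inside the feasible range.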

When the coupling gain is outside the feasibility range, Algorithm 1 is not guaranteed to converge. The error system (11) is a linear system:

[TABLE]

where $F=\text{diag}\{0.1,0.15,0.2,0.25\}$ . Clearly, it is unstable when $\sigma\in[0.1,0.14]$ , which accords with our discussion in Section III-D. On the other hand, Algorithm 2 should be valid with any positive $\sigma$ by Theorem 4. To show this, we compare the two distributed algorithms with the same settings except $\sigma=0.1\notin(0,\sigma_{e})$ . It can be observed from Figure 8 that Algorithm 1 is unstable while the trajectories of $x_{i}$ , $\forall i\in\mathcal{N}$ in Algorithm 2 asymptotically converge to the optimal solution. Here the IFP indices are different and agents are passivated locally in Algorithm 2.

Example 3

We present an example of $N=100$ agents to show the scalability of the proposed algorithms. The local objective functions are $f_{i}(\mathrm{x})=b_{i}\mathrm{x}^{2}+c_{i}\mathrm{x}$ , $\forall i\in\mathcal{N}$ , where $b_{i}\in[0.5,0.6]$ , $c_{i}\in[-1,0]$ are randomly generated with uniform distributions. The weight-balanced digraph $\mathcal{G}(t)$ is also randomly generated per second, where the probability that the edge $(i,j)\in\mathcal{E}$ exists is $0.005$ , $\forall i\neq j,~{}i,j\in\mathcal{N}$ , and the in/out-degree $d^{i}$ is no greater than $\bar{d}=2.5$ , $\forall i\in\mathcal{N}$ . Let $\alpha=\gamma=\beta=1$ , then by calculation, the IFP indices satisfy $\nu_{i}\geq-2$ , $\forall i\in\mathcal{N}$ . Thus, we can roughly obtain $\sigma\leq 0.1$ for Algorithm 1. For Algorithm 2, there is no restriction on the coupling gain, which we set as $\sigma=1$ . We implement the two algorithms using ode45 in MATLAB with $x_{i}(0)\in[0,1]$ , $\lambda(0)=\mathbf{0}$ , and obtain Figure 9. It can be observed that the proposed algorithms have good scalability.

Note that an important feature of the algorithms is that the graph Laplacian $L(t)$ at any time can be very sparse. The time interval $T$ in Assumption 2 is imposed only to ensure the convergence performance and was not used in the proofs. In practice, our results hold as long as the graph $\mathcal{G}(t)$ is strongly connected in a probabilistic sense. In this example, $L(t)$ usually has a zero eigenvalue with multiplicity greater than $20$ , which greatly reduces communication.

VI Conclusion

This paper has investigated a distributed optimization problem via input feedforward passivity. An input-feedforward-passivity framework has been adopted to construct a distributed algorithm that is applicable over weight-balanced digraphs. Moreover, a novel distributed derivative feedback algorithm, which is fully distributed, has been proposed via the input-feedforward passivation. The proposed algorithms have been studied over directed and uniformly jointly strongly connected balanced topologies. Convergence conditions of a suitable coupling gain for the IFP-based distributed algorithm have been derived, while it has been shown that the distributed derivative feedback algorithm is robust against randomly changing weight-balanced digraphs with any positive coupling gain and without knowing any global information.

It is worth mentioning that there are also some limitations in this work and several directions can be considered in future work. For instance, requiring continuous communication is difficult in practice, which can be resolved by applying discrete-time communication or discretization [42]. Also, one could extend the IFP-based distributed algorithms to solve constrained problems or enhance robustness. Lastly, one could consider relaxing the strong convexity requirement using more advanced feedback techniques.

Acknowledgments

The authors would like to thank the Associate Editor and the Reviewers for their valuable comments. The first author would also like to thank Shunya Yamashita and Prof. Takeshi Hatanaka from Tokyo Tech for their useful suggestions.

Appendix

-A Proof of Lemma 2

Under Assumption 1, one has $\nabla f_{i}(x_{i})-\nabla f_{i}(x_{i}^{*})=B_{x_{i}}\left(x_{i}-x_{i}^{*}\right)$ , where

[TABLE]

is a positive definite matrix satisfying [31, Lemma 1]

[TABLE]

Clearly, $B_{x_{i}}$ is invertible and $B_{x_{i}}^{-1}$ is also positive definite. Then, the $i$ th subsystem in (11) can be written as

[TABLE]

Since $\dot{x}_{i}^{*}=\dot{\lambda}_{i}^{*}\equiv\mathbf{0}$ , one has $\dot{x}_{i}=\Delta\dot{x}_{i}$ and $\dot{\lambda}_{i}=\Delta\dot{\lambda}_{i}$ . Denote $z_{i}=\alpha\left(\nabla f_{i}(x_{i})-\nabla f_{i}(x_{i}^{*})\right)+\Delta\lambda_{i}$ , or equivalently,

[TABLE]

Let us consider the storage function

[TABLE]

where $\eta_{i}$ is a positive parameter such that $\eta_{i}>\frac{1}{\mu_{i}\alpha\gamma}$ . By the strong convexity of $f_{i}$ , one has

[TABLE]

Then, by substituting the above inequality into $V_{i}$ , one gets

[TABLE]

where $R_{i}=\begin{bmatrix}\frac{\eta_{i}}{2}I+\frac{\mu_{i}B_{x_{i}}^{-2}}{2\alpha\gamma}-\frac{B_{x_{i}}^{-1}}{\alpha\gamma}&\frac{\eta_{i}}{2}I-\frac{B_{x_{i}}^{-1}}{2\alpha\gamma}\\ *&\frac{\eta_{i}}{2}I\end{bmatrix}$ . By the Schur complement [43, Proposition 8.2.4], $R_{i}>0$ if and only if $\frac{\eta_{i}}{2}>0$ and

[TABLE]

Select $\eta_{i}$ such that $\eta_{i}>\frac{1}{\mu_{i}\alpha\gamma}$ , then the above inequality holds, and $R_{i}>0$ . Hence, $V_{i}>0$ and $V_{i}=0$ if and only if $(x_{i},\lambda_{i})=(x_{i}^{*},\lambda_{i}^{*})$ .

It follows from the gradient of $f_{i}(x_{i})$ that

[TABLE]

where

[TABLE]

Then the derivative of $V_{i}$ satisfies

[TABLE]

where the first inequality follows from the strong convexity of $f_{i}$ , the last inequality follows from the inequality of arithmetic and geometric means and $\eta_{i}>\frac{1}{\mu_{i}\alpha\gamma}$ , and $\nu_{i}=-\frac{\|g_{i}\|^{2}}{4\left(\mu_{i}\eta_{i}\alpha-\frac{1}{\gamma}\right)}\leq 0$ . Since parameters in $g_{i}$ and $\nabla^{2}f_{i}(x_{i})$ are bounded, given finite $\eta_{i}$ , a constant $\nu_{i}$ can be obtained. Thus, the subsystem $\Sigma_{i}$ is IFP( $\nu_{i}$ ).

-B Proof of Theorem 3

Consider the case where $\beta>0$ . Adopt the Lyapunov function candidate $V_{e}=(1-\delta)\sum_{i\in\mathcal{N}}V_{i}+\frac{\delta}{2\beta}\left\|\Delta x\right\|^{2}$ , where $V_{i}$ was defined in (2) and $0<\delta<1$ is to be decided.

Let us look at the storage function $V_{i}$ again. It has been proven in the proof of Lemma 2 that

[TABLE]

where $R_{i}=\begin{bmatrix}\frac{\eta_{i}}{2}I+\frac{\mu_{i}B_{x_{i}}^{-2}}{2\alpha\gamma}-\frac{B_{x_{i}}^{-1}}{\alpha\gamma}&\frac{\eta_{i}}{2}I-\frac{B_{x_{i}}^{-1}}{2\alpha\gamma}\\ *&\frac{\eta_{i}}{2}I\end{bmatrix}$ . Denote

[TABLE]

To obtain the smallest eigenvalue of $R_{i}$ , let us solve $\text{det}(R_{i}-sI)=0$ . By the Schur complement [43, Proposition 8.2.3], it is equivalent to solving

[TABLE]

where $\text{det}\left(\left(\frac{\eta_{i}}{2}-s\right)I\right)\neq 0$ in the above since $\frac{\eta_{i}}{2}$ is not an eigenvalue of $R_{i}$ . Notice that $0\leq\frac{1}{l_{i}}I\leq B_{x_{i}}^{-1}\leq\frac{1}{\mu_{i}}I$ by (29), then there exists an invertible matrix $T_{x_{i}}\in\mathbb{R}^{m\times m}$ such that

[TABLE]

where $\Lambda_{x_{i}}\in\mathbb{R}^{m\times m}$ is a diagonal matrix. Then the determinant equation above becomes

[TABLE]

where $\frac{1}{l_{i}}\leq r_{j}\leq\frac{1}{\mu_{i}}$ is the $j$ th diagonal element of $\Lambda_{x_{i}}$ . Since $\eta_{i}+\frac{\mu_{i}r_{j}^{2}}{2\alpha\gamma}-\frac{r_{j}}{\alpha\gamma}>0$ and $\mu_{i}\eta_{i}-\frac{1}{\alpha\gamma}>0$ by Lemma 2, the roots $s_{j}^{\pm}$ , $\forall j$ , of (35) are positive. Solving (35), we obtain

[TABLE]

Denote $s_{j}^{-}(r_{j})$ as a function of $r_{j}$ with an abuse of notation. The smallest eigenvalue of $R_{i}$ satisfies

[TABLE]

then $V_{i}\geq\varepsilon_{i}\left\|\begin{smallmatrix}\alpha B_{x_{i}}\Delta x_{i}\\ \Delta\lambda_{i}\end{smallmatrix}\right\|^{2}.$

Next, let us derive the upper bound of $V_{i}$ . By the strong convexity,

[TABLE]

Substituting the above inequality into $V_{i}$ , we have

[TABLE]

where $M_{i}=\begin{bmatrix}\frac{\eta_{i}}{2}I-\frac{\mu_{i}B_{x_{i}}^{-2}}{2\alpha\gamma}&\frac{\eta_{i}}{2}I-\frac{B_{x_{i}}^{-1}}{2\alpha\gamma}\\ *&\frac{\eta_{i}}{2}I\end{bmatrix}$ . Denote the matrix

[TABLE]

By similar application of the Schur complement [43, Proposition 8.2.4], we can obtain that $\eta_{i}I-M_{i}\geq 0$ .

Let $\eta_{i}=\frac{2}{\mu_{i}\alpha\gamma}$ , satisfying Lemma 2, then $M_{i}\leq\frac{2}{\mu_{i}\alpha\gamma}I$ . Moreover, $s_{j}^{-}(r_{j})$ is monotonically increasing with respect to $r_{j}$ . Thus, (36) leads to

[TABLE]

Denote $\mu=\min_{i\in\mathcal{N}}\{\mu_{i}\}$ , $l=\max_{i\in\mathcal{N}}\{l_{i}\}$ and $B_{x}=\text{diag}\left\{B_{x_{1}},\ldots,B_{x_{N}}\right\}$ . It satisfies that $\nabla f(x)-\nabla f(x^{*})=B_{x}\Delta x$ and $0\leq\left\|\Delta x\right\|^{2}\leq\frac{1}{\alpha^{2}\mu^{2}}\left\|\alpha B_{x}\Delta x\right\|^{2}$ due to (29). By the definition of $V_{e}$ , we obtain

[TABLE]

Let $\eta_{i}=\frac{2}{\mu_{i}\alpha\gamma}$ , then $\nu_{i}=-\frac{\gamma}{4}\max_{x_{i}}\{\|g_{i}\|^{2}\}$ by (13), where $g_{i}$ was defined in (33). Since $\nu_{i}\leq 0$ by Lemma 2, let us assume without loss of generality that $\max_{i}\{d^{i}|\nu_{i}|\}\neq 0$ . Choose a constant $\rho\in\left(0,\frac{\delta\max_{i\in\mathcal{N}}\{|\nu_{i}|d^{i}\}}{\max_{i\in\mathcal{N}}\{d^{i}\}}\right)$ , where $d^{i}$ is the in/out-degree, and then denote

[TABLE]

Substituting $\eta_{i}=\frac{2}{\mu_{i}\alpha\gamma}$ into (33), the time derivative of $(1-\delta)V$ satisfies

[TABLE]

where $z=\text{col}(z_{1},\ldots,z_{N})$ , the second inequality follows from (38) and the inequality of arithmetic and geometric means similarly to (33), and the last inequality follows from the definition of $\nu_{i}$ . The above manipulation provides a term containing $\|z\|^{2}$ in order to prove negative semi-definiteness of $\dot{V}_{e}$ later.

Therefore, the time derivative of $V_{e}$ satisfies

[TABLE]

Replacing $\nu_{i}$ by $\left((1-\delta)\nu_{i}-\rho\right)$ in (17), we have

[TABLE]

where

[TABLE]

and $\varphi>0$ by the definition of $\rho$ and condition (16).

Let us define $\bar{x}$ as the stacked vector of the average value of $x_{i}$ , i.e., $\bar{x}:=\mathbf{1}_{N}\otimes\left(\frac{1}{N}\left(\mathbf{1}_{N}\otimes I_{m}\right)^{T}x\right)$ . Observe that for any vector $v\in\mathbb{R}^{m}$ , $\left(\mathbf{1}_{N}\otimes v\right)^{T}\left(x-\bar{x}\right)=0$ . In addition, $\left(\mathbf{1}_{N}\otimes v\right)^{T}(\mathbf{L}^{T}+\mathbf{L})=\mathbf{1}_{N}^{T}(L^{T}+L)\otimes v=\mathbf{0}$ , which implies that $\left(x-\bar{x}\right)$ is orthogonal to all eigenvectors of $\mathbf{L}^{T}+\mathbf{L}$ associated with zero eigenvalues. Consequently, we have

[TABLE]

where the equality follows from $\mathbf{1}_{N}^{T}{L}=\mathbf{0}$ , ${L}\mathbf{1}_{N}=\mathbf{0}$ , and $s_{2}:=s_{2}\left(L+L^{T}\right)$ is the smallest nonzero eigenvalue of $L+L^{T}$ .

We can also observe that

[TABLE]

where the second equality follows from the Kronecker product and the last equality is due to (III-C). Consequently,

[TABLE]

where the first inequality follows from (40), (-B), the second inequality follows from the Young’s inequality with $\theta>0$ , the equality follows from (30), and the last inequality follows from (29). Observe that $\dot{V}_{e}$ is negative definite if $Q>0$ and $\left(\varphi s_{2}-\frac{\delta\theta}{2\beta}\right)>0$ , i.e., the following conditions hold,

[TABLE]

Choose $\theta=\alpha l$ , then the above conditions become $\delta<\alpha\beta l\phi$ and $\delta<\frac{2\beta\varphi s_{2}}{\alpha l}$ . Though $\phi(\delta)$ , $\varphi(\delta)$ are functions of $\delta$ , it is obvious that $\phi(\delta),\varphi(\delta)>0$ when $\delta\rightarrow 0$ . Then there always exists a small enough $\delta\in\left(0,\min\left\{\alpha\beta l\phi,\frac{2\beta\varphi s_{2}}{\alpha l}\right\}\right)$ such that the above conditions are satisfied and $\dot{V_{e}}$ is negative definite.

Next, by calculations, $\dot{V_{e}}\leq-s_{1}(Q)\left\|\begin{smallmatrix}\alpha B_{x}\Delta x\\ \Delta\lambda\end{smallmatrix}\right\|^{2}$ , where $s_{1}(Q)=\frac{2\phi+\frac{\delta}{2\alpha\beta l}-\sqrt{4\phi^{2}+\frac{9\delta^{2}}{4\alpha^{2}\beta^{2}l^{2}}}}{2}>0$ is the smallest eigenvalue of $Q$ . Then, by the exponential stability theorem [29, Theorem 4.10], we have $V_{e}(t)\leq V_{e}(0)e^{-\epsilon t}$ , where

[TABLE]

due to (-B), and

[TABLE]

Recall that $\left\|\Delta x\right\|\leq\frac{1}{\alpha\mu}\left\|\alpha B_{x}\Delta x\right\|\leq\frac{l}{\mu}\left\|\Delta x\right\|$ and $B_{x}\Delta x=\mathbf{0}$ if and only if $\Delta x=\mathbf{0}$ due to (29). Then we finally obtain

[TABLE]

for any $t\geq 0$ .

The cases for $\beta<0$ and $\beta=0$ can be proved similarly by taking $V_{e}=(1+\delta)V-\frac{\delta}{2\beta}\left\|\Delta x\right\|^{2}$ and $V_{e}=V+\frac{\delta}{2}\left\|\Delta x\right\|^{2}$ , respectively.

-C Proof of Lemma 4

To prove the nonsingularity, we first give some lemmas as follows.

Lemma 5.

Given a real matrix $Q\in\mathbb{R}^{m\times m}$ , the matrix $(I+Q)$ is invertible if and only if $-1$ is not an eigenvalue of $Q$ .

Proof.

$(I+Q)$ is invertible if and only if $\det(I+Q)\neq 0$ . In other words, the characteristic polynomial $p_{-Q}(s)$ at $1$ is nonzero, i.e., $p_{-Q}(1)=\det(I+Q)\neq 0$ . Since eigenvalues are the roots of the characteristic polynomial, it means that $1$ is not an eigenvalue of $-Q$ , or equivalently, $-1$ is not an eigenvalue of $Q$ . ∎

Lemma 6.

Let $L\in\mathbb{R}^{N\times N}$ be the Laplacian matrix for a weight-balanced graph $\mathcal{G}$ , $M\in\mathbb{R}^{N\times N}$ be a diagonal matrix and $M\geq 0$ , then the eigenvalues of $ML$ or $LM$ have non-negative real parts.

Proof.

Let us consider the case of $ML$ . Since $M$ is diagonal, by direct calculation,

[TABLE]

Since $M\geq 0$ and $\mathcal{G}$ is weight-balanced, we have $M_{ii}L_{ii}=\sum_{j\neq i}\left|M_{ii}L_{ij}\right|$ for all $i$ , implying that $ML$ is diagonally dominant. By the Gershgorin circle theorem [43, Fact 4.10.16], the real parts of the eigenvalues of $ML$ are non-negative. The case of $LM$ can be proved similarly. ∎
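The Gershgorin argument can be checked on a small instance. The sketch below uses a hypothetical 3-node weight-balanced Laplacian and a hypothetical diagonal matrix $M\geq 0$ , with row discs for $ML$ and column discs for $LM$ . The function name `gershgorin_left_edges` is illustrative.

```python
# Sketch: Gershgorin-disc check behind Lemma 6 on a small weight-balanced digraph.
# The graph and the diagonal matrix M are hypothetical illustrative choices.

def gershgorin_left_edges(A, by_columns=False):
    """Return center - radius for each Gershgorin disc (row discs by default)."""
    n = len(A)
    edges = []
    for i in range(n):
        if by_columns:
            radius = sum(abs(A[k][i]) for k in range(n) if k != i)
        else:
            radius = sum(abs(A[i][k]) for k in range(n) if k != i)
        edges.append(A[i][i] - radius)
    return edges

# Weight-balanced Laplacian: every row sum and every column sum is zero.
L = [[2.0, -2.0, 0.0],
     [0.0, 3.0, -3.0],
     [-2.0, -1.0, 3.0]]
M = [0.5, 0.0, 2.0]  # diagonal entries of M >= 0

n = len(L)
ML = [[M[i] * L[i][j] for j in range(n)] for i in range(n)]  # rows scaled
LM = [[L[i][j] * M[j] for j in range(n)] for i in range(n)]  # columns scaled

row_edges = gershgorin_left_edges(ML)
col_edges = gershgorin_left_edges(LM, by_columns=True)
print(row_edges, col_edges)
```

Every disc has its leftmost point at the origin, so no disc reaches into the open left half-plane. The column discs of $LM$ rely on the column sums of $L$ being zero, which is where the weight-balanced property enters.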

We are now ready to prove the nonsingularity.

Take $m=1$ without loss of generality. Recall that $\nu_{i}\leq 0$ , $\forall i\in\mathcal{N}$ by Lemma 2. The eigenvalues of $-\mathbf{L}(t)\nu$ have non-negative real parts by Lemma 6. In addition, $\sigma(t)>0$ , thus $-1$ is not an eigenvalue of $-\sigma(t)\mathbf{L}(t)\nu$ . By Lemma 5, $\left(I-\sigma(t)\mathbf{L}(t)\nu\right)$ is nonsingular.

Bibliography


[1] M. Li, G. Chesi, and Y. Hong, “Input-feedforward-passivity-based distributed optimization over directed and switching topologies,” in 2019 IEEE 58th Conference on Decision and Control (CDC). IEEE, 2019, pp. 6056–6061.
[2] A. Nedić and A. Olshevsky, “Distributed optimization over time-varying directed graphs,” IEEE Transactions on Automatic Control, vol. 60, no. 3, pp. 601–615, 2015.
[3] A. Nedić, A. Olshevsky, and W. Shi, “Achieving geometric convergence for distributed optimization over time-varying graphs,” SIAM Journal on Optimization, vol. 27, no. 4, pp. 2597–2633, 2017.
[4] P. Xie, K. You, R. Tempo, S. Song, and C. Wu, “Distributed convex optimization with inequality constraints over time-varying unbalanced digraphs,” IEEE Transactions on Automatic Control, vol. 63, no. 12, pp. 4331–4337, 2018.
[5] J. Wang and N. Elia, “Control approach to distributed optimization,” in 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2010, pp. 557–561.
[6] P. Yi, Y. Hong, and F. Liu, “Distributed gradient algorithm for constrained optimization with application to load sharing in power systems,” Systems & Control Letters, vol. 83, pp. 45–52, 2015.
[7] M. Li, “Generalized lagrange multiplier method and KKT conditions with an application to distributed optimization,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 66, no. 2, pp. 252–256, 2019.
[8] X. Zeng, P. Yi, Y. Hong, and L. Xie, “Distributed continuous-time algorithms for nonsmooth extended monotropic optimization problems,” SIAM Journal on Control and Optimization, vol. 56, no. 6, pp. 3973–3993, 2018.
