Distributed Optimization of Multi-Beam Directional Communication Networks

Theodoros Tsiligkaridis

arXiv:1706.02211·math.OC·June 8, 2017·ICNC

TL;DR

This paper develops a distributed optimization framework for multi-beam airborne networks to maximize data rates, employing convex programming and augmented Lagrangian methods, demonstrating fast convergence and improved performance over traditional routing.

Contribution

It introduces a convex multi-commodity flow formulation and a distributed augmented Lagrangian algorithm for optimizing multi-beam directional communication networks.

Findings

1. Fast convergence compared to primal-dual methods
2. Significant performance gains over minimum distance routing
3. Effective handling of intra-network rate demands

Abstract

We formulate an optimization problem for maximizing the data rate of a common message transmitted from nodes within an airborne network broadcast to a central station receiver while maintaining a set of intra-network rate demands. Assuming that the network has full-duplex links with multi-beam directional capability, we obtain a convex multi-commodity flow problem and use a distributed augmented Lagrangian algorithm to solve for the optimal flows associated with each beam in the network. For each augmented Lagrangian iteration, we propose a scaled gradient projection method to minimize the local Lagrangian function that incorporates the local topology of each node in the network. Simulation results show fast convergence of the algorithm in comparison to simple distributed primal dual methods and highlight performance gains over standard minimum distance-based routing.

Equations

$$
\begin{aligned}
\max_{\{x_{ij}(m),\,P_{ij},\,P_{i,C}\}} \quad & \sum_{i\in\mathcal{N}} P_{i,C}\,f_{i,C} \\
\text{subject to} \quad & \sum_{m=1}^{M} x_{ij}(m) \leq c_{ij}(P_{ij}), \quad \forall (i,j)\in\mathcal{E} \\
& \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) - \sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) =
\begin{cases} R_{m}, & \text{if } i=i_{m} \\ -R_{m}, & \text{if } i=j_{m} \\ 0, & \text{otherwise} \end{cases}
\quad \forall i\in\mathcal{N},\ \forall m\in\mathcal{M} \\
& \sum_{j:(i,j)\in\mathcal{E}} P_{ij} + P_{i,C} \leq P_{max}, \quad \forall i\in\mathcal{N} \\
& x_{ij}(m)\geq 0,\quad P_{ij}\geq 0,\quad P_{i,C}\geq 0
\end{aligned}
$$

$$
\begin{aligned}
\text{(P)}\quad \min_{\{x_{ij}(m),\,P_{ij}\}} \quad & \sum_{(i,j)\in\mathcal{E}} P_{ij}\,f_{i,C} \\
\text{subject to} \quad & \sum_{m=1}^{M} x_{ij}(m) - c_{ij}(P_{ij}) \leq 0 \\
& \sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) + s_{i}(m) = 0 \\
& \sum_{j:(i,j)\in\mathcal{E}} P_{ij} - P_{max} \leq 0 \\
& x_{ij}(m)\geq 0,\quad P_{ij}\geq 0
\end{aligned}
$$

$$
s_{i}(m) \stackrel{\rm def}{=} \begin{cases} R_{m}, & \text{if } i=i_{m} \\ -R_{m}, & \text{if } i=j_{m} \\ 0, & \text{otherwise} \end{cases}
$$

$$
\sum_{m=1}^{M} x_{ij}^{*}(m) = c_{ij}(P_{ij}^{*}), \quad \forall (i,j)\in\mathcal{E}
$$

$$
\sum_{m=1}^{M} x_{i^{\prime}j^{\prime}}^{*}(m) = c_{i^{\prime}j^{\prime}}(P_{i^{\prime}j^{\prime}}^{*}) - \epsilon
$$

$$
\tilde{P}_{i^{\prime}j^{\prime}} \stackrel{\rm def}{=} c_{i^{\prime}j^{\prime}}^{-1}\Big(\sum_{m} x_{i^{\prime}j^{\prime}}^{*}(m)\Big)
$$

$$
\begin{aligned}
\text{(Q)}\quad \min_{\{x_{ij}(m)\}} \quad & \sum_{(i,j)\in\mathcal{E}} w_{ij}\exp\Big(\ln(2)\sum_{m=1}^{M} x_{ij}(m)\Big) \\
\text{subject to} \quad & \sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) + s_{i}(m) = 0 \\
& x_{ij}(m)\geq 0
\end{aligned}
$$

$$
P_{ij}^{*} \stackrel{\rm def}{=} c_{ij}^{-1}\Big(\sum_{m} x_{ij}^{*}(m)\Big) = \frac{e^{\ln(2)\sum_{m} x_{ij}^{*}(m)} - 1}{f_{ij}}
$$

$$
\begin{aligned}
L(X,p) = {} & \sum_{(i,j)\in\mathcal{E}} w_{ij}\exp\Big(\ln(2)\sum_{m} x_{ij}(m)\Big) \\
& + \sum_{m}\sum_{i\in\mathcal{N}} p_{i}(m)\Big(\sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) + s_{i}(m)\Big)
\end{aligned}
$$

$$
x_{ij}(m)^{k+1} = \Big[x_{ij}(m)^{k} - \alpha\Big(\ln(2)\,w_{ij}\,e^{\ln(2)\sum_{m^{\prime}} x_{ij}(m^{\prime})^{k}} + p_{j}(m)^{k} - p_{i}(m)^{k}\Big)\Big]_{+}
$$

$$
\begin{aligned}
p_{i}(m)^{k+1} &= p_{i}(m)^{k} + \alpha\,\frac{\partial L(X,p)}{\partial p_{i}(m)} \\
&= p_{i}(m)^{k} + \alpha\Big(\sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m)^{k} - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m)^{k} + s_{i}(m)\Big)
\end{aligned}
$$

$$
\hat{x}_{ij}(m)^{(k)} = \frac{1}{k}\,x_{ij}(m)^{k} + \Big(1-\frac{1}{k}\Big)\hat{x}_{ij}(m)^{(k-1)}
$$

$$
\begin{aligned}
\text{(Q')}\quad \min_{\{x_{ij}(m)\}} \quad & \sum_{i\in\mathcal{N}}\sum_{j:(i,j)\in\mathcal{E}} w_{ij}\,2^{\sum_{m=1}^{M} x_{ij}(m)} \\
\text{subject to} \quad & \sum_{i\in\mathcal{N}} \mathbf{C}_{i}\mathbf{x}_{i}(m) = \mathbf{d}(m), \quad \forall m\in\mathcal{M} \\
& \mathbf{x}_{i}(m)\succeq 0, \quad \forall i\in\mathcal{N},\ \forall m\in\mathcal{M}
\end{aligned}
$$

$$
[\mathbf{C}_{i}]_{l,k} = \begin{cases} +1, & l=i,\ k\in\mathcal{N}_{i} \\ -1, & l=k,\ k\in\mathcal{N}_{i} \\ 0, & \text{else} \end{cases}
$$

$$
\begin{aligned}
\Lambda_{i}\big(\mathbf{x}_{i};\{\mathbf{x}_{j}^{k}\}_{j\in\mathcal{T}_{i}},\{\lambda_{j}^{k}(m)\}_{j\in\mathcal{N}_{i}\cup\{i\}}\big)
= {} & \sum_{j:(i,j)\in\mathcal{E}} w_{ij}\,2^{\mathbf{a}_{j}^{T}\mathbf{x}_{i}} + \sum_{m} (\boldsymbol{\lambda}(m)^{k})^{T}\mathbf{C}_{i}\mathbf{x}_{i}(m) \\
& + \frac{\rho}{2}\sum_{m}\Big\| \mathbf{C}_{i}\mathbf{x}_{i}(m) + \sum_{j\neq i}\mathbf{C}_{j}\mathbf{x}_{j}(m)^{k} - \mathbf{d}(m)\Big\|_{2}^{2}
\end{aligned}
$$

$$
\hat{\mathbf{x}}_{i}^{k} = \arg\min_{\mathbf{x}_{i}\succeq 0} \Lambda_{i}\big(\mathbf{x}_{i};\{\mathbf{x}_{j}^{k}\}_{j\in\mathcal{T}_{i}},\{\lambda_{j}^{k}(m)\}_{j\in\mathcal{N}_{i}\cup\{i\}}\big)
$$

$$
\mathbf{x}_{i}^{k+1} = \mathbf{x}_{i}^{k} + \tau\,(\hat{\mathbf{x}}_{i}^{k} - \mathbf{x}_{i}^{k})
$$

$$
\lambda_{i}^{k+1}(m) = \lambda_{i}^{k}(m) + \rho\tau\Big(\sum_{j:(i,j)\in\mathcal{E}} x_{ij}^{k+1}(m) - \sum_{j:(j,i)\in\mathcal{E}} x_{ji}^{k+1}(m) - d_{i}(m)\Big)
$$

$$
\bar{\mathbf{x}}_{i}(t)
$$

$$
\mathbf{x}(t+1)
$$

$$
m(t) = \min\big\{m\in\mathbb{N} : \Lambda_{i}(\mathbf{x}_{i}(t)) - \dots\big\}
$$

Full text

Theodoros Tsiligkaridis

MIT Lincoln Laboratory

Lexington, MA 02141, USA

Email: [email protected]

This work is sponsored by Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and are not necessarily endorsed by the Department of Defense or the U.S. Government. Approved for Public Release, Distribution Unlimited.

I Introduction

Missions where multiple communication goals are of interest are becoming more prevalent in military applications. Multilayer communications may occur within a coalition; for example, a team consisting of ground vehicles and a set of airborne assets may desire to maximize the data rate for communication from the ground team to the airborne team and vice versa while simultaneously maintaining various types of communications within each team. This paper considers one such scenario, where a team of nodes capable of multi-beam directional communications wants to send a common message to another team while maintaining various routing demands between pairs of nodes. The optimization variables are the powers allocated to each beam and correspond to flows across edges in the network.

Digital antenna arrays capable of adaptive multi-beam communications are getting closer to practice. Using transmit and receive beamforming, these arrays can form multiple links over the same frequency and may even steer nulls to mitigate interference from nearby nodes yielding higher spatial reuse and enhancing network throughput [1]. These advanced PHY techniques have to be coupled with a MAC layer. Recent prior work on this networking topic considered different MAC layer policies and proposed a new uncoordinated random access MAC policy for such systems and evaluated its throughput performance as a function of various parameters [2]. The recent work of [3] explored various spatial processing strategies for multi-beam transmission with a simple MAC layer in a simulation study. Full-duplex communications ease the burden on the MAC layer design and are projected to soon become a reality [4, 5] and can significantly increase network capacity [6] at the expense of carefully managing self-interference [7].

In this paper, we assume full-duplex capabilities and ideal multi-beam technology in order to focus on higher-level issues of distributed optimization of resources for several tasks. We adopt a constrained optimization framework to balance the power tradeoff between maximizing the data rate of a common message sent to a central station and maintaining data demands for routing packets within the network. The optimal flows that arise from the solution of the optimization problem can be used to route different messages across the network. We simplify this optimization problem into a convex multicommodity flow problem and solve it using a distributed augmented Lagrangian (AL) decomposition technique. We derive a scaled gradient algorithm to minimize the local AL function at each node and rigorously show that its implementation requires only local computations. Simulations are performed on an illustrative example to highlight gains over standard minimum distance routing protocols, and also to show convergence rate improvements over simple primal-dual optimization methods.

II Problem Formulation

The blue team network is composed of $N$ nodes, and is modeled as a graph $\mathcal{G}=(\mathcal{N},\mathcal{E})$ . The graph may be undirected or directed. Node $i$ may transmit data to node $j$ if $(i,j)\in\mathcal{E}$ . We let $R_{m}$ denote a desired data rate for the traffic originating at a source node $i_{m}$ and ending at a sink node $j_{m}$ . Each of these data traffic demands must be satisfied within the blue team. Define $P_{ij}$ as the power allocated by node $i$ for transmitting data to node $j$ , with associated channel capacity $c_{ij}(P_{ij})=\log_{2}(1+f_{ij}P_{ij})$ bits/s/Hz for a path loss constant $f_{ij}$ . Let $P_{\text{max}}$ denote the maximum transmit power of a particular node. Each node $i$ may allocate power for communications within its own team equal to $\sum_{j:(i,j)\in\mathcal{E}}P_{ij}$ , and power for communication to the central station, $P_{i,C}=P_{max}-\sum_{j:(i,j)\in\mathcal{E}}P_{ij}$ . We define $x_{ij}(m)$ as the information flow from node $i$ to $j$ for commodity $m$ , where the total number of commodities is $M$ . Each commodity here corresponds to a desired message to be sent from one node to another.

The links are line-of-sight with path loss constant $f_{ij}=\frac{1}{N_{0}W(4\pi/\lambda)^{2}d_{ij}^{2}}$ , where $d_{ij}$ is the distance between nodes $i$ and $j$ , $\lambda$ is the carrier wavelength, $W$ is the bandwidth, and $N_{0}=kT$ is the thermal noise power spectral density at the receiver. The path loss from node $i$ to the central station is denoted as $f_{i,C}$ and is similarly defined.
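The link model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the receiver temperature of 290 K is an assumed default that the text does not specify.

```python
import math

BOLTZMANN = 1.380649e-23        # J/K
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def path_loss_constant(distance_m, carrier_hz, bandwidth_hz, temperature_k=290.0):
    """f_ij = 1 / (N0 * W * (4*pi/lambda)^2 * d_ij^2), with N0 = k*T.

    temperature_k = 290 K is an assumption, not a value from the paper.
    """
    wavelength = SPEED_OF_LIGHT / carrier_hz
    noise_power = BOLTZMANN * temperature_k * bandwidth_hz  # N0 * W
    return 1.0 / (noise_power * (4.0 * math.pi / wavelength) ** 2 * distance_m ** 2)


def capacity(power_w, f):
    """c_ij(P_ij) = log2(1 + f_ij * P_ij), in bits/s/Hz."""
    return math.log2(1.0 + f * power_w)


def inverse_capacity(rate, f):
    """Power needed to support `rate` bits/s/Hz: P = (2^rate - 1) / f."""
    return (2.0 ** rate - 1.0) / f
```

Because $f_{ij}$ scales as $1/d_{ij}^{2}$ , doubling the link distance quarters the path loss constant, and the power needed for a fixed rate grows by a factor of four.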

We seek to optimally allocate power among several transmit beams per node in order to maximize the total signal-to-interference noise ratio at the central station receiver since the data rate is given by $R_{C}=W\log_{2}(1+\sum_{i\in\mathcal{N}}P_{i,C}f_{i,C})$ bits/s. This framework may be generalized further to include multiple central stations. The optimization problem is as follows:

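$$
\begin{aligned}
\max_{\{x_{ij}(m),\,P_{ij},\,P_{i,C}\}} \quad & \sum_{i\in\mathcal{N}} P_{i,C}\,f_{i,C} \\
\text{subject to} \quad & \sum_{m=1}^{M} x_{ij}(m) \leq c_{ij}(P_{ij}), \quad \forall (i,j)\in\mathcal{E} \\
& \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) - \sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) =
\begin{cases} R_{m}, & \text{if } i=i_{m} \\ -R_{m}, & \text{if } i=j_{m} \\ 0, & \text{otherwise} \end{cases}
\quad \forall i\in\mathcal{N},\ \forall m\in\mathcal{M} \\
& \sum_{j:(i,j)\in\mathcal{E}} P_{ij} + P_{i,C} \leq P_{max}, \quad \forall i\in\mathcal{N} \\
& x_{ij}(m)\geq 0,\quad P_{ij}\geq 0,\quad P_{i,C}\geq 0
\end{aligned}
$$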

This problem is equivalent to the primal optimization problem shown below by eliminating variables $\{P_{i,C}\}$ .

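$$
\begin{aligned}
\text{(P)}\quad \min_{\{x_{ij}(m),\,P_{ij}\}} \quad & \sum_{(i,j)\in\mathcal{E}} P_{ij}\,f_{i,C} \\
\text{subject to} \quad & \sum_{m=1}^{M} x_{ij}(m) - c_{ij}(P_{ij}) \leq 0 \\
& \sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) + s_{i}(m) = 0 \\
& \sum_{j:(i,j)\in\mathcal{E}} P_{ij} - P_{max} \leq 0 \\
& x_{ij}(m)\geq 0,\quad P_{ij}\geq 0
\end{aligned}
$$

where

$$
s_{i}(m) \stackrel{\rm def}{=} \begin{cases} R_{m}, & \text{if } i=i_{m} \\ -R_{m}, & \text{if } i=j_{m} \\ 0, & \text{otherwise} \end{cases}
$$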

We note that the primal problem (P) is a convex optimization problem. For large enough $P_{max}$ and appropriate rates $R_{m}$ , Slater’s condition can be shown to hold. This implies strong duality and existence of optimal dual and primal solutions. We note however that there is no simple way to test if a solution exists to this problem for an arbitrary $P_{max}$ and desired rates $R_{m}$ .

Next, we reformulate the problem (P) by eliminating the variables $\{P_{ij}\}$ by making use of the following proposition.

Proposition 1.

Assume that an optimal solution to the convex optimization problem (P) exists and is given by $(X^{*},P^{*})$ . Then, we must have

$$
\sum_{m=1}^{M} x_{ij}^{*}(m) = c_{ij}(P_{ij}^{*}), \quad \forall (i,j)\in\mathcal{E} \tag{1}
$$

Proof.

Assume that (1) is violated. Then there exists an edge $(i^{\prime},j^{\prime})\in\mathcal{E}$ and $\epsilon>0$ such that

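$$
\sum_{m=1}^{M} x_{i^{\prime}j^{\prime}}^{*}(m) = c_{i^{\prime}j^{\prime}}(P_{i^{\prime}j^{\prime}}^{*}) - \epsilon
$$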

Now, set

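$$
\tilde{P}_{i^{\prime}j^{\prime}} \stackrel{\rm def}{=} c_{i^{\prime}j^{\prime}}^{-1}\Big(\sum_{m} x_{i^{\prime}j^{\prime}}^{*}(m)\Big)
$$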

By the continuity and monotonicity of the function $c_{i^{\prime}j^{\prime}}(\cdot)$ , $\tilde{P}_{i^{\prime}j^{\prime}}<P_{i^{\prime}j^{\prime}}^{*}$ . Define the new solution $(X^{*},\tilde{P}=\{\tilde{P}_{i^{\prime}j^{\prime}},\{P_{ij}^{*}\}_{(i,j)\neq(i^{\prime},j^{\prime})}\})$ . This solution also satisfies all the constraints, but achieves a better objective function value, i.e., $\tilde{P}_{i^{\prime}j^{\prime}}f_{i^{\prime},C}+\sum_{(i,j)\neq(i^{\prime},j^{\prime})}P_{ij}^{*}f_{i,C}<\sum_{(i,j)\in\mathcal{E}}P_{ij}^{*}f_{i,C}$ . This contradicts the optimality of $(X^{*},P^{*})$ . The proof is complete. ∎

Proposition 1 allows us to eliminate the power variables $P_{ij}$ and transform the problem (P) into a multi-commodity optimization problem:

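$$
\begin{aligned}
\text{(Q)}\quad \min_{\{x_{ij}(m)\}} \quad & \sum_{(i,j)\in\mathcal{E}} w_{ij}\exp\Big(\ln(2)\sum_{m=1}^{M} x_{ij}(m)\Big) \\
\text{subject to} \quad & \sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) + s_{i}(m) = 0 \\
& x_{ij}(m)\geq 0
\end{aligned}
$$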

where $w_{ij}\stackrel{{\scriptstyle\rm def}}{{=}}\frac{f_{i,C}}{f_{ij}}$ are positive weights.
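The substitution behind this step can be written out explicitly. The following is a short worked sketch rather than a passage from the paper; it uses the shorthand $y_{ij}=\sum_{m}x_{ij}(m)$ for the total flow on arc $(i,j)$ and assumes the capacity constraint is tight, as Proposition 1 states.

```latex
\begin{align*}
y_{ij} = \log_{2}(1 + f_{ij} P_{ij})
  \quad &\Longrightarrow \quad
  P_{ij} = \frac{2^{y_{ij}} - 1}{f_{ij}}, \\
\sum_{(i,j)\in\mathcal{E}} P_{ij}\, f_{i,C}
  &= \sum_{(i,j)\in\mathcal{E}} \frac{f_{i,C}}{f_{ij}}\, 2^{y_{ij}}
   - \sum_{(i,j)\in\mathcal{E}} \frac{f_{i,C}}{f_{ij}}
   = \sum_{(i,j)\in\mathcal{E}} w_{ij}\, e^{\ln(2)\, y_{ij}}
   - \sum_{(i,j)\in\mathcal{E}} w_{ij}.
\end{align*}
```

The last sum does not depend on the flows, so dropping it leaves an objective with the same minimizers.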

The problem (Q) is also a convex optimization problem since the constraints are convex and the objective is a sum of convex functions, each one being a composition of a convex function ( $\phi(u)=e^{u}$ ) with a linear combination of the variables ( $y_{ij}=\sum_{m}x_{ij}(m)$ ). This optimization can be interpreted as minimizing the flows along each arc based on relative importance weights $f_{i,C}/f_{ij}$ subject to the flow conservation constraints. For large $f_{i,C}/f_{ij}$ , the flow along arc $(i,j)$ tends to be minimized in order to put more power into communicating to the central station, since node $i$ tends to be closer to the station than to the node $j$ it is communicating with, while for small $f_{i,C}/f_{ij}$ there is less emphasis placed on communicating to the station since node $i$ is closer to node $j$ . The relative path loss ratio $f_{i,C}/f_{ij}$ controls the tradeoff between communication to the central station and to node $j$ for each node pair $(i,j)$ .

Once the optimal solution to (Q) is obtained, the optimal power levels may be obtained using:

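$$
P_{ij}^{*} \stackrel{\rm def}{=} c_{ij}^{-1}\Big(\sum_{m} x_{ij}^{*}(m)\Big) = \frac{e^{\ln(2)\sum_{m} x_{ij}^{*}(m)} - 1}{f_{ij}}
$$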

and feasibility is easily checked by ensuring $\sum_{j:(i,j)\in\mathcal{E}}P_{ij}^{*}\leq P_{max}$ for all nodes $i\in\mathcal{N}$ . This solution coincides with the solution of (P).
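The power recovery and feasibility check can be sketched as follows. The dictionary-based data layout and the name `recover_powers` are illustrative assumptions, not part of the paper.

```python
def recover_powers(flows, f, p_max):
    """Recover link powers from flows and check the per-node power budget.

    flows: dict (i, j) -> list of per-commodity flows x_ij(m), in bits/s/Hz
    f:     dict (i, j) -> path loss constant f_ij
    Returns (powers, feasible), with powers[(i, j)] = (2^{sum_m x_ij(m)} - 1) / f_ij.
    """
    powers = {}
    used = {}  # total intra-network power spent by each transmitting node
    for (i, j), x in flows.items():
        p = (2.0 ** sum(x) - 1.0) / f[(i, j)]
        powers[(i, j)] = p
        used[i] = used.get(i, 0.0) + p
    feasible = all(total <= p_max for total in used.values())
    return powers, feasible
```

Whatever remains of each node's budget, $P_{max}-\sum_{j}P_{ij}^{*}$ , is the power available for the beam to the central station.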

III Simple Distributed Primal-Dual Algorithm for Solving (Q)

The Lagrangian for problem (Q) is given by:

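$$
\begin{aligned}
L(X,p) = {} & \sum_{(i,j)\in\mathcal{E}} w_{ij}\exp\Big(\ln(2)\sum_{m} x_{ij}(m)\Big) \\
& + \sum_{m}\sum_{i\in\mathcal{N}} p_{i}(m)\Big(\sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m) - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m) + s_{i}(m)\Big)
\end{aligned}
$$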

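where $p$ are the Lagrange multipliers associated with the flow conservation constraints. Using the primal-dual approach of [8, 9] to find approximate solutions to (Q), the primal and dual updates become:

Primal Variable Updates

$$
x_{ij}(m)^{k+1} = \Big[x_{ij}(m)^{k} - \alpha\Big(\ln(2)\,w_{ij}\,e^{\ln(2)\sum_{m^{\prime}} x_{ij}(m^{\prime})^{k}} + p_{j}(m)^{k} - p_{i}(m)^{k}\Big)\Big]_{+}
$$

Dual Variable Updates

$$
\begin{aligned}
p_{i}(m)^{k+1} &= p_{i}(m)^{k} + \alpha\,\frac{\partial L(X,p)}{\partial p_{i}(m)} \\
&= p_{i}(m)^{k} + \alpha\Big(\sum_{j:(j,i)\in\mathcal{E}} x_{ji}(m)^{k} - \sum_{j:(i,j)\in\mathcal{E}} x_{ij}(m)^{k} + s_{i}(m)\Big)
\end{aligned}
$$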

where $\alpha>0$ is a small step size.

The primal solution after $k$ iterations is obtained by averaging the iterates:

$$
\hat{x}_{ij}(m)^{(k)} = \frac{1}{k}\,x_{ij}(m)^{k} + \Big(1-\frac{1}{k}\Big)\hat{x}_{ij}(m)^{(k-1)}
$$

It is known that this primal solution will be feasible asymptotically. We note that the averages may be implemented recursively as shown above to save memory. Under some mild conditions, this iterative algorithm converges to a saddle point of the Lagrangian function. We remark that these updates can be computed using local computations, so the algorithm may be implemented in a decentralized manner.
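One iteration of the primal-dual scheme can be sketched in plain Python. This is a minimal illustration assuming synchronous updates in which both the primal and the dual step read the iterates from iteration $k$ ; the data layout and function names are hypothetical.

```python
import math


def primal_dual_step(x, p, edges, w, s, alpha):
    """One synchronous primal-dual iteration for problem (Q).

    x: dict (i, j) -> list of M flows        p: dict node -> list of M multipliers
    w: dict (i, j) -> weight w_ij            s: dict node -> list of M supplies s_i(m)
    """
    num_commodities = len(next(iter(p.values())))
    x_new = {}
    for (i, j) in edges:
        # Derivative of w_ij * 2^{sum_m x_ij(m)} with respect to any x_ij(m).
        load = math.log(2.0) * w[(i, j)] * 2.0 ** sum(x[(i, j)])
        x_new[(i, j)] = [
            max(x[(i, j)][m] - alpha * (load + p[j][m] - p[i][m]), 0.0)
            for m in range(num_commodities)
        ]
    p_new = {}
    for i in p:
        p_new[i] = []
        for m in range(num_commodities):
            inflow = sum(x[(a, b)][m] for (a, b) in edges if b == i)
            outflow = sum(x[(a, b)][m] for (a, b) in edges if a == i)
            p_new[i].append(p[i][m] + alpha * (inflow - outflow + s[i][m]))
    return x_new, p_new


def running_average(avg_prev, x_k, k):
    """Recursive average: x_hat^(k) = x^k / k + (1 - 1/k) * x_hat^(k-1)."""
    return {
        e: [x_k[e][m] / k + (1.0 - 1.0 / k) * avg_prev[e][m] for m in range(len(x_k[e]))]
        for e in x_k
    }
```

Each edge update reads only the multipliers at its two endpoints, and each node update reads only flows on incident edges, which is what makes the scheme decentralized.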

IV Distributed Augmented Lagrangian Method for Solving (Q)

The objective function of problem (Q) is convex, twice-differentiable, and monotonically nondecreasing with respect to the flows $\{x_{ij}(m)\}$ . Furthermore, the only constraints present are flow conservation constraints and the flow nonnegativity constraints. This makes the path flow formulation applicable [10]. From the monotonicity of the objective function, an optimal flow vector may be constructed using only simple path flows. Optimizing over the path flows is not very practical, however, since the paths from a source to a destination need to be enumerated first, and the problem of finding all such simple paths cannot be accomplished in polynomial time. Furthermore, algorithms of this type, as well as path augmentation, blocking flow, and linear programming methods, require global coordination.

We adopt the augmented Lagrangian (AL) algorithm of [11]. Define ${\mathbf{x}}_{i}(m)=[x_{i1}(m),\dots,x_{iN}(m)]^{T}$ as the vector of flows of commodity $m$ that node $i$ routes towards all other nodes $j$ , and ${\mathbf{x}}_{i}=[{\mathbf{x}}_{i}(1)^{T},\dots,{\mathbf{x}}_{i}(M)^{T}]^{T}$ is the collection of all such vectors making up a local variable of node $i$ . The demand vector for commodity $m$ is given by ${\mathbf{d}}(m)=[d_{1}(m),\dots,d_{N}(m)]^{T}$ . This vector is defined as $d_{i}(m)=+R_{m}$ for $i=i_{m}$ , $d_{i}(m)=-R_{m}$ for $i=j_{m}$ and $d_{i}(m)=0$ for the remaining nodes $i$ . Define the local neighborhood of node $i$ as $\mathcal{N}_{i}$ (not including $i$ ), and the two-hop neighborhood of node $i$ as $\mathcal{T}_{i}$ . Note that $\mathcal{N}_{i}\subseteq\mathcal{T}_{i}$ .

Problem (Q) can be equivalently written as:

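$$
\begin{aligned}
\text{(Q')}\quad \min_{\{x_{ij}(m)\}} \quad & \sum_{i\in\mathcal{N}}\sum_{j:(i,j)\in\mathcal{E}} w_{ij}\,2^{\sum_{m=1}^{M} x_{ij}(m)} \\
\text{subject to} \quad & \sum_{i\in\mathcal{N}} \mathbf{C}_{i}\mathbf{x}_{i}(m) = \mathbf{d}(m), \quad \forall m\in\mathcal{M} \\
& \mathbf{x}_{i}(m)\succeq 0, \quad \forall i\in\mathcal{N},\ \forall m\in\mathcal{M}
\end{aligned}
$$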

where the matrix ${\mathbf{C}}_{i}\in{\mathbb{R}}^{N\times N}$ is defined as

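$$
[\mathbf{C}_{i}]_{l,k} = \begin{cases} +1, & l=i,\ k\in\mathcal{N}_{i} \\ -1, & l=k,\ k\in\mathcal{N}_{i} \\ 0, & \text{else} \end{cases}
$$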

In matrix form, ${\mathbf{C}}_{i}={\mathbf{f}}_{i}\otimes{\mathbf{e}}_{i}-{\text{diag}}({\mathbf{f}}_{i})$ , where $[{\mathbf{f}}_{i}]_{\mathcal{N}_{i}}=1$ and zero elsewhere.
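To see how the matrices ${\mathbf{C}}_{i}$ encode flow conservation, the following sketch builds them for a three-node path and checks that $\sum_{i}{\mathbf{C}}_{i}{\mathbf{x}}_{i}(m)={\mathbf{d}}(m)$ . The helper names and the 0-based node indexing are illustrative assumptions.

```python
def build_C(i, neighbors, n):
    """n x n matrix C_i (0-based): +1 at (i, k) and -1 at (k, k) for each out-neighbor k."""
    C = [[0] * n for _ in range(n)]
    for k in neighbors:
        C[i][k] = 1    # row i collects the total outflow of node i
        C[k][k] = -1   # row k records the inflow that k receives from i
    return C


def matvec(C, x):
    """Plain matrix-vector product."""
    return [sum(c * v for c, v in zip(row, x)) for row in C]


# Path 0 -> 1 -> 2 carrying one commodity at rate 3 from node 0 to node 2.
C0, C1, C2 = build_C(0, [1], 3), build_C(1, [2], 3), build_C(2, [], 3)
x0, x1, x2 = [0, 3, 0], [0, 0, 3], [0, 0, 0]  # x_i[k] is the flow from node i to node k
net = [a + b + c for a, b, c in zip(matvec(C0, x0), matvec(C1, x1), matvec(C2, x2))]
```

Here `net` holds outflow minus inflow at each node, which should equal the demand vector with $+R_{m}$ at the source, $-R_{m}$ at the sink, and zero at the relay.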

The local AL of node $i$ at iteration $k$ is given by:

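$$
\begin{aligned}
\Lambda_{i}\big(\mathbf{x}_{i};\{\mathbf{x}_{j}^{k}\}_{j\in\mathcal{T}_{i}},\{\lambda_{j}^{k}(m)\}_{j\in\mathcal{N}_{i}\cup\{i\}}\big)
= {} & \sum_{j:(i,j)\in\mathcal{E}} w_{ij}\,2^{\mathbf{a}_{j}^{T}\mathbf{x}_{i}} + \sum_{m} (\boldsymbol{\lambda}(m)^{k})^{T}\mathbf{C}_{i}\mathbf{x}_{i}(m) \\
& + \frac{\rho}{2}\sum_{m}\Big\| \mathbf{C}_{i}\mathbf{x}_{i}(m) + \sum_{j\neq i}\mathbf{C}_{j}\mathbf{x}_{j}(m)^{k} - \mathbf{d}(m)\Big\|_{2}^{2}
\end{aligned}
$$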

where $\rho>0$ is a regularization parameter. The selection vectors are given as $\textbf{a}_{j}=\mathbf{1}_{M}\otimes{\mathbf{e}}_{j}$ . The accelerated distributed augmented Lagrangian (ADAL) method of [11] is summarized in Algorithm 1.

IV-A Minimizing Local AL

The minimization problem (3) can be solved by a projected gradient method [12]. We assume an initial condition of ${\mathbf{x}}_{i}(0)={\mathbf{x}}_{i}^{k}$ and perform iterative updates on the flow vector using Algorithm 2.

In practice, to keep the computational complexity bounded, we perform only a few projected gradient descent iterations, stopping once a local convergence criterion is satisfied, i.e., ${\parallel}[{\mathbf{x}}_{i}^{(k)}-\nabla\Lambda_{i}({\mathbf{x}}_{i}^{(k)})]_{+}-{\mathbf{x}}_{i}^{(k)}{\parallel}_{2}\leq\epsilon$ for some small $\epsilon$ .

IV-A1 Gradient Calculation

The gradient takes the form:

[TABLE]

where $\bm{\lambda}^{k}=[\bm{\lambda}(1)^{T},\dots,\bm{\lambda}(M)^{T}]^{T}\in{\mathbb{R}}^{NM}$ , ${\mathbf{z}}_{i}^{k}=[({\mathbf{z}}_{i}(1)^{k})^{T},\dots,({\mathbf{z}}_{i}(M)^{k})^{T}]^{T}\in{\mathbb{R}}^{NM}$ and ${\mathbf{z}}_{i}(m)^{k}=\sum_{j\neq i}{\mathbf{C}}_{j}{\mathbf{x}}_{j}^{k}-{\mathbf{d}}(m)\in{\mathbb{R}}^{N}$ .

We next prove that the gradient may be calculated using local information within two-hop neighbors $j\in\mathcal{T}_{i}$ . The gradient vector may be decomposed as:

[TABLE]

where

[TABLE]

Since the rows $[{\mathbf{C}}_{i}^{T}]_{l,\cdot}=0$ for all $l\notin\mathcal{N}_{i}$ , it follows that ${\mathbf{supp}}(\nabla_{{\mathbf{x}}_{i}(m)}\Lambda_{i}({\mathbf{x}}_{i}))=\mathcal{N}_{i}$ .

We first note that the term ${\mathbf{C}}_{i}^{T}\bm{\lambda}(m)$ is locally computable since:

[TABLE]

so only access to $\lambda_{l}(m)$ from the neighbors $l\in\mathcal{N}_{i}$ is needed.

We then focus our attention to the term ${\mathbf{C}}_{i}^{T}{\mathbf{C}}_{i}{\mathbf{x}}_{i}$ . This is locally computable since:

[TABLE]

Finally, we consider the term ${\mathbf{C}}_{i}^{T}{\mathbf{z}}_{i}(m)^{k}$ , where ${\mathbf{z}}_{i}(m)^{k}=\sum_{j\neq i}{\mathbf{C}}_{j}{\mathbf{x}}_{j}^{k}-{\mathbf{d}}(m)$ . Since ${\mathbf{supp}}({\mathbf{C}}_{i}^{T}{\mathbf{z}}_{i}(m)^{k})=\mathcal{N}_{i}$ from (4), we only focus on calculating this vector for coordinates $l\in\mathcal{N}_{i}$ . Then, for $l\in\mathcal{N}_{i}$ , we have:

[TABLE]

Next, we show that $\sum_{j\neq i}[{\mathbf{C}}_{j}{\mathbf{x}}_{j}]_{i}-\sum_{j\neq i}[{\mathbf{C}}_{j}{\mathbf{x}}_{j}]_{l}$ involves only $x_{j,t}$ for $j\in\mathcal{T}_{i}$ .

[TABLE]

Thus, to compute the gradient, nodes only need information from their two-hop neighbors and do not require global knowledge of the network.
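The locality of the gradient can be illustrated with a small sketch. The gradient expression below is derived here from the local AL definition (objective term, multiplier term, and quadratic penalty) rather than copied from the paper; the function names, the 0-based indexing, and the argument `z` (holding ${\mathbf{z}}_{i}(m)^{k}$ for each commodity) are assumptions for illustration.

```python
import math


def local_al(x, w, nbrs, i, lam, z, rho):
    """Local AL of node i. x, lam, z are lists of M vectors of length N; i is not in nbrs."""
    n = len(w)
    value = sum(w[j] * 2.0 ** sum(xm[j] for xm in x) for j in nbrs)
    for m, xm in enumerate(x):
        r = [0.0] * n                      # r = C_i x_i(m)
        r[i] = sum(xm[j] for j in nbrs)
        for j in nbrs:
            r[j] = -xm[j]
        value += sum(lam[m][l] * r[l] for l in range(n))
        value += 0.5 * rho * sum((r[l] + z[m][l]) ** 2 for l in range(n))
    return value


def local_al_grad(x, w, nbrs, i, lam, z, rho):
    """Gradient of the local AL; entries outside the neighborhood stay zero."""
    n = len(w)
    grad = [[0.0] * n for _ in x]
    for m, xm in enumerate(x):
        resid_i = sum(xm[j] for j in nbrs) + z[m][i]   # [C_i x_i(m) + z_i(m)]_i
        for j in nbrs:
            load = math.log(2.0) * w[j] * 2.0 ** sum(xq[j] for xq in x)
            resid_j = -xm[j] + z[m][j]                 # [C_i x_i(m) + z_i(m)]_j
            # [C_i^T v]_j = v_i - v_j for j in N_i, since column j of C_i is e_i - e_j.
            grad[m][j] = load + (lam[m][i] - lam[m][j]) + rho * (resid_i - resid_j)
    return grad
```

Each gradient entry touches only the node's own flows, the multipliers at $i$ and at neighbor $j$ , and the residual entries at $i$ and $j$ , in line with the two-hop argument above.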

IV-A2 Hessian Calculation

The Hessian matrix of ${\mathbf{\Lambda}}_{i}$ takes the form:

[TABLE]

The Hessian is rank-deficient with rank $M|\mathcal{N}_{i}|$ , depending on the size of the local neighborhood. The full $MN\times MN$ Hessian matrix can be decomposed as:

[TABLE]

where each blockwise component is

[TABLE]

The matrix ${\mathbf{C}}_{i}^{T}{\mathbf{C}}_{i}$ depends on the local neighborhood $\mathcal{N}_{i}$ as:

[TABLE]

We also note that the reduced-dimension Hessian (restricted to $\mathcal{N}_{i}$ ) is always positive definite since $[{\mathbf{C}}_{i}^{T}{\mathbf{C}}_{i}]_{\mathcal{N}_{i}}=\mathbf{1}_{|\mathcal{N}_{i}|}\mathbf{1}_{|\mathcal{N}_{i}|}^{T}+{\mathbf{I}}_{|\mathcal{N}_{i}|}$ . Each submatrix has support: ${\mathbf{supp}}(\nabla_{{\mathbf{x}}_{i}(m)}\nabla_{{\mathbf{x}}_{i}(m^{\prime})}^{T}{\mathbf{\Lambda}}_{i}({\mathbf{x}}_{i}))=\mathcal{N}_{i}\times\mathcal{N}_{i}$ .
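The positive definiteness claim can be verified directly. The short derivation below is a worked sketch, not text from the paper, using only the definition of ${\mathbf{C}}_{i}$ : for $k\in\mathcal{N}_{i}$ , column $k$ of ${\mathbf{C}}_{i}$ equals ${\mathbf{e}}_{i}-{\mathbf{e}}_{k}$ .

```latex
\begin{align*}
[\mathbf{C}_{i}^{T}\mathbf{C}_{i}]_{k,l}
  &= (\mathbf{e}_{i}-\mathbf{e}_{k})^{T}(\mathbf{e}_{i}-\mathbf{e}_{l})
   = 1 + \delta_{kl}, \qquad k,l\in\mathcal{N}_{i}, \\
\mathbf{v}^{T}\big(\mathbf{1}\mathbf{1}^{T}+\mathbf{I}\big)\mathbf{v}
  &= (\mathbf{1}^{T}\mathbf{v})^{2} + \|\mathbf{v}\|_{2}^{2}
   \;\geq\; \|\mathbf{v}\|_{2}^{2} \;>\; 0
   \qquad \text{for all } \mathbf{v}\neq\mathbf{0}.
\end{align*}
```

The eigenvalues of $\mathbf{1}\mathbf{1}^{T}+{\mathbf{I}}$ are therefore $1$ (with multiplicity $|\mathcal{N}_{i}|-1$ ) and $|\mathcal{N}_{i}|+1$ , so the penalty term alone already contributes a positive definite block for any $\rho>0$ .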

IV-A3 Scaled Gradient Direction Calculation

Let $\mathcal{I}_{i}$ denote the set of indexes corresponding to the neighborhood of node $i$ , i.e., $\mathcal{I}_{i}=\{\mathcal{N}_{i}+(m-1)N:m=1,\dots,M\}$ . The reduced-dimension Hessian ${\mathbf{H}}_{\mathcal{I}_{i},\mathcal{I}_{i}}=[\nabla^{2}{\mathbf{\Lambda}}_{i}]_{k,l\in\mathcal{I}_{i}}$ has full rank, and the reduced-dimension gradient $\mathbf{g}_{\mathcal{I}_{i}}=[\nabla{\mathbf{\Lambda}}_{i}]_{k\in\mathcal{I}_{i}}$ has nonzero components in general. Using this index set, we may work over a reduced-dimension space and calculate a scaled gradient-descent direction as:

[TABLE]

with its expanded version defined as $d_{l}=\tilde{d}_{k}I_{\{l\in\mathcal{I}_{i}\}}$ where $l=k+(m-1)N$ for appropriate $k\in\mathcal{N}_{i}$ and $m\in\{1,\dots,M\}$ . The diagonal approximation to the Hessian becomes a simple scaling of the gradient. This scaling tends to converge faster than the unscaled gradient projection method, as the experiments show.
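A diagonally scaled direction can be sketched as below. This assumes the scaling divides each gradient entry by the corresponding diagonal Hessian entry, $(\ln 2)^{2}w_{ij}2^{\sum_{m}x_{ij}(m)}+2\rho$ , a form derived here from the local AL and the structure of ${\mathbf{C}}_{i}^{T}{\mathbf{C}}_{i}$ ; the paper's exact direction may differ, and the function names are hypothetical.

```python
import math


def scaled_direction(grad, x, w, nbrs, rho):
    """d = -g / diag(H) on neighbor coordinates, zero elsewhere (assumed form)."""
    n = len(w)
    d = [[0.0] * n for _ in x]
    for j in nbrs:
        # Diagonal curvature: objective term plus rho * [C_i^T C_i]_{jj} = 2 * rho.
        curvature = math.log(2.0) ** 2 * w[j] * 2.0 ** sum(xm[j] for xm in x) + 2.0 * rho
        for m in range(len(x)):
            d[m][j] = -grad[m][j] / curvature
    return d


def projected_step(x, d, step):
    """Move along d and project onto the nonnegative orthant."""
    return [[max(v + step * dv, 0.0) for v, dv in zip(xm, dm)] for xm, dm in zip(x, d)]
```

Because the curvature is at least $2\rho$ , the scaled direction stays bounded even where the exponential objective is nearly flat, which is one plausible reason unit step sizes are accepted more often.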

V Simulation Results

We simulate a two-dimensional scenario where a network broadcasts a common message to a central station while maintaining desired data rates among two origin-destination node pairs using multiple beams. A center frequency of $f_{c}=1$ GHz and bandwidth of $W=5$ MHz is used, with a peak transmit power per node of $P_{max}=100$ Watts. In the simulation, we compare our distributed optimization algorithm (denoted as ADAL) with the OSPF routing protocol which routes messages at a certain rate through the minimum distance route, which is efficiently obtained using Dijkstra’s algorithm.
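For reference, the minimum distance baseline can be sketched with a textbook Dijkstra implementation. This is a generic illustration, not the simulation code used in the paper; the adjacency-dictionary layout and the name `shortest_path` are assumptions.

```python
import heapq


def shortest_path(adj, source, target):
    """Dijkstra's algorithm. adj: dict node -> dict neighbor -> nonnegative distance.

    Returns (path, distance), or (None, inf) if the target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale queue entry
        done.add(u)
        if u == target:
            break
        for v, weight in adj.get(u, {}).items():
            nd = d + weight
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], dist[target]
```

A minimum distance router sends the full demand $R_{m}$ along the returned path, whereas the optimized flows may split a commodity across several routes.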

The network consists of $N=36$ nodes arranged approximately in a $6\times 6$ grid as the left panel of Fig. 1 shows. The central station is placed in the center of the grid and two messages are to be routed from node $1$ to node $36$ , and from node $6$ to node $31$ , each with desired data rate $R=9$ bits/s/Hz. The middle panel of Fig. 1 shows the OSPF flows, which correspond to the minimum distance routes. The right panel shows the optimal flows obtained using the ADAL algorithm, which show that routes around the central station, with appropriate load balancing, are preferred over fully loading the shortest routes. This makes sense since the nodes in the center allocate more power for communicating to the central station and the nodes away from the center are more responsible for carrying out the intra-network communications.

The power used for intra-network communications and communications to the central station is shown in Fig. 2 for each node. ADAL uses significantly less power overall for intra-network communications, $22.96$ W, in comparison to OSPF, $705.69$ W, which is roughly a 30-fold reduction. As a result, nodes have more power left over to use for maximizing the data rate of the common message to the station receiver. The received power at the central station receiver for OSPF is $2.89$ kW, corresponding to a data rate of $R_{C}=68.5$ Mbps, while for ADAL it is $3.58$ kW, corresponding to a data rate of $R_{C}=71.9$ Mbps. Depending on the distances of nodes to the central receiver and the network topology, this gap in received power may significantly boost the data rate. Our optimization algorithm performs optimal load balancing among different paths, which leads to considerably less transmit power used for intra-network communications and allows for more power to be used for broadcasting the common message to the central receiver.

Next, we examine the convergence of the augmented Lagrangian (ADAL) algorithm and compare it with the simple primal-dual method. The constraint violation metric used is

[TABLE]

which is expected to converge to zero as $k$ grows. The objective function error measures the distance from the optimal primal value. Fig. 3 shows that ADAL achieves significantly faster convergence than the simple primal-dual method. The simple primal-dual algorithm is fully decentralized and has guaranteed convergence to the unique minimizer but has slow convergence.

The computational complexity of the unscaled and scaled gradient methods for minimizing the local AL function in each iteration of the ADAL algorithm is addressed next. A tolerance threshold of $\epsilon=10^{-3}$ was set to stop the inner minimization with stopping criterion:

[TABLE]

The left panel of Fig. 4 shows a histogram of the number of inner iterations needed to achieve the tolerance $\epsilon$ for the unscaled and scaled gradient methods; the two are of the same order on average. This metric determines the number of gradient evaluations of the local AL function. The middle panel shows a histogram of the Armijo step sizes. The scaled method tends to accept step sizes close to unity, which implies reduced latency during optimization, while the unscaled method requires considerable tuning and significantly smaller step sizes. The right panel displays the average number of Armijo steps per inner iteration. This is an important complexity metric since it determines the number of local AL function evaluations. The scaled gradient method requires an average of only $1.5$ steps while the unscaled gradient method requires $26.1$ steps on average.

VI Conclusion

We proposed distributed algorithms for power allocation in multibeam directional airborne networks for maximizing data rate for a common message sent by all nodes to a central station receiver while guaranteeing multiple rate demands for intra-network communications. A decomposition approach was applied to the augmented Lagrangian (AL) algorithm for solving the convex optimization problem that arises, and an efficient method for solving each subproblem in each AL iteration was presented in detail. Simulation results show the benefits of our approach in comparison to simple primal dual methods in terms of convergence. Finally, significant power savings are observed for intra-network communications with our optimized routing algorithms in comparison to standard minimum distance routing.

Bibliography

[1] J. Litva and T. K. Lo, Digital Beamforming in Wireless Communications. Artech House, Inc., 1996.
[2] G. Kuperman, R. Margolies, N. M. Jones, B. Proulx, and A. Narula-Tam, “Uncoordinated MAC for Adaptive Multi-Beam Directional Networks: Analysis and Evaluation,” in International Conference on Computer Communication and Networks (ICCCN), 2016.
[3] R. B. Mac Leod and A. Margetts, “Networked Airborne Communications using Adaptive Multi-Beam Directional Links,” in IEEE Aerospace Conference, 2016.
[4] S. E. Johnston and P. D. Fiore, “Full-duplex Communication via Adaptive Nulling,” in Asilomar Conference on Signals, Systems and Computers, 2013.
[5] S. Han, C. Lin, Z. Xu, C. Pan, and Z. Pan, “Full duplex: Coming into reality in 2020?” in IEEE Global Communications Conference (GLOBECOM), 2014.
[6] X. Wang, H. Huang, and T. Hwang, “On the Capacity Gain from Full-Duplex Communications in a Large Scale Wireless Network,” IEEE Transactions on Mobile Computing, vol. 15, no. 9, September 2016.
[7] X. Quan, Y. Liu, S. Shao, C. Huang, and Y. Tang, “Impacts of Phase Noise on Digital Self-Interference Cancellation in Full-Duplex Communications,” IEEE Transactions on Signal Processing, vol. 65, no. 7, April 2017.
[8] A. Nedić and A. Ozdaglar, “Approximate Primal Solutions and Rate Analysis for Dual Subgradient Methods,” SIAM Journal on Optimization, vol. 19, no. 4, pp. 1757–1780, 2009.
