Robust Group LASSO Over Decentralized Networks
Manxi Wang, Yongcheng Li, Xiaohan Wei, Qing Ling

TL;DR
This paper develops decentralized algorithms for robustly recovering group sparse signals over multi-agent networks with sparse errors, using dynamic consensus strategies to replace centralized processing.
Contribution
It introduces a decentralized approach for robust group LASSO signal recovery that avoids reliance on a central fusion center, utilizing dynamic average consensus techniques.
Findings
Algorithms effectively recover signals in simulations.
Decentralized method matches centralized performance.
Dynamic consensus enables real-time tracking.
Abstract
This paper considers the recovery of group sparse signals over a multi-agent network, where the measurements are subject to sparse errors. We first investigate the robust group LASSO model and its centralized algorithm based on the alternating direction method of multipliers (ADMM), which requires a central fusion center to compute a global row-support detector. To implement it in a decentralized network environment, we then adopt dynamic average consensus strategies that enable dynamic tracking of the global row-support detector. Numerical experiments demonstrate the effectiveness of the proposed algorithms.
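Table I. Centralized robust group LASSO (ADMM with an inner BCD loop)
| Given: measurement matrix; sensing matrices; algorithm parameters |
| Initialize: signal matrix; error matrix; Lagrange multiplier |
| while not converged do |
|   for each BCD time slot do |
|     form the auxiliary matrix column by column as in (10) |
|     update the signal matrix row by row as in (12) |
|   end for |
|   update the error matrix as in (7) and the Lagrange multiplier as in (8) |
| end while |

Table II. Decentralized robust group LASSO, as run by every agent
| Given: measurement; sensing matrices; algorithm parameters |
| Initialize: signal; error; Lagrange multiplier |
| while not converged, each agent does |
|   for each BCD time slot do |
|     form the local column of the auxiliary matrix as in (10) |
|     the row-support detector is updated through an average consensus strategy |
|     update the local signal vector as in (12) |
|   end for |
|   update the local error vector as in (7) |
|   update the local Lagrange multiplier as in (8) |
| end while |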
Topics: Sparse and Compressive Sensing Techniques · Distributed Sensor Networks and Detection Algorithms · Target Tracking and Data Fusion in Sensor Networks
**Index Terms:** Decentralized optimization, dynamic average consensus, group sparsity, alternating direction method of multipliers (ADMM)
1 Introduction
Suppose that distributed agents constitute a bidirectionally connected network and sense correlated signals under sparse measurement errors. The measurement equation of agent is
[TABLE]
where is the measurement vector, is the sensing matrix, is the unknown signal vector, and is the unknown sparse error vector. We are particularly interested in a certain correlation pattern of the signal vectors, where the signal matrix is group sparse, meaning that is sparse and its nonzero entries appear in a small number of common rows. Defining as the measurement matrix and as the sparse error matrix, the matrix form of the agents’ measurement equations is
[TABLE]
Given and ’s, the goal of the network is to recover and from the linear measurement equation (2).
1.1 Robust Group LASSO Model
The recovery of group sparse (also known as block sparse [3] or jointly sparse [4]) signals finds a variety of applications such as direction-of-arrival estimation [5, 6], collaborative spectrum sensing [7, 8, 9] and motion detection [10]. A well-known model to recover group sparse signals is group LASSO (least absolute shrinkage and selection operator) [11], which solves
[TABLE]
Here is a nonnegative trade-off parameter. A key assumption behind the success of this model is the sub-Gaussianity of errors. However, in many applications, the measurements of the agents may be seriously contaminated or even missing due to uncertainties such as sensor failures or transmission errors. Measurement errors of this kind are often sparse [12]. Hence, a natural extension of (3) is to exploit the structures of both the signal matrix and the sparse error matrix by solving
[TABLE]
This model is termed robust group LASSO, and its performance guarantee is given in [13]. Under mild conditions, the robust group LASSO model is able to simultaneously recover the true values of and with high probability.
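As a hedged illustration, the measurement model and the robust group LASSO program described above can be written as follows. The symbols ($L$ agents, $y_i$, $A_i$, $x_i$, $e_i$, the matrices $Y$, $X$, $E$, and the weight $\lambda$) are notation assumed for this sketch and may differ from the paper's own.

```latex
\begin{align*}
y_i &= A_i x_i + e_i, \qquad i = 1, \dots, L, \\
Y &= [A_1 x_1, \dots, A_L x_L] + E, \\
\min_{X,\,E} \;\; & \|X\|_{2,1} + \lambda \|E\|_1
\quad \text{s.t.} \quad Y = [A_1 x_1, \dots, A_L x_L] + E,
\end{align*}
% with X = [x_1, ..., x_L], E = [e_1, ..., e_L],
% \|X\|_{2,1} = \sum_j \|X_{j,:}\|_2 (sum of the row-wise 2-norms),
% \|E\|_1 = \sum_{j,i} |E_{ji}| (sum of absolute entries).
```

The $\ell_{2,1}$ term promotes a small number of common nonzero rows in $X$, while the $\ell_1$ term promotes entry-wise sparsity in $E$.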
1.2 Our Contributions
This paper develops efficient algorithms to solve the robust group LASSO model (4). Our contributions are as follows.
- (i) We propose a centralized algorithm that is based on the alternating direction method of multipliers (ADMM), a powerful operator-splitting technique. One subproblem of the centralized algorithm is the traditional group LASSO model, which is approximately solved by a block coordinate descent (BCD) approach through successively estimating the row-support of the signal matrix .
- (ii) We develop decentralized versions of the above algorithm that are suitable for autonomous computation over large-scale networks. Since estimating the row-support of the signal matrix requires collaborative information fusion of all the agents, we propose to achieve inexact information fusion through dynamic average consensus techniques, which only require information exchange among neighboring agents.
1.3 Notations
Matrices are denoted by bold uppercase letters and vectors are denoted by bold lowercase letters. For a matrix , denotes its -th row, denotes its -th column, while denotes its -th element. The -norm of is , the -norm is , and the Frobenius norm is .
The multi-agent network is described as a bidirectional graph . If two agents are neighbors, then they can communicate with each other within one hop, and is a bidirectional communication edge.
2 Centralized Robust Group LASSO
Optimally solving (4) is nontrivial since the objective function is a weighted summation of two nonsmooth functions and , where and are entangled in the constraint. Therefore we resort to the alternating direction method of multipliers (ADMM) to split the two entangled variables and such that the resulting subproblems are easier to solve.
2.1 Using ADMM to Solve (4)
The augmented Lagrangian function of (4) is
[TABLE]
where is the Lagrange multiplier and is a positive penalty parameter. The ADMM alternately minimizes the augmented Lagrangian function with respect to and , and then updates the Lagrange multiplier [14]. At time , the ADMM works as follows.
First, fixing and , we minimize the augmented Lagrangian function with respect to to get . Simple manipulation shows that this is equivalent to
[TABLE]
Note that (5) is a standard group LASSO problem that generally does not have a closed-form solution. We will develop an efficient algorithm to solve (5) later in this section.
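The "simple manipulation" is a completion of the square. A sketch of it is given here under assumed notation: $\mathcal{A}(X) = [A_1 x_1, \dots, A_L x_L]$, multiplier $\Lambda$, penalty parameter $c$, and a particular sign convention for the multiplier term. These choices are illustrative and need not match the paper's.

```latex
\begin{align*}
\mathcal{L}_c(X, E, \Lambda)
  &= \|X\|_{2,1} + \lambda \|E\|_1
   + \langle \Lambda,\, Y - \mathcal{A}(X) - E \rangle
   + \frac{c}{2} \|Y - \mathcal{A}(X) - E\|_F^2 \\
  &= \|X\|_{2,1} + \lambda \|E\|_1
   + \frac{c}{2} \Big\| Y - \mathcal{A}(X) - E + \frac{\Lambda}{c} \Big\|_F^2
   - \frac{1}{2c} \|\Lambda\|_F^2 .
\end{align*}
% With E and \Lambda fixed, the terms that depend on X give a group LASSO problem:
\[
\min_X \;\; \|X\|_{2,1}
  + \frac{c}{2} \Big\| \mathcal{A}(X) - \Big( Y - E + \frac{\Lambda}{c} \Big) \Big\|_F^2 .
\]
```

The second line follows from $\langle \Lambda, R \rangle + \tfrac{c}{2}\|R\|_F^2 = \tfrac{c}{2}\|R + \Lambda/c\|_F^2 - \tfrac{1}{2c}\|\Lambda\|_F^2$ with $R = Y - \mathcal{A}(X) - E$.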
Second, fixing and , we minimize the augmented Lagrangian function with respect to to get . Again, combining the linear term with the quadratic term of yields
[TABLE]
Denoting , (6) has a closed-form solution given by
[TABLE]
where is the sign function; and denote the -th entries of and , respectively. Note that the term can be viewed as the support detector of the -th element of . If is smaller than the threshold , then is set to be zero.
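The closed-form update described above is the entry-wise soft-thresholding operator. A minimal NumPy sketch follows; the function name `soft_threshold` and the argument `tau` (standing in for the threshold) are hypothetical names chosen for illustration.

```python
import numpy as np

def soft_threshold(V, tau):
    """Entry-wise soft-thresholding: sign(v) * max(|v| - tau, 0).

    Entries whose magnitude (the support detector) is below tau become zero;
    the remaining entries are shrunk toward zero by tau.
    """
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)

# Small illustration with an arbitrary threshold of 1.0.
V = np.array([[3.0, -0.5], [-2.0, 0.2]])
shrunk = soft_threshold(V, 1.0)
```

Here the entries `-0.5` and `0.2` fall below the threshold and are zeroed, while `3.0` and `-2.0` are shrunk to `2.0` and `-1.0`.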
Finally, given and , the Lagrange multiplier is updated according to the following formula
[TABLE]
Since the update of in (7) and the update of in (8) are both simple, we now focus on the update of in (5), which is the bottleneck of the ADMM. Observe that in (5) the -norm term is separable with respect to ’s but nonsmooth, while the Frobenius term is smooth but nonseparable with respect to ’s. Therefore, we solve (5) with the block coordinate descent (BCD) algorithm, which has been shown to be an efficient tool for handling this special problem structure [15, 16, 17].
2.2 Using BCD to Solve (5)
To set up the iterative BCD algorithm that solves (5) at time , we divide time into slots. At time slot (), we linearize the Frobenius norm term in (5) with respect to and add an extra quadratic regularization term, which gives
[TABLE]
where is a positive proximal parameter and the -th column of is defined as
[TABLE]
Note that (9) is equivalent to
[TABLE]
which has a closed-form solution given by the soft-thresholding operator [18]. Denote whose -th row is given by . Also denote as the solution of (11). The -th row of is
[TABLE]
Again, note that the term can be viewed as the row-support detector of the -th row of . If is smaller than the threshold , then is set to be zero.
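The row-wise update can be sketched as a group soft-thresholding of each row. In this illustration `U` plays the role of the auxiliary matrix and `tau` the threshold; both names, and the small constant guarding against division by zero, are assumptions of the sketch.

```python
import numpy as np

def row_soft_threshold(U, tau):
    """Row-wise (group) soft-thresholding.

    The 2-norm of each row acts as its row-support detector: rows with norm
    below tau are set to zero, other rows are scaled by (1 - tau / norm).
    """
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * U

# First row has norm 5 (kept and shrunk), second row has norm 0.5 (zeroed).
U = np.array([[3.0, 4.0], [0.3, 0.4]])
X_new = row_soft_threshold(U, 1.0)
```

Because the decision is made per row, all agents either keep or discard the same row, which is what produces the common row-support.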
2.3 Implementation of Centralized Robust Group LASSO
The centralized ADMM to solve the robust group LASSO model (4) is summarized in Table I. Each iteration of the ADMM includes an inner-loop BCD subroutine that updates through solving (5), the update of that has a closed-form solution (7), and the update of in (8). The ADMM parameter can be any positive value, though its choice may influence the convergence rate. The BCD parameter is set to be the minimum of largest eigenvalues of that guarantees the convergence of the BCD subroutine [15, 16, 17]. As long as is properly chosen and is large enough, the BCD subroutine is able to solve the subproblem (5) with enough accuracy such that the ADMM converges to the global minimum of the convex program (4).
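The overall structure (an outer ADMM loop with an inner BCD loop) can be sketched as below. This is an illustrative sketch, not the authors' implementation: the names `lam`, `c`, `d`, `outer_iters`, and `inner_slots`, the sign convention of the multiplier, and the step-size rule `d = c * max_i ||A_i||_2^2` are all assumptions made here.

```python
import numpy as np

def soft_threshold(V, tau):
    return np.sign(V) * np.maximum(np.abs(V) - tau, 0.0)

def row_soft_threshold(U, tau):
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0) * U

def apply_sensing(A_list, X):
    # Column i of the result is A_i @ x_i.
    return np.column_stack([A @ X[:, i] for i, A in enumerate(A_list)])

def robust_group_lasso_admm(Y, A_list, lam, c=1.0, outer_iters=100, inner_slots=5):
    n, L = A_list[0].shape[1], len(A_list)
    X = np.zeros((n, L))
    E = np.zeros_like(Y)
    Lam = np.zeros_like(Y)
    # Assumed proximal parameter: c times the largest squared spectral norm.
    d = c * max(np.linalg.norm(A, 2) ** 2 for A in A_list)
    for _ in range(outer_iters):
        B = Y - E + Lam / c                      # target of the group LASSO subproblem
        for _ in range(inner_slots):             # inner BCD / proximal-gradient slots
            R = apply_sensing(A_list, X) - B
            G = np.column_stack([A.T @ R[:, i] for i, A in enumerate(A_list)])
            U = X - (c / d) * G                  # linearization step, column by column
            X = row_soft_threshold(U, 1.0 / d)   # row-support detection and shrinkage
        resid = Y - apply_sensing(A_list, X)
        E = soft_threshold(resid + Lam / c, lam / c)   # closed-form error update
        Lam = Lam + c * (resid - E)                    # multiplier update
    return X, E, Lam

# Tiny synthetic run with arbitrary sizes, only to show the calling convention.
rng = np.random.default_rng(0)
A_list = [rng.standard_normal((8, 12)) for _ in range(4)]
X_true = np.zeros((12, 4))
X_true[[2, 7], :] = rng.uniform(-1.0, 1.0, size=(2, 4))
Y = apply_sensing(A_list, X_true)
Y[3, 1] += 5.0  # one gross measurement error
X_hat, E_hat, _ = robust_group_lasso_admm(Y, A_list, lam=1.0)
```

Only the row-norm computation inside `row_soft_threshold` couples the agents; every other step acts on one column at a time.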
The algorithm outlined in Table I is centralized, which means that a fusion center is necessary to gather information from all the agents and conduct optimization. This centralized scheme is sensitive to the failure of the fusion center, requires multi-hop communication within the network, and is hence unscalable with respect to the network size. In view of the need for decentralized optimization in large-scale networks, the next section discusses how to implement it in a decentralized manner.
3 Decentralized Robust Group LASSO
Observe that the algorithm in Table I is naturally distributed, except for the update of , which involves calculating the global row-support detector across agents. Hence, given the vector , the key to the decentralized implementation of the algorithm in Table I is how to calculate its -norm in a decentralized manner. Recall that
[TABLE]
where
[TABLE]
is the average of the squares. The problem therefore becomes: if each agent holds the value of , how can we design efficient strategies to (exactly or inexactly) calculate the mean in a decentralized manner? Below we consider three approaches to obtain the average.
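The reduction from a norm to an average can be written explicitly. In this sketch, $u = (u_1, \dots, u_L)$ denotes the row vector whose $i$-th entry is held by agent $i$, and $\bar{s}$ denotes the average of the squares; these symbols are assumed here for illustration.

```latex
\[
\|u\|_2 = \sqrt{\sum_{i=1}^{L} u_i^2} = \sqrt{L \, \bar{s}},
\qquad
\bar{s} = \frac{1}{L} \sum_{i=1}^{L} u_i^2 .
\]
```

Once every agent knows $\bar{s}$ (exactly or approximately) and the network size $L$, it can evaluate the row-support detector locally.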
3.1 Static Average Consensus
The first strategy comes from the classic average consensus algorithm [19]. Calculate
[TABLE]
where is a row vector containing all , means element-wise squares of , is a large iteration number, and is the mixing matrix. The mixing matrix is doubly stochastic, and its -th element is nonzero if and only if or . A typical choice of follows the Metropolis-Hastings rule [19],
[TABLE]
Here is the degree of agent .
The graph-sparse structure of the mixing matrix enables decentralized computation of . According to the theory of average consensus [19], if goes to infinity, then all the elements of converge to the expected average , in which case the decentralized implementation is equivalent to its centralized counterpart. However, increasing means more rounds of communication and computation, so setting large is inefficient. On the other hand, setting small (say, ) often leads to unsatisfactory results.
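The trade-off between accuracy and the number of rounds can be illustrated with a small sketch. It assumes one common form of the Metropolis-Hastings rule, with weight `1 / (1 + max(d_i, d_j))` on each edge and the remaining mass on the diagonal; the paper's exact formula is not reproduced here. The 4-agent path graph and the values are made up for illustration.

```python
import numpy as np

def metropolis_weights(adj):
    """Doubly stochastic mixing matrix from a symmetric 0/1 adjacency matrix."""
    adj = np.asarray(adj, dtype=bool)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()  # diagonal absorbs the remaining weight
    return W

def static_average_consensus(W, values, rounds):
    """Each round, every agent replaces its value by a weighted neighbor average."""
    z = np.asarray(values, dtype=float).copy()
    for _ in range(rounds):
        z = W @ z
    return z

# Path graph 0 - 1 - 2 - 3; each agent holds one squared value.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
W = metropolis_weights(adj)
squares = np.array([1.0, 4.0, 9.0, 16.0])
few = static_average_consensus(W, squares, rounds=1)
many = static_average_consensus(W, squares, rounds=200)
```

With a single round the agents still disagree noticeably, whereas after many rounds all entries approach the true average of 7.5.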
3.2 Dynamic Average Consensus
The above-mentioned dilemma motivates us to introduce a new scheme to dynamically calculate the row-support detector. To simplify the algorithmic protocol, we allow neighboring agents to exchange only one round of information. Under this setting, every agent holds a dynamic value , while all the agents manage to track their dynamic average with one round of communication. Clearly, if the values of change irregularly, the agents have no chance to reach their exact dynamic average. Nevertheless, observe that if the values of converge to their steady states, convergence of the dynamic average will be possible. We consider two dynamic average consensus strategies proposed in [20].
First-order dynamic average consensus. Calculate
[TABLE]
[TABLE]
Second-order dynamic average consensus. Calculate
[TABLE]
[TABLE]
[TABLE]
[TABLE]
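To make the tracking idea concrete, the sketch below uses a commonly used first-order recursion, `z(k+1) = W z(k) + r(k+1) - r(k)` with `z(0) = r(0)`, where `r` collects the agents' local time-varying values. This form is an assumption of the sketch and is not claimed to be the exact recursion of [20] or of the paper; the mixing matrix and signals are made up.

```python
import numpy as np

def dynamic_consensus_step(W, z, r_new, r_old):
    """One communication round: mix with neighbors, then add the local change."""
    return W @ z + (r_new - r_old)

# Doubly stochastic mixing matrix of a 3-agent path graph (illustrative).
W = np.array([[2 / 3, 1 / 3, 0.0],
              [1 / 3, 1 / 3, 1 / 3],
              [0.0, 1 / 3, 2 / 3]])
steady = np.array([1.0, 4.0, 9.0])    # steady states of the local values
offset = np.array([3.0, -2.0, 5.0])   # transient that decays over time

def local_signal(k):
    return steady + (0.5 ** k) * offset

r_old = local_signal(0)
z = r_old.copy()
sums_match = True
for k in range(1, 101):
    r_new = local_signal(k)
    z = dynamic_consensus_step(W, z, r_new, r_old)
    # With a doubly stochastic W, the sum of the estimates equals the sum of the signals.
    sums_match = sums_match and bool(np.isclose(z.sum(), r_new.sum()))
    r_old = r_new
```

Each agent uses only one round of neighbor exchange per step, and the estimates settle at the average of the steady states once the local values stop changing.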
3.3 Implementation of Decentralized Robust Group LASSO
The decentralized robust group LASSO algorithm is outlined in Table II. It is very close to the centralized algorithm in Table I, except that the row-support detector is successively approximated through a static or dynamic average consensus strategy.
If the static average consensus strategy is adopted, then at time slot , the network needs rounds of information exchange. The number of rounds reduces to one under the two dynamic average consensus strategies. Observe that in each round of first-order dynamic average consensus, agent requires from all of its neighbors . However, in each round of second-order dynamic average consensus, agent requires both and from all of its neighbors . Therefore, the second-order strategy doubles the communication cost per time slot, compared to its first-order counterpart.
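The comparison above can be tallied with a small helper. It assumes that one round of exchange carries one scalar per signal row for each exchanged quantity; this payload model, the function name, and the strategy labels are illustrative assumptions rather than figures from the paper.

```python
def scalars_per_slot(strategy, num_rows, static_rounds=1):
    """Scalars one agent sends to each neighbor during one BCD time slot."""
    if strategy == "static":
        return static_rounds * num_rows   # several rounds, one quantity per round
    if strategy == "first_order":
        return num_rows                   # one round, one quantity
    if strategy == "second_order":
        return 2 * num_rows               # one round, two quantities
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical example: 100 signal rows and 50 static consensus rounds.
costs = {
    "static": scalars_per_slot("static", 100, static_rounds=50),
    "first_order": scalars_per_slot("first_order", 100),
    "second_order": scalars_per_slot("second_order", 100),
}
```

Under these assumed numbers the static strategy costs 25 times as much per slot as the second-order strategy.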
Notably, when is set to be large enough in the static average consensus strategy, the average consensus is exact. The resulting decentralized algorithm therefore enjoys the same convergence guarantee as the centralized one, but at an unaffordable communication cost. Embedding the two dynamic average consensus strategies saves considerable communication, but makes convergence analysis a challenging task, which we leave as future work.
In addition, to avoid possible computational instability, we also set safeguards on the value of . If the value goes beyond the region of , it is set to the nearest boundary.
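Setting an out-of-range value to the nearest boundary is a clipping operation. A minimal sketch follows, in which the bounds `lower` and `upper` and the sample values are hypothetical.

```python
import numpy as np

def safeguard(detector, lower, upper):
    """Project each tracked value onto the interval [lower, upper]."""
    return np.clip(detector, lower, upper)

# A negative estimate and an overly large one are pulled back to the bounds.
clipped = safeguard(np.array([-0.5, 0.3, 12.0]), lower=0.0, upper=10.0)
```

A nonnegative lower bound is a natural choice here, since the tracked quantity estimates an average of squares and an inexact tracker can temporarily produce negative values.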
4 Numerical Experiments
In the numerical experiments, we consider a network of agents. The dimension of every signal vector is , while the dimension of every measurement vector is . The group sparse signal matrix has nonzero rows (row sparsity ratio is ), whose positions are chosen uniformly at random. The amplitudes of the nonzero elements follow an i.i.d. uniform distribution within . Elements of every sensing matrix follow an i.i.d. standard normal distribution. The sparse error matrix has nonzero elements (sparsity ratio is ), whose positions are chosen uniformly at random and whose amplitudes follow an i.i.d. uniform distribution within .
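A test problem of this kind can be generated as sketched below. All default sizes and amplitude ranges in `make_problem` are placeholder assumptions and are not the values used in the paper.

```python
import numpy as np

def make_problem(num_agents=10, n=40, m=20, nonzero_rows=4, num_errors=10,
                 signal_amp=1.0, error_amp=5.0, seed=0):
    """Random group sparse signals, Gaussian sensing matrices, sparse errors."""
    rng = np.random.default_rng(seed)
    # Group sparse signal matrix: a few common nonzero rows.
    X = np.zeros((n, num_agents))
    rows = rng.choice(n, size=nonzero_rows, replace=False)
    X[rows, :] = rng.uniform(-signal_amp, signal_amp, size=(nonzero_rows, num_agents))
    # One i.i.d. standard normal sensing matrix per agent.
    A_list = [rng.standard_normal((m, n)) for _ in range(num_agents)]
    # Sparse error matrix: nonzero entries at uniformly random positions.
    E = np.zeros((m, num_agents))
    flat = rng.choice(m * num_agents, size=num_errors, replace=False)
    r_idx, c_idx = np.unravel_index(flat, E.shape)
    E[r_idx, c_idx] = rng.uniform(-error_amp, error_amp, size=num_errors)
    # Column i of Y is A_i @ x_i + e_i.
    Y = np.column_stack([A @ X[:, i] for i, A in enumerate(A_list)]) + E
    return Y, A_list, X, E

Y, A_list, X_true, E_true = make_problem()
```

Fixing the seed makes runs repeatable, which helps when comparing the centralized and decentralized variants on the same instance.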
In the robust group LASSO model, the weight parameter . The ADMM parameter is also set as . The BCD parameter is set to be the minimum of largest eigenvalues of . Every iteration of the ADMM algorithm is divided into slots so as to run the BCD subroutine. For the static average consensus strategy, we let , meaning that each slot requires rounds of communication. For the dynamic average consensus strategies, we let the safeguards and . The performance metric is relative error, defined as the Frobenius distance between the true solving (4) and the estimated one by ADMM, normalized by the Frobenius norm of .
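The performance metric is a normalized Frobenius distance, which can be computed as below. The function and argument names are chosen for illustration.

```python
import numpy as np

def relative_error(X_est, X_ref):
    """Frobenius distance between the estimate and the reference, normalized by the reference."""
    return np.linalg.norm(X_est - X_ref, "fro") / np.linalg.norm(X_ref, "fro")

X_ref = np.array([[3.0, 0.0], [0.0, 4.0]])
err_exact = relative_error(X_ref, X_ref)               # estimate equals the reference
err_zero = relative_error(np.zeros_like(X_ref), X_ref)  # all-zero estimate
```

An exact estimate gives a relative error of 0, and the all-zero estimate gives 1, so values well below 1 indicate a meaningful recovery.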
We first compare the centralized algorithm and the three decentralized ones, as depicted in Fig. 1. The connectivity ratio of the network (the percentage of randomly connected edges out of all possible ones) is . The curve of the centralized algorithm coincides with that using static average consensus. Recall that static average consensus incurs rounds of communication at every time slot, and is hence expensive. In contrast, the dynamic average consensus strategies demonstrate satisfactory convergence properties, though yielding slightly degraded estimates. In particular, the second-order dynamic average consensus is close to the centralized one in terms of the relative error.
In the second set of numerical experiments, we vary the connectivity ratio to observe its impact on the decentralized algorithms, as shown in Fig. 2. When the connectivity ratio decreases, the performance of the static average consensus degrades significantly. The reason is that a lower connectivity ratio reduces the speed of network information fusion, and hence makes the static average consensus less accurate under a given . The two dynamic average consensus strategies, on the other hand, are not very sensitive to the variation of connectivity ratio.
The numerical experiments validate the effectiveness of using dynamic average consensus to decentralize computation over networks. Though its theoretical properties in tracking problems have been investigated [20], its interplay with the overall optimization scheme is still unclear, and shall be our future research focus.
Acknowledgement. Qing Ling is supported in part by NSF China grant 61573331 and NSF Anhui grant 1608085QF130.
References
- [3] Y. Eldar, P. Kuppinger, and H. Bölcskei, “Block-sparse signals: Uncertainty relations and efficient recovery,” IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3042–3054, 2010.
- [4] M. E. Davies and Y. C. Eldar, “Rank awareness in joint sparse recovery,” IEEE Transactions on Information Theory, vol. 58, no. 2, pp. 1135–1146, 2012.
- [5] D. Malioutov, M. Çetin, and A. S. Willsky, “A sparse signal reconstruction perspective for source localization with sensor arrays,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 3010–3022, 2005.
- [6] X. Wei, Y. Yuan, and Q. Ling, “DOA estimation using a greedy block coordinate descent algorithm,” IEEE Transactions on Signal Processing, vol. 60, no. 12, pp. 6382–6394, 2012.
- [7] F. Zeng, C. Li, and Z. Tian, “Distributed compressive spectrum sensing in cooperative multihop cognitive networks,” IEEE Journal of Selected Topics in Signal Processing, vol. 5, no. 2, pp. 37–48, 2011.
- [8] J. Meng, W. Yin, H. Li, E. Hossain, and Z. Han, “Collaborative spectrum sensing from sparse observations in cognitive radio networks,” IEEE Journal on Selected Areas in Communications, vol. 29, no. 2, pp. 327–337, 2011.
