Distributed Machine Learning with Sparse Heterogeneous Data

Dominic Richards; Sahand N. Negahban; Patrick Rebeschini

arXiv:1912.01417 · math.ST · November 30, 2021 · NeurIPS

TL;DR

This paper introduces a distributed learning method for heterogeneous data held at the nodes of a graph. It combines sparsity and total variation penalties to recover each node's model from fewer samples than fitting the nodes independently.

Contribution

The paper proposes a Basis Pursuit Denoising method with a total variation penalty for sparse linear models distributed over a graph, with finite sample guarantees for sub-Gaussian design matrices.

Findings

01. Successful recovery with fewer samples than independent methods

02. Effective in both noiseless and noisy settings

03. Numerical validation with ADMM and hyperspectral unmixing

Abstract

Motivated by distributed machine learning settings such as Federated Learning, we consider the problem of fitting a statistical model across a distributed collection of heterogeneous data sets whose similarity structure is encoded by a graph topology. Precisely, we analyse the case where each node is associated with fitting a sparse linear model, and edges join two nodes if the difference of their solutions is also sparse. We propose a method based on Basis Pursuit Denoising with a total variation penalty, and provide finite sample guarantees for sub-Gaussian design matrices. Taking the root of the tree as a reference node, we show that if the sparsity of the differences across nodes is smaller than the sparsity at the root, then recovery is successful with fewer samples than by solving the problems independently, or by using methods that rely on a large overlap in the signal supports,…
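The penalty described in the abstract can be illustrated with a minimal sketch. It assumes the noiseless case, a star-shaped tree with one root and two leaves, and one plausible form of the objective: the l1 norm at the root plus the l1 norms of the differences across edges. The names `tv_objective`, `A`, `B`, `z`, and the dimensions are hypothetical and are not taken from the paper. The sketch does not solve the optimisation problem; it only shows that writing each node as "root plus difference" turns the total variation penalty into an ordinary l1 norm over a stacked vector, which is sparser than the concatenation of the node signals.

```python
import numpy as np

def tv_objective(x, edges, root=0):
    """l1 norm at the root plus l1 norms of the differences across edges."""
    penalty = np.abs(x[root]).sum()
    for parent, child in edges:
        penalty += np.abs(x[child] - x[parent]).sum()
    return penalty

rng = np.random.default_rng(0)
d, n = 20, 8  # hypothetical dimension and number of samples per node

# Hypothetical star-shaped tree: node 0 is the root, nodes 1 and 2 are leaves.
edges = [(0, 1), (0, 2)]

# Root signal with 4 non-zeros; each leaf differs from the root in 1 coordinate.
x = np.zeros((3, d))
x[0, :4] = [1.0, -2.0, 0.5, 3.0]
x[1] = x[0]
x[1, 10] = 1.5
x[2] = x[0]
x[2, 2] -= 0.5

# Gaussian designs (a special case of sub-Gaussian) and noiseless responses.
A = rng.standard_normal((3, n, d))
y = np.einsum("vnd,vd->vn", A, x)

# Reparameterise: z stacks the root signal and the two edge differences.
z = np.concatenate([x[0], x[1] - x[0], x[2] - x[0]])

# Stacked design B such that B @ z reproduces every node's responses.
B = np.zeros((3 * n, 3 * d))
B[0:n, 0:d] = A[0]
for v in (1, 2):
    B[v * n:(v + 1) * n, 0:d] = A[v]                # root part of node v
    B[v * n:(v + 1) * n, v * d:(v + 1) * d] = A[v]  # difference part of node v

print("TV objective:", tv_objective(x, edges))
print("l1 norm of z:", np.abs(z).sum())
print("non-zeros in z:", np.count_nonzero(z), "vs in x:", np.count_nonzero(x))
print("constraints match:", np.allclose(B @ z, y.ravel()))
```

In this toy instance the stacked vector `z` has 6 non-zero entries while the three node signals together have 12, which is the kind of gap the abstract's condition (differences sparser than the root) refers to.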

Videos

Distributed Machine Learning with Sparse Heterogeneous Data · SlidesLive

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Optical Imaging and Spectroscopy Techniques