DUAL-LOCO: Distributing Statistical Estimation Using Random Projections
Christina Heinze, Brian McWilliams, Nicolai Meinshausen

TL;DR
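DUAL-LOCO is a communication-efficient algorithm for distributed statistical estimation on data partitioned by feature. It uses low-dimensional random projections, exchanged in a single round of communication, to approximate dependencies between features held by different workers, and it obtains better speedups than a state-of-the-art distributed optimization method while retaining good accuracy.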
Contribution
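It introduces a distributed estimation method for data split across workers by feature rather than by sample. Workers exchange low-dimensional random projections of their features in a single round of communication, which keeps communication overhead low.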
Findings
Approximation error is bounded and depends only weakly on the number of workers
Obtains better speedups than a state-of-the-art distributed optimization method while retaining good accuracy
Evaluated on a variety of real-world datasets
Abstract
We present DUAL-LOCO, a communication-efficient algorithm for distributed statistical estimation. DUAL-LOCO assumes that the data is distributed according to the features rather than the samples. It requires only a single round of communication where low-dimensional random projections are used to approximate the dependencies between features available to different workers. We show that DUAL-LOCO has bounded approximation error which depends only weakly on the number of workers. We compare DUAL-LOCO against a state-of-the-art distributed optimization method on a variety of real-world datasets and show that it obtains better speedups while retaining good accuracy.
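The idea in the abstract (feature-wise partitioning, one communication round, random projections standing in for the other workers' features) can be illustrated with a small single-process simulation. This is a rough sketch under stated assumptions, not the authors' implementation: it assumes a ridge regression objective solved in primal form, dense Gaussian projection matrices, and that each worker receives the sum of the other workers' projected blocks. The function name `feature_distributed_ridge_sketch` and the parameters `n_workers`, `proj_dim`, and `lam` are hypothetical and do not come from the paper.

```python
import numpy as np


def feature_distributed_ridge_sketch(X, y, n_workers=4, proj_dim=20, lam=1.0, seed=0):
    """Simulate feature-partitioned ridge regression with random projections.

    Illustrative only: the ridge objective, Gaussian projections, and summing
    of the other workers' projections are assumptions made for this sketch.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Split the feature (column) indices across the simulated workers.
    blocks = np.array_split(np.arange(p), n_workers)

    # Single communication round: each worker compresses its own feature
    # block to proj_dim columns and shares only that compressed block.
    projections = []
    for idx in blocks:
        Pi = rng.normal(size=(len(idx), proj_dim)) / np.sqrt(proj_dim)
        projections.append(X[:, idx] @ Pi)
    total = sum(projections)

    beta = np.zeros(p)
    for k, idx in enumerate(blocks):
        # Low-dimensional stand-in for the features held by all other workers.
        others = total - projections[k]
        # Local design matrix: own raw features plus the compressed summary.
        Z = np.hstack([X[:, idx], others])
        # Local ridge solve; no further communication is needed.
        coef = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)
        # Keep only the coefficients that belong to the worker's raw features.
        beta[idx] = coef[: len(idx)]
    return beta
```

Each simulated worker solves a problem with its own block of columns plus `proj_dim` extra columns, so the local problem size does not grow with the total number of features. With `n_workers=1` the summary of the other workers is all zeros and the sketch reduces to ordinary ridge regression, which is a convenient sanity check.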
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques
