Optimization for Large-Scale Machine Learning with Distributed Features and Observations
Alexandros Nathan, Diego Klabjan

TL;DR
This paper introduces two doubly distributed optimization algorithms for large-scale machine learning, where both the observations and the features are partitioned across the nodes of a cluster. In the reported experiments they outperform block distributed ADMM.
Contribution
The paper proposes two doubly distributed optimization algorithms, extending distributed methods to a setting the abstract describes as barely studied: both features and observations partitioned across nodes.
Findings
The proposed algorithms outperform block distributed ADMM in experiments
The methods scale well with data size and number of nodes
Empirical results demonstrate improved efficiency and convergence
Abstract
As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine learning and predictive modeling, distributed optimization methods have recently garnered ample attention in the literature. Although previous research has mostly focused on settings where either the observations, or features of the problem at hand are stored in distributed fashion, the situation where both are partitioned across the nodes of a computer cluster (doubly distributed) has barely been studied. In this work we propose two doubly distributed optimization algorithms. The first one falls under the umbrella of distributed dual coordinate ascent methods, while the second one belongs to the class of stochastic gradient/coordinate descent hybrid…
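To make the "doubly distributed" setting concrete, the following minimal sketch splits a small data matrix into a grid of blocks, one per node, and computes a prediction `X @ w` using only block-local work. It illustrates the data layout only and is not either of the paper's algorithms. The names `split_ranges`, `partition_doubly`, and `distributed_predict`, and the grid sizes `P` (observation partitions) and `Q` (feature partitions), are hypothetical choices made for this illustration.

```python
# Illustrative sketch of the doubly distributed data layout.
# NOT the paper's algorithms; all names and sizes are assumptions.

def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous, near-equal index ranges."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for k in range(parts):
        stop = start + base + (1 if k < extra else 0)
        ranges.append(range(start, stop))
        start = stop
    return ranges

def partition_doubly(X, P, Q):
    """Map each grid cell (p, q) to the block of X that node (p, q) holds."""
    row_parts = split_ranges(len(X), P)      # partition observations
    col_parts = split_ranges(len(X[0]), Q)   # partition features
    blocks = {}
    for p, rows in enumerate(row_parts):
        for q, cols in enumerate(col_parts):
            blocks[(p, q)] = [[X[i][j] for j in cols] for i in rows]
    return blocks, row_parts, col_parts

def distributed_predict(blocks, w, row_parts, col_parts):
    """Compute X @ w from blocks: each node multiplies its block by its
    slice of w; partial results are summed across feature partitions."""
    n = sum(len(r) for r in row_parts)
    y = [0.0] * n
    for (p, q), block in blocks.items():
        w_q = [w[j] for j in col_parts[q]]   # the weights node (p, q) needs
        for local_i, i in enumerate(row_parts[p]):
            y[i] += sum(a * b for a, b in zip(block[local_i], w_q))
    return y

# Toy data: 6 observations, 4 features.
X = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0],
     [2.0, 0.0, 2.0, 0.0],
     [1.0, 1.0, 1.0, 1.0],
     [3.0, 0.0, 0.0, 1.0],
     [0.0, 2.0, 1.0, 0.0]]
w = [1.0, -1.0, 0.5, 2.0]

blocks, row_parts, col_parts = partition_doubly(X, P=3, Q=2)
y_dist = distributed_predict(blocks, w, row_parts, col_parts)
y_full = [sum(a * b for a, b in zip(row, w)) for row in X]
```

No single node in this layout sees a full observation or a full feature column, so even a plain matrix-vector product needs a reduction across the feature partitions. That communication pattern is what separates this setting from the observation-only or feature-only partitioning the abstract contrasts it with.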
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Neural Networks and Applications
Methods: Alternating Direction Method of Multipliers
