Newton-ADMM: A Distributed GPU-Accelerated Optimizer for Multiclass Classification Problems
Chih-Hao Fang, Sudhir B Kylasa, Fred Roosta, Michael W. Mahoney, Ananth Grama

TL;DR
This paper introduces Newton-ADMM, a GPU-accelerated distributed optimizer combining Newton-type methods with ADMM, which improves efficiency and scalability for multiclass classification tasks in distributed environments.
Contribution
The paper presents a novel distributed optimizer that integrates GPU-accelerated Newton methods with ADMM, enhancing performance and scalability for classification problems.
Findings
Better generalization on multiple classification datasets
Significantly faster distributed solution times compared to state-of-the-art methods
Improved scalability on large distributed platforms
Abstract
First-order optimization methods, such as stochastic gradient descent (SGD) and its variants, are widely used in machine learning applications due to their simplicity and low per-iteration costs. However, they often require larger numbers of iterations, with associated communication costs in distributed environments. In contrast, Newton-type methods, while having higher per-iteration costs, typically require a significantly smaller number of iterations, which directly translates to reduced communication costs. In this paper, we present a novel distributed optimizer for classification problems, which integrates a GPU-accelerated Newton-type solver with the global consensus formulation of the Alternating Direction Method of Multipliers (ADMM). By leveraging the communication efficiency of ADMM, a GPU-accelerated inexact-Newton solver, and an effective spectral penalty parameter selection…
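The combination described in the abstract (global consensus ADMM with an inexact-Newton local solver) can be illustrated with a small CPU sketch. This is not the authors' implementation: it runs in NumPy rather than on GPUs, simulates the workers in a single process, uses a fixed penalty `rho` instead of the spectral penalty selection the abstract mentions, and assumes a softmax cross-entropy loss with an L2 term `lam` on the consensus variable. All function names, parameters, and the toy data generator are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def local_objective(W, X, Y, z, u, rho):
    # Mean softmax cross-entropy on one shard plus the scaled-form ADMM proximal term.
    logits = X @ W
    m = logits.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return np.mean(lse - (logits * Y).sum(axis=1)) + 0.5 * rho * np.sum((W - z + u) ** 2)

def conjugate_gradient(hvp, b, iters=10, tol=1e-4):
    # Approximately solve H x = b using only Hessian-vector products.
    x = np.zeros_like(b)
    r, p = b.copy(), b.copy()
    rs = np.sum(r * r)
    b_norm = np.sqrt(rs)
    for _ in range(iters):
        if np.sqrt(rs) <= tol * b_norm:
            break
        Hp = hvp(p)
        alpha = rs / np.sum(p * Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = np.sum(r * r)
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def local_newton(W, X, Y, z, u, rho, newton_steps=5):
    # Inexact Newton: CG gives an approximate Newton direction, Armijo backtracking sets the step.
    n = X.shape[0]
    for _ in range(newton_steps):
        P = softmax(X @ W)
        grad = X.T @ (P - Y) / n + rho * (W - z + u)

        def hvp(V):
            A = X @ V
            return X.T @ (P * (A - (P * A).sum(axis=1, keepdims=True))) / n + rho * V

        step = conjugate_gradient(hvp, -grad)
        f0 = local_objective(W, X, Y, z, u, rho)
        slope = np.sum(grad * step)
        t = 1.0
        while local_objective(W + t * step, X, Y, z, u, rho) > f0 + 1e-4 * t * slope and t > 1e-8:
            t *= 0.5
        W = W + t * step
    return W

def newton_admm(shards, d, C, rho=1.0, lam=1e-2, admm_iters=50):
    # shards: list of (X_i, Y_i) pairs, one per simulated worker; Y_i is one-hot.
    N = len(shards)
    z = np.zeros((d, C))
    xs = [np.zeros((d, C)) for _ in shards]
    us = [np.zeros((d, C)) for _ in shards]
    history = []  # primal residual after each ADMM iteration
    for _ in range(admm_iters):
        # Local step: each worker solves its subproblem independently.
        xs = [local_newton(x, X, Y, z, u, rho) for x, (X, Y), u in zip(xs, shards, us)]
        # Consensus step: the only point where workers would need to communicate.
        avg = np.mean([x + u for x, u in zip(xs, us)], axis=0)
        z = (N * rho / (lam + N * rho)) * avg
        # Scaled dual update.
        us = [u + x - z for x, u in zip(xs, us)]
        history.append(np.sqrt(sum(np.sum((x - z) ** 2) for x in xs)))
    return z, history

def make_toy_shards(n_workers=4, n_per_worker=150, seed=0):
    # Hypothetical toy data: three Gaussian blobs in four dimensions.
    rng = np.random.default_rng(seed)
    centers = 4.0 * np.eye(3, 4)
    shards = []
    for _ in range(n_workers):
        labels = rng.integers(0, 3, size=n_per_worker)
        X = centers[labels] + rng.normal(size=(n_per_worker, 4))
        shards.append((X, np.eye(3)[labels]))
    return shards
```

Calling `newton_admm(make_toy_shards(), d=4, C=3)` returns the consensus weights and the per-iteration primal residuals. In this sketch the expensive work (Hessian-vector products inside conjugate gradient) stays local to each shard, and only the averaging step in the consensus update would cross the network, which is the communication pattern the abstract attributes to ADMM.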
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM
Methods: Alternating Direction Method of Multipliers
