DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression
Parameswaran Raman, Sriram Srinivasan, Shin Matsushima, Xinhua Zhang, Hyokun Yun, S.V.N. Vishwanathan

TL;DR
This paper introduces DS-MLR, a distributed stochastic gradient descent method that leverages double-separability to efficiently scale multinomial logistic regression to massive datasets with many classes, overcoming computational and storage challenges.
Contribution
The paper presents a novel distributed optimization algorithm exploiting double-separability for simultaneous data and model parallelism in multinomial logistic regression.
Findings
Scales to datasets hundreds of gigabytes in size
Achieves efficient data and model parallelism
Handles extreme multi-class classification on the Reddit dataset
Abstract
Scaling multinomial logistic regression to datasets with a very large number of data points and classes is challenging, primarily because the log-partition function must be computed for every data point, which makes the computation hard to distribute. In this paper, we present a distributed stochastic gradient descent based optimization method (DS-MLR) for scaling up multinomial logistic regression problems to massive datasets without hitting any storage constraints on the data and model parameters. Our algorithm exploits double-separability, an attractive property that allows us to achieve data and model parallelism simultaneously. In addition, we introduce a non-blocking, asynchronous variant of our algorithm that avoids bulk-synchronization. We demonstrate the versatility of DS-MLR to various scenarios in data and model parallelism, through an extensive…
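To make the log-partition bottleneck concrete, the sketch below uses a standard variational identity, log z = min over a > 0 of (a·z − log a − 1), to rewrite the log-partition term for one data point as a sum of per-class terms plus a per-point term. This is offered as one well-known way to obtain a form that separates over (data point, class) pairs; whether it matches the exact reformulation used in the paper is an assumption. The names `log_partition`, `separable_bound`, and the example scores are hypothetical.

```python
import math

def log_partition(scores):
    """Numerically stable log(sum_k exp(s_k)) for one data point."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def separable_bound(scores, a):
    """Upper bound on the log-partition: a * sum_k exp(s_k) - log(a) - 1.

    Each class k contributes its own term a * exp(s_k), so the sum over
    classes no longer sits inside a logarithm (assumed illustration,
    not necessarily the paper's exact objective).
    """
    return sum(a * math.exp(s) for s in scores) - math.log(a) - 1.0

# Hypothetical class scores w_k^T x_i for a single data point.
scores = [0.5, -1.2, 2.0]

exact = log_partition(scores)
a_star = math.exp(-exact)                      # bound is tight at a = 1 / sum_k exp(s_k)
tight = separable_bound(scores, a_star)        # equals the exact log-partition
loose = separable_bound(scores, 0.5 * a_star)  # any other a gives a strictly larger value
```

With the auxiliary variable `a` held per data point, the per-class terms can be evaluated by whichever worker owns that class's parameters, which is the kind of structure that permits partitioning both the data and the model.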
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Face and Expression Recognition · Machine Learning and ELM
Methods: Logistic Regression
