DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression

Parameswaran Raman; Sriram Srinivasan; Shin Matsushima; Xinhua Zhang; Hyokun Yun; S.V.N. Vishwanathan

arXiv:1604.04706·cs.LG·August 7, 2018·5 cites

TL;DR

This paper introduces DS-MLR, a distributed stochastic gradient descent method that leverages double-separability to efficiently scale multinomial logistic regression to massive datasets with many classes, overcoming computational and storage challenges.

Contribution

The paper presents a novel distributed optimization algorithm exploiting double-separability for simultaneous data and model parallelism in multinomial logistic regression.

Findings

01. Scales to datasets that are hundreds of gigabytes in size

02. Achieves data and model parallelism simultaneously

03. Handles extreme multi-class classification on the Reddit dataset

Abstract

Scaling multinomial logistic regression to datasets with a very large number of data points and classes is challenging. This is primarily because one needs to compute the log-partition function on every data point, which makes distributing the computation hard. In this paper, we present a distributed stochastic gradient descent based optimization method (DS-MLR) for scaling up multinomial logistic regression problems to massive-scale datasets without hitting any storage constraints on the data and model parameters. Our algorithm exploits double-separability, an attractive property that allows us to achieve both data and model parallelism simultaneously. In addition, we introduce a non-blocking and asynchronous variant of our algorithm that avoids bulk-synchronization. We demonstrate the versatility of DS-MLR in various scenarios of data and model parallelism, through an extensive…
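To make the bottleneck named in the abstract concrete, the sketch below computes the standard multinomial logistic regression loss for a single data point. It is not the DS-MLR algorithm and is not taken from the paper's repository; the names `log_partition`, `mlr_nll`, `W`, `x`, and `y` are illustrative assumptions. The point it shows is that the log-partition term sums over the scores of every class, so evaluating one example touches all class weight vectors, which is what makes a naive split of the model across machines difficult.

```python
import math


def log_partition(scores):
    """Numerically stable log(sum_k exp(scores[k])) over all classes."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    return m + math.log(sum(math.exp(s - m) for s in scores))


def mlr_nll(W, x, y):
    """Negative log-likelihood of one example under standard MLR.

    W: list of per-class weight vectors (hypothetical layout, one row per class)
    x: feature vector of the example
    y: index of the true class
    """
    # One score per class: every row of W is needed for a single example.
    scores = [sum(w_j * x_j for w_j, x_j in zip(w_k, x)) for w_k in W]
    # The loss couples all classes through the log-partition function.
    return log_partition(scores) - scores[y]
```

With all-zero weights every class is equally likely, so `mlr_nll` reduces to the logarithm of the number of classes, which is a convenient sanity check when experimenting with the sketch.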

Code & Models

Repositories

https://bitbucket.org/params/dsmlr (official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Face and Expression Recognition · Machine Learning and ELM

Methods: Logistic Regression