Distributed Parameter Map-Reduce

Qi Li

arXiv:1510.00817·cs.DC·October 6, 2015

TL;DR

This paper introduces a distributed map-reduce approach for logistic regression that efficiently handles large-scale data by distributing both samples and parameters across nodes, enabling scalable training in Hadoop environments.

Contribution

It proposes a novel Distributed Parameter Map-Reduce method that distributes parameters alongside data, facilitating scalable logistic regression training in distributed systems.

Findings

01. Linear acceleration with increasing cluster nodes

02. Effective logistic regression training on large datasets

03. Demonstrated in Hadoop production environment

Abstract

This paper describes how to convert a machine learning problem into a series of map-reduce tasks. We study the logistic regression algorithm. Logistic regression assumes that samples are independent and assigns each sample a probability. Parameters are obtained by maximizing the product of all sample probabilities. The rapid growth of training sets poses a challenge for machine learning methods: there are so many training samples that they can only be stored in a distributed file system and processed by map-reduce style programs. The main step of logistic regression is inference. In the map-reduce spirit, each sample performs inference in a separate map procedure, but inference presupposes that the map procedure holds the parameters for all features in the sample. In this paper, we propose Distributed Parameter Map-Reduce, in which not only samples, but also parameters…
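
The abstract is cut off before it describes the method, so the following is only an illustrative sketch of the general idea it points at, not the paper's algorithm. It assumes sparse samples, parameters keyed by feature id, and one gradient step built from three grouping stages that each stand in for a map-reduce job. All names (`group_by_key`, `gradient_step`, `log_likelihood`) and the data layout are hypothetical. The point it tries to show is that no stage needs the full parameter vector: each feature key only touches its own weight.

```python
import math
from collections import defaultdict


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def group_by_key(pairs):
    """Stand-in for the shuffle phase: collect values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def gradient_step(samples, weights, learning_rate=0.1):
    """One gradient step on the negative log-likelihood.

    samples: {sample_id: (label in {0, 1}, {feature_id: value})}
    weights: {feature_id: float}; missing features count as 0.0
    (both layouts are assumptions made for this sketch).
    """
    # Stage 1 (map): re-key every feature occurrence by feature id,
    # so each key meets exactly one parameter.
    by_feature = group_by_key(
        (fid, (sid, x))
        for sid, (_, feats) in samples.items()
        for fid, x in feats.items()
    )
    # Stage 1 (reduce): per feature, emit the partial product w_j * x_ij
    # keyed by sample id.
    partial = group_by_key(
        (sid, weights.get(fid, 0.0) * x)
        for fid, occurrences in by_feature.items()
        for sid, x in occurrences
    )
    # Stage 2 (reduce): per sample, sum the partial products and compute
    # the prediction error p_i - y_i.
    errors = {
        sid: sigmoid(sum(parts)) - samples[sid][0]
        for sid, parts in partial.items()
    }
    # Stage 3 (reduce): per feature, sum error * value to get the gradient
    # component and update that one weight.
    new_weights = dict(weights)
    for fid, occurrences in by_feature.items():
        grad = sum(errors[sid] * x for sid, x in occurrences)
        new_weights[fid] = weights.get(fid, 0.0) - learning_rate * grad
    return new_weights


def log_likelihood(samples, weights):
    """Log of the product of sample probabilities that training maximizes."""
    total = 0.0
    for label, feats in samples.values():
        p = sigmoid(sum(weights.get(fid, 0.0) * x for fid, x in feats.items()))
        total += math.log(p) if label == 1 else math.log(1.0 - p)
    return total
```

Calling `gradient_step` repeatedly on a small dictionary of samples and feeding the returned weights back in should raise `log_likelihood` for a small enough learning rate. In a real Hadoop job each stage would be a separate mapper and reducer, and the in-memory dictionaries here would be replaced by files in the distributed file system.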

Taxonomy

Topics: Graph Theory and Algorithms · Advanced Image and Video Retrieval Techniques · Advanced Graph Neural Networks

Methods: Logistic Regression