Accelerated Parallel Optimization Methods for Large Scale Machine Learning

Haipeng Luo; Patrick Haffner; Jean-Francois Paiement

arXiv:1411.6725 · cs.LG · November 26, 2014 · 2 cites

Open Access

TL;DR

This paper combines parallelism with Nesterov's acceleration to design faster optimization algorithms for L1-regularized loss, targeting large-scale machine learning problems with high-dimensional data.

Contribution

It proposes an accelerated version of Shotgun, a parallel coordinate descent method, and simplifies BOOM, a variant of gradient descent, studying it in a unified framework that also covers related methods.

Findings

01

Accelerated Shotgun improves the convergence rate from O(1/t) to O(1/t^2).

02

A refined measurement of sparsity improves BOOM, which is nonetheless shown to be provably slower than FISTA.

03

Unified framework simplifies analysis of parallel optimization methods.

Abstract

The growing amount of high-dimensional data in different machine learning applications requires more efficient and scalable optimization algorithms. In this work, we consider combining two techniques, parallelism and Nesterov's acceleration, to design faster algorithms for L1-regularized loss. We first simplify BOOM, a variant of gradient descent, and study it in a unified framework, which allows us not only to propose a refined measurement of sparsity to improve BOOM, but also to show that BOOM is provably slower than FISTA. Moving on to parallel coordinate descent methods, we then propose an efficient accelerated version of Shotgun, improving the convergence rate from $O(1/t)$ to $O(1/t^{2})$. Our algorithm enjoys a concise form and analysis compared to previous work, and also allows one to study several related works in a unified way.
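The abstract uses FISTA as a baseline and builds on Nesterov's acceleration for L1-regularized loss. For reference only, here is a minimal Python sketch of standard FISTA next to its unaccelerated counterpart, ISTA, on L1-regularized least squares. It is not the paper's BOOM or accelerated Shotgun algorithm. The function names, the choice of squared loss, and the fixed step size 1/L are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def lasso_objective(A, b, x, lam):
    """0.5 * ||Ax - b||^2 + lam * ||x||_1 (assumed loss for this sketch)."""
    r = A @ x - b
    return 0.5 * (r @ r) + lam * np.abs(x).sum()


def ista(A, b, lam, n_iter):
    """Plain proximal gradient descent; known O(1/t) rate."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)
        x = soft_threshold(x - grad / L, lam / L)
    return x


def fista(A, b, lam, n_iter):
    """Proximal gradient with Nesterov-style momentum; known O(1/t^2) rate."""
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    y = x.copy()  # extrapolated point where the gradient is evaluated
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)
        x_next = soft_threshold(y - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Momentum step: extrapolate along the direction of the last update.
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)
        x, t = x_next, t_next
    return x
```

A usage sketch on synthetic data (hypothetical sizes and regularization weight): build a random `A`, set `b = A @ x_true` for a sparse `x_true`, then compare `lasso_objective` after the same number of `ista` and `fista` iterations. The only difference between the two loops is the momentum step, which is the ingredient the paper carries over to the parallel setting.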

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Sparse and Compressive Sensing Techniques