Elastic CoCoA: Scaling In to Improve Convergence

Michael Kaufmann; Thomas Parnell; Kornilios Kourtis

arXiv:1811.02322·cs.LG·November 7, 2018·1 cites

Open Access

TL;DR

This paper introduces Chicle, an elastic framework for CoCoA that dynamically adjusts the number of workers during training to maximize the convergence rate, reducing time-to-accuracy by up to 5.96x compared to the best static setting across 6 datasets.

Contribution

We propose Chicle, an elastic framework that adjusts the number of CoCoA workers during training, based on feedback from the training algorithm, to improve convergence speed and to select a suitable worker count automatically.

Findings

01

Chicle accelerates time-to-accuracy by up to 5.96x compared to the best static setting.

02

The number of workers that yields the highest convergence rate changes over the course of training, and Chicle adjusts the worker count accordingly.

03

Chicle finds an optimal or near-optimal worker setting automatically in most cases.

Abstract

In this paper, we experimentally analyze the convergence behavior of CoCoA and show that the number of workers required to achieve the highest convergence rate at any point in time changes over the course of training. Based on this observation, we build Chicle, an elastic framework that dynamically adjusts the number of workers based on feedback from the training algorithm in order to select the number of workers that results in the highest convergence rate. In our evaluation on 6 datasets, we show that Chicle accelerates time-to-accuracy by a factor of up to 5.96x compared to the best static setting, while being robust enough to find an optimal or near-optimal setting automatically in most cases.
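
The feedback loop described in the abstract can be illustrated with a minimal sketch. It is not Chicle's actual policy, which the abstract does not specify. It assumes a hypothetical callback `run_epoch(k)` that trains one epoch on `k` workers and returns the loss and the elapsed seconds, and it uses an assumed heuristic: halve the worker count (scale in) whenever the measured convergence rate falls below a fraction `drop_threshold` of the previous epoch's rate. All names and the threshold are illustrative.

```python
from typing import Callable, List, Optional, Tuple


def convergence_rate(prev_loss: float, loss: float, seconds: float) -> float:
    """Loss reduction per second of wall-clock time."""
    return (prev_loss - loss) / seconds


def elastic_schedule(
    run_epoch: Callable[[int], Tuple[float, float]],
    initial_loss: float,
    max_workers: int,
    epochs: int,
    min_workers: int = 1,
    drop_threshold: float = 0.5,
) -> List[int]:
    """Return the worker count used in each epoch.

    run_epoch(k) is a hypothetical callback: it trains one epoch on k
    workers and returns (loss_after_epoch, seconds_taken).
    """
    workers = max_workers
    prev_loss = initial_loss
    prev_rate: Optional[float] = None
    history: List[int] = []
    for _ in range(epochs):
        history.append(workers)
        loss, seconds = run_epoch(workers)
        rate = convergence_rate(prev_loss, loss, seconds)
        # Assumed heuristic (not taken from the paper): scale in by halving
        # the worker count when the rate drops below a fraction of the
        # previous epoch's rate.
        if prev_rate is not None and rate < drop_threshold * prev_rate:
            workers = max(min_workers, workers // 2)
        prev_loss, prev_rate = loss, rate
    return history
```

In practice, `run_epoch` would wrap one round of distributed training and a loss evaluation; here it is left abstract so that the control logic can be tested on its own with a stub.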

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Machine Learning and Data Classification