Randomized Block-Diagonal Preconditioning for Parallel Learning

Celestine Mendler-Dünner; Aurelien Lucchi

arXiv:2006.13591·cs.LG·December 8, 2020

TL;DR

This paper introduces a randomized repartitioning technique for block-diagonal preconditioning in gradient-based optimization, significantly enhancing convergence speed in parallel machine learning tasks.

Contribution

It demonstrates that random repartitioning of coordinates improves convergence of block-diagonal preconditioned methods, supported by theoretical analysis and empirical validation.

Findings

01. Repartitioning leads to faster convergence in optimization.

02. Theoretical analysis quantifies expected convergence improvements.

03. Empirical results confirm efficiency gains on various tasks.

Abstract

We study preconditioned gradient-based optimization methods where the preconditioning matrix has block-diagonal form. Such a structural constraint comes with the advantage that the update computation is block-separable and can be parallelized across multiple independent tasks. Our main contribution is to demonstrate that the convergence of these methods can significantly be improved by a randomization technique which corresponds to repartitioning coordinates across tasks during the optimization procedure. We provide a theoretical analysis that accurately characterizes the expected convergence gains of repartitioning and validate our findings empirically on various traditional machine learning tasks. From an implementation perspective, block-separable models are well suited for parallelization and, when shared memory is available, randomization can be implemented on top of existing…
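
To make the idea in the abstract concrete, here is a minimal sketch of a block-diagonal preconditioned gradient step, with and without random repartitioning. It is an illustration under stated assumptions, not the authors' implementation: the objective is assumed to be a quadratic f(x) = 0.5 xᵀAx − bᵀx with symmetric positive definite A, the preconditioner is assumed to be the block-diagonal part of A under the current partition, and the step size 1/K (K = number of blocks) is a conservative choice made here so that every step decreases the objective. The function names `block_precond_step`, `run`, and `objective` are hypothetical.

```python
import numpy as np


def block_precond_step(A, b, x, blocks, eta):
    """One preconditioned gradient step on f(x) = 0.5 x^T A x - b^T x.

    The preconditioner is the block-diagonal part of A induced by `blocks`
    (an assumption of this sketch, not a claim about the paper's algorithm).
    """
    grad = A @ x - b
    step = np.zeros_like(x)
    for idx in blocks:
        # Each block solves its own small linear system. The blocks do not
        # interact, so separate workers could process them in parallel.
        step[idx] = np.linalg.solve(A[np.ix_(idx, idx)], grad[idx])
    return x - eta * step


def run(A, b, n_blocks, n_iters, repartition, seed=0):
    """Run the method from x = 0 with a fixed or re-drawn partition."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Fixed partition: contiguous chunks of coordinates.
    blocks = np.array_split(np.arange(n), n_blocks)
    x = np.zeros(n)
    eta = 1.0 / n_blocks  # conservative step size assumed for this sketch
    for _ in range(n_iters):
        if repartition:
            # Randomization: reassign coordinates to blocks every iteration.
            blocks = np.array_split(rng.permutation(n), n_blocks)
        x = block_precond_step(A, b, x, blocks, eta)
    return x


def objective(A, b, x):
    return 0.5 * x @ A @ x - b @ x


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.standard_normal((40, 20))
    A = M.T @ M + 0.1 * np.eye(20)  # symmetric positive definite
    b = rng.standard_normal(20)
    for flag in (False, True):
        x = run(A, b, n_blocks=4, n_iters=50, repartition=flag)
        print("repartition =", flag, "objective =", objective(A, b, x))
```

Both variants use the same per-block solves; the only difference is whether the coordinate-to-block assignment is re-drawn each iteration. How much repartitioning helps depends on the problem, so the sketch prints both objective values rather than asserting a particular gap.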

Videos

Randomized Block-Diagonal Preconditioning for Parallel Learning · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Sparse and Compressive Sensing Techniques