L-FGADMM: Layer-Wise Federated Group ADMM for Communication Efficient Decentralized Deep Learning

Anis Elgabli, Jihong Park, Sabbir Ahmed, and Mehdi Bennis

arXiv:1911.03654·cs.LG·July 7, 2020

TL;DR

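L-FGADMM is a decentralized deep learning algorithm in which each worker exchanges every layer of its model with its neighbors at a layer-specific period. Exchanging the largest layer less often reduces communication cost, and the skipped consensus acts as a regularizer, so test accuracy stays as high as that of federated learning.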

Contribution

This paper presents layer-wise federated group ADMM (L-FGADMM), which reduces communication by setting a separate communication period for each layer of the network, so that consensus on the largest layer can be skipped in some iterations without sacrificing accuracy.

Findings

01

Exchanging the largest layer less frequently significantly reduces communication cost without compromising convergence speed.

02

Achieves test accuracy comparable to federated learning despite decentralized and less frequent communication.

03

Intermittently skipping consensus on the largest layer has a regularizing effect, which helps the model reach test accuracy as high as federated learning.
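The communication saving in the first finding can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the layer sizes, the periods, and the function name `average_payload` are hypothetical, and it assumes the cost of one exchange is proportional to the number of parameters sent. It compares exchanging every layer in every iteration against exchanging the largest layer only once every five iterations.

```python
def average_payload(layer_sizes, periods):
    """Average number of parameters one worker sends to one neighbor per iteration.

    A layer with period T is sent once every T iterations, so it contributes
    size / T parameters per iteration on average.
    """
    return sum(size / period for size, period in zip(layer_sizes, periods))


# Hypothetical three-layer model; the middle layer is by far the largest.
layer_sizes = [1_000, 50_000, 500]

# Baseline: every layer is exchanged in every iteration.
every_iteration = average_payload(layer_sizes, [1, 1, 1])

# Layer-wise schedule (assumed values): the largest layer is exchanged every 5th iteration.
layer_wise = average_payload(layer_sizes, [1, 5, 1])

# Fraction of per-iteration traffic avoided by the layer-wise schedule.
saving = 1 - layer_wise / every_iteration
print(f"{every_iteration:.0f} -> {layer_wise:.0f} parameters per iteration "
      f"({saving:.0%} less)")
```

With these made-up numbers, most of the traffic comes from the largest layer, so lengthening only that layer's period removes most of the per-iteration payload. The actual saving depends on the architecture and the periods chosen.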

Abstract

This article proposes a communication-efficient decentralized deep learning algorithm, coined layer-wise federated group ADMM (L-FGADMM). To minimize an empirical risk, every worker in L-FGADMM periodically communicates with two neighbors, in which the periods are separately adjusted for different layers of its deep neural network. A constrained optimization problem for this setting is formulated and solved using the stochastic version of GADMM proposed in our prior work. Numerical evaluations show that by less frequently exchanging the largest layer, L-FGADMM can significantly reduce the communication cost, without compromising the convergence speed. Surprisingly, despite less exchanged information and decentralized operations, intermittently skipping the largest layer consensus in L-FGADMM creates a regularizing effect, thereby achieving the test accuracy as high as federated learning…
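The layer-wise periodic exchange described in the abstract can be sketched as a schedule. This is an illustration only, not the authors' implementation: the chain topology (so that the two end workers have a single neighbor), the helper names `neighbors` and `layers_due`, and the period values are all assumptions introduced here.

```python
def neighbors(worker, num_workers):
    """Assumed chain topology: worker n exchanges with n-1 and n+1 when they exist."""
    return [m for m in (worker - 1, worker + 1) if 0 <= m < num_workers]


def layers_due(iteration, periods):
    """Indices of the layers whose communication period divides this iteration."""
    return [layer for layer, period in enumerate(periods) if iteration % period == 0]


# Hypothetical periods: layers 0 and 2 are exchanged every iteration,
# the large layer 1 only every 4th iteration.
periods = [1, 4, 1]

# Which layers each worker would send to its neighbors in iterations 1..8.
schedule = {k: layers_due(k, periods) for k in range(1, 9)}

for k, layers in schedule.items():
    print(f"iteration {k}: exchange layers {layers}")
print("worker 2 of 4 talks to", neighbors(2, 4))
```

In iterations where a layer is not due, a worker would keep updating that layer locally and reuse the last copy it received from each neighbor. How the stale copies enter the update is defined by the paper's algorithm and is not modeled here.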

Taxonomy

Methods: Test · Alternating Direction Method of Multipliers