Lion Cub: Minimizing Communication Overhead in Distributed Lion

Satoki Ishikawa; Tal Ben-Nun; Brian Van Essen; Rio Yokota; Nikoli Dryden

arXiv:2411.16462 · cs.LG · July 8, 2025

Open Access

TL;DR

Lion Cub introduces a communication-efficient distributed training method for the Lion optimizer, achieving up to 5x speedups by combining tailored quantization and selective momentum synchronization.

Contribution

The paper presents Lion Cub, a novel approach that reduces communication overhead in distributed Lion training through optimized quantization and momentum synchronization techniques.

Findings

1. Up to 5x speedup in training time.
2. Effective quantization methods for Lion.
3. Reduced communication costs without sacrificing convergence.

Abstract

Communication overhead is a key challenge in distributed deep learning, especially on slower Ethernet interconnects, and given current hardware trends, communication is likely to become a major bottleneck. While gradient compression techniques have been explored for SGD and Adam, the Lion optimizer has the distinct advantage that its update vectors are the output of a sign operation, enabling straightforward quantization. However, simply compressing updates for communication and using techniques like majority voting fails to lead to end-to-end speedups due to inefficient communication algorithms and reduced convergence. We analyze three factors critical to distributed learning with Lion: optimizing communication methods, identifying effective quantization methods, and assessing the necessity of momentum synchronization. Our findings show that quantization techniques adapted to Lion and…
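To make the abstract's starting point concrete, here is a minimal sketch of why Lion's sign-valued updates quantize so easily, and of the naive majority-voting baseline the abstract mentions. It is not the paper's method. The update rule follows the published Lion optimizer (sign of an interpolation between momentum and gradient); the function names, the NumPy-based bit packing, the mapping of exact zeros to +1, and the tie-breaking rule are all assumptions made for illustration.

```python
import numpy as np

def lion_local_update(grad, momentum, beta1=0.9, beta2=0.99):
    """One worker's Lion step: returns (sign update in {-1, +1}, new momentum)."""
    interp = beta1 * momentum + (1.0 - beta1) * grad
    # Assumption: exact zeros are mapped to +1 so each entry fits in one bit.
    update = np.where(interp >= 0, 1, -1).astype(np.int8)
    new_momentum = beta2 * momentum + (1.0 - beta2) * grad
    return update, new_momentum

def pack_signs(update):
    """Pack a {-1, +1} vector into bytes, one bit per entry."""
    return np.packbits(update > 0)

def unpack_signs(packed, n):
    """Inverse of pack_signs for a vector of length n."""
    bits = np.unpackbits(packed, count=n)
    return bits.astype(np.int8) * 2 - 1

def majority_vote(updates):
    """Elementwise majority over workers' sign vectors (ties resolve to +1)."""
    total = np.sum(np.stack(updates).astype(np.int32), axis=0)
    return np.where(total >= 0, 1, -1).astype(np.int8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, workers = 1000, 4
    votes = []
    for _ in range(workers):
        u, _ = lion_local_update(rng.standard_normal(n), np.zeros(n))
        packed = pack_signs(u)  # this is what would go over the wire
        votes.append(unpack_signs(packed, n))
    agreed = majority_vote(votes)
    print(packed.nbytes, "bytes per worker instead of", 4 * n, "for float32")
```

Packing cuts the payload by a factor of 32 relative to float32, but as the abstract notes, compression of this kind does not by itself produce end-to-end speedups: the communication algorithm and the effect on convergence also matter.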

Taxonomy

Topics: IoT and Edge/Fog Computing · Cloud Computing and Resource Management · Mobile Agent-Based Network Management

Methods: Adam · Evolved Sign Momentum · Stochastic Gradient Descent