Q-LocalAdam: Memory-Efficient Client-Side Adaptive Optimization for Edge Federated Learning
Vedant Waykole, Haroon R. Lone

TL;DR
Q-LocalAdam is a memory-efficient adaptive optimizer for edge federated learning that uses distribution-aware quantization to reduce memory overhead without sacrificing accuracy.
Contribution
It introduces a novel distribution-aware 8-bit quantization method for momentum and variance in Adam, enabling significant memory savings on resource-constrained devices.
Findings
Achieves 3.37x optimizer memory reduction with no accuracy loss under moderate heterogeneity.
Significantly improves accuracy under extreme data heterogeneity, e.g., +5.74 percentage points on CIFAR-100.
Distribution-aware quantization outperforms naive uniform quantization, which degrades performance.
Abstract
Federated learning on edge devices must cope with non-IID client data and tight memory budgets. Adaptive optimizers like Adam stabilize training under data heterogeneity but require storing full-precision momentum and variance states, often tripling client memory overhead. This limits deployable model sizes and concurrent federated jobs on resource-constrained devices. We empirically observe that momentum and variance in federated Adam exhibit fundamentally different statistical properties: momentum values are symmetric and bounded, while variance spans eight orders of magnitude with log-normal structure. Motivated by this asymmetry, we propose Q-LocalAdam, which applies distribution-aware 8-bit quantization (block-wise linear encoding for momentum and log-space encoding for variance) while keeping model parameters in full precision. Across CIFAR-10 and CIFAR-100 under…
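The two encodings named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the block size of 64, the per-tensor log range for variance, the `eps` floor, and all function names are assumptions made here for illustration only.

```python
import numpy as np

BLOCK = 64  # hypothetical block size; the abstract does not state one


def quantize_momentum(m, block=BLOCK):
    """Block-wise symmetric linear quantization of momentum to int8."""
    flat = m.astype(np.float32).ravel()
    pad = (-flat.size) % block              # zero-pad to a multiple of the block size
    blocks = np.pad(flat, (0, pad)).reshape(-1, block)
    # One scale per block, chosen so the largest magnitude maps to 127.
    scale = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale).astype(np.float32)
    q = np.clip(np.round(blocks / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize_momentum(q, scale, shape):
    """Invert quantize_momentum and drop the padding."""
    size = int(np.prod(shape))
    return (q.astype(np.float32) * scale).ravel()[:size].reshape(shape)


def quantize_variance(v, eps=1e-30):
    """Log-space quantization of (non-negative) variance to uint8."""
    # eps is an assumed floor that keeps log() finite for zero entries.
    logv = np.log(np.maximum(v.astype(np.float32), eps))
    lo, hi = float(logv.min()), float(logv.max())
    span = hi - lo if hi > lo else 1.0
    q = np.round((logv - lo) / span * 255.0).astype(np.uint8)
    return q, lo, span


def dequantize_variance(q, lo, span):
    """Invert quantize_variance back to linear space."""
    return np.exp(q.astype(np.float32) / 255.0 * span + lo)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.normal(0.0, 0.01, size=1000).astype(np.float32)
    v = (10.0 ** rng.uniform(-10, -2, size=1000)).astype(np.float32)

    qm, s = quantize_momentum(m)
    qv, lo, span = quantize_variance(v)
    print("max abs momentum error:", np.abs(m - dequantize_momentum(qm, s, m.shape)).max())
    print("max rel variance error:", np.abs(dequantize_variance(qv, lo, span) / v - 1).max())
```

The design point the sketch tries to show: a linear code spends its 256 levels evenly across the value range, which suits symmetric, bounded momentum, whereas a quantity spanning many orders of magnitude would collapse its small values to zero under the same scheme. Encoding the logarithm instead bounds the relative error rather than the absolute error.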
