Breaking the Capacity Bottleneck in Model-Heterogeneous Federated Learning via Gradual Model Restoration

Chengjie Ma; Seungeun Oh; Jihong Park; Seong-Lyun Kim

arXiv:2512.05372 · cs.DC · May 12, 2026

TL;DR

FedGMR introduces a gradual model restoration approach in federated learning to improve convergence and accuracy in heterogeneous, bandwidth-constrained environments by progressively increasing client sub-model density.

Contribution

The paper proposes FedGMR, a federated learning framework built around Gradual Model Restoration (GMR), which progressively increases each client's sub-model density during training so that bandwidth-constrained clients remain effective contributors throughout optimization.

Findings

01. FedGMR converges faster than approaches that keep sub-models at a fixed size.

02. It achieves higher final accuracy on multiple datasets under heterogeneity.

03. Theoretical analysis establishes convergence guarantees and error bounds.

Abstract

Federated learning (FL) enables distributed model training, yet in heterogeneous deployments, Bandwidth-Constrained Clients (BCCs) often contribute inefficiently due to limited uplink bandwidth. In model-heterogeneous FL with fixed small sub-models, BCCs may improve quickly in early rounds but become under-parameterized later, resulting in slow convergence and poor generalization. To address this challenge, we propose FedGMR, a federated learning framework centered around Gradual Model Restoration (GMR), where GMR progressively increases each client's sub-model density during training, allowing BCCs to remain effective contributors throughout optimization. To make GMR practical under real-world heterogeneity, FedGMR is realized as an end-to-end workflow with asynchronous coordination and stable, mask-aware aggregation. We further establish convergence guarantees, showing that the…
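
The two mechanisms named in the abstract, a density that grows over rounds and an aggregation step that respects each client's mask, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the schedule, how sub-model coordinates are chosen, or the exact aggregation rule, so `density_schedule` (a linear ramp), `prefix_mask` (keep the leading coordinates), and `mask_aware_average` (per-coordinate mean over covering clients) are hypothetical names and choices, not the authors' method. Asynchronous coordination is omitted entirely.

```python
from typing import List, Sequence


def density_schedule(round_idx: int, total_rounds: int,
                     start_density: float, end_density: float = 1.0) -> float:
    """Hypothetical linear ramp from start_density to end_density."""
    if total_rounds <= 1:
        return end_density
    # Clamp progress to [0, 1] so out-of-range rounds stay within bounds.
    frac = min(max(round_idx / (total_rounds - 1), 0.0), 1.0)
    return start_density + frac * (end_density - start_density)


def prefix_mask(num_params: int, density: float) -> List[int]:
    """Assumed selection rule: keep the first round(density * n) coordinates."""
    kept = max(1, min(num_params, round(density * num_params)))
    return [1] * kept + [0] * (num_params - kept)


def mask_aware_average(global_params: Sequence[float],
                       client_params: Sequence[Sequence[float]],
                       client_masks: Sequence[Sequence[int]]) -> List[float]:
    """Average each coordinate over only the clients whose mask covers it.

    Coordinates that no client covers keep their previous global value.
    """
    merged = []
    for j, old_value in enumerate(global_params):
        covered = [p[j] for p, m in zip(client_params, client_masks) if m[j]]
        merged.append(sum(covered) / len(covered) if covered else old_value)
    return merged


# Toy example: one bandwidth-constrained client at density 0.5, one full client.
global_params = [0.0, 0.0, 0.0, 0.0]
masks = [prefix_mask(4, 0.5), prefix_mask(4, 1.0)]
clients = [[1.0, 1.0, 0.0, 0.0], [3.0, 3.0, 2.0, 2.0]]
print(mask_aware_average(global_params, clients, masks))  # [2.0, 2.0, 2.0, 2.0]
```

In the toy example, the last two coordinates are averaged over the full client only, so the zeros in the constrained client's untrained positions do not drag the result down. A plain mean over all clients would have produced 1.0 there instead of 2.0, which is the kind of bias a mask-aware rule is meant to avoid.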
