Breaking the Memory Wall for Heterogeneous Federated Learning via Progressive Training

Yebo Wu; Li Li; Chengzhong Xu

arXiv:2404.13349·cs.DC·May 22, 2025·1 cites

TL;DR

ProFL is a progressive training framework for federated learning. It trains model blocks sequentially and uses a novel metric to assess when each block has converged, which reduces memory usage and improves accuracy.

Contribution

The paper proposes a novel progressive training approach with a new metric to efficiently train models in federated learning under memory constraints, enabling deployment on heterogeneous devices.

Findings

1. Reduces peak memory footprint by up to 57.4%.
2. Improves model accuracy by up to 82.4%.
3. Theoretically proves convergence of the proposed method.

Abstract

This paper presents ProFL, a new framework that effectively addresses the memory constraints in FL. Rather than updating the full model during local training, ProFL partitions the model into blocks based on its original architecture and trains each block in a progressive fashion. It first trains the front blocks and safely freezes them after convergence. Training of the next block is then triggered. This process progressively grows the model to be trained until the training of the full model is completed. In this way, the peak memory footprint is effectively reduced for feasible deployment on heterogeneous devices. In order to preserve the feature representation of each block, the training process is divided into two stages: model shrinking and model growing. During the model shrinking stage, we meticulously design corresponding output modules to assist each block in learning the…
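The block-by-block schedule described in the abstract (train the front block, freeze it after convergence, then trigger the next block) can be sketched as a simple control loop. This is a minimal illustration rather than the authors' implementation: the names `progressive_train`, `plateau`, and `toy_round` are hypothetical, the loss-plateau check is only a stand-in for the paper's convergence metric (which is not specified here), and the toy training round just halves a per-block loss. The output modules and the shrinking and growing stages are not modeled.

```python
from typing import Callable, Dict, List


def progressive_train(
    num_blocks: int,
    train_one_round: Callable[[int], float],
    has_converged: Callable[[List[float]], bool],
    max_rounds_per_block: int = 100,
) -> List[int]:
    """Train blocks front to back and return the rounds spent on each block."""
    rounds_used: List[int] = []
    for block in range(num_blocks):
        # Blocks 0..block-1 are treated as frozen; only `block` is updated.
        history: List[float] = []
        for _ in range(max_rounds_per_block):
            history.append(train_one_round(block))
            if has_converged(history):
                break  # freeze this block and move on to the next one
        rounds_used.append(len(history))
    return rounds_used


def plateau(history: List[float], window: int = 3, tol: float = 1e-2) -> bool:
    """Assumed convergence test: loss changed by less than `tol` over `window` rounds."""
    if len(history) <= window:
        return False
    return abs(history[-window - 1] - history[-1]) < tol


# Toy stand-in for one federated round: each call halves the block's loss.
losses: Dict[int, float] = {b: 1.0 for b in range(3)}


def toy_round(block: int) -> float:
    losses[block] *= 0.5
    return losses[block]


rounds = progressive_train(3, toy_round, plateau)
print(rounds)
```

In a real system `train_one_round` would run local training on the clients and aggregate the result on the server for the single active block. Only that block would then need gradients and optimizer state on the device, which is how the abstract explains the reduced peak memory footprint.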

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Brain Tumor Detection and Classification