EdgeFlowerTune: Evaluating Federated LLM Fine-Tuning Under Realistic Edge System Constraints
Jiaxiang Geng, Yiyi Lu, Lunyu Zhao, Yan Gao, Nicholas D. Lane, Bing Luo

TL;DR
EdgeFlowerTune is a benchmark for evaluating federated LLM fine-tuning on real edge devices. It measures model quality alongside system-level factors such as latency, energy, and robustness.
Contribution
It introduces a deployment-oriented benchmark with protocols and a real-device platform to assess federated LLM fine-tuning under practical edge system constraints.
Findings
Methods with similar accuracy can differ significantly in deployability.
The benchmark reveals trade-offs between model quality and system costs.
Real-device evaluation provides more realistic insights than simulation.
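The first two findings can be illustrated with a small sketch of Pareto filtering over quality and cost metrics. This is not the benchmark's actual protocol or scoring rule. The class `MethodResult`, the method names, and every number below are hypothetical and chosen only to show how methods with near-identical accuracy can be separated by their system costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodResult:
    """Hypothetical per-method summary; not the benchmark's real schema."""
    name: str
    accuracy: float   # higher is better
    latency_s: float  # wall-clock latency in seconds, lower is better
    energy_j: float   # energy in joules, lower is better


def dominates(a: MethodResult, b: MethodResult) -> bool:
    """Return True if a is no worse than b on every metric and better on one."""
    no_worse = (
        a.accuracy >= b.accuracy
        and a.latency_s <= b.latency_s
        and a.energy_j <= b.energy_j
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.latency_s < b.latency_s
        or a.energy_j < b.energy_j
    )
    return no_worse and strictly_better


def pareto_front(results: list[MethodResult]) -> list[MethodResult]:
    """Keep only the methods that no other method dominates."""
    return [
        r for r in results
        if not any(dominates(other, r) for other in results if other is not r)
    ]


# Made-up numbers for illustration only; they are not results from the paper.
results = [
    MethodResult("method_a", accuracy=0.71, latency_s=120.0, energy_j=900.0),
    MethodResult("method_b", accuracy=0.70, latency_s=60.0, energy_j=450.0),
    MethodResult("method_c", accuracy=0.70, latency_s=90.0, energy_j=700.0),
]

print([r.name for r in pareto_front(results)])  # ['method_a', 'method_b']
```

In this toy data, method_b and method_c reach the same accuracy, but method_c is dominated because it costs more latency and energy. method_a and method_b both stay on the front, which is the quality versus cost trade-off in miniature.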
Abstract
Federated fine-tuning offers a promising paradigm for adapting large language models (LLMs) on edge devices by leveraging the rich, diverse, and continuously generated data from smartphones and IoT devices without compromising user data privacy. Such edge-side adaptation can improve model personalization, robustness, and responsiveness to local contexts. However, the practical feasibility of federated LLM fine-tuning on real edge devices remains unclear, as most existing work focuses on cross-silo or simulation-based settings, overlooking the resource and runtime constraints that determine whether a method is deployable on real edge systems. We present EdgeFlowerTune, a deployment-oriented benchmark for federated LLM fine-tuning under realistic edge-system constraints. EdgeFlowerTune jointly evaluates model quality and system costs, including communication, wall-clock latency, memory…
