Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly
Herbert Woisetschläger, Alexander Isenko, Shiqiang Wang, Ruben Mayer, Hans-Arno Jacobsen

TL;DR
This paper explores the feasibility and challenges of fine-tuning large language models on edge devices using federated learning, comparing hardware efficiency with data center GPUs to identify future improvements.
Contribution
It evaluates edge computing systems' capabilities for federated LLM fine-tuning and compares their performance with data-center GPUs, highlighting areas for enhancement.
Findings
Edge systems can fine-tune LLMs, but less efficiently than data-center GPUs.
The comparison with a data-center GPU shows significant room for improving edge hardware performance.
Network utilization varies under realistic conditions, affecting training efficiency.
Abstract
Large Language Models (LLM) and foundation models are popular as they offer new opportunities for individuals and businesses to improve natural language processing, interact with data, and retrieve information faster. However, training or fine-tuning LLMs requires a vast amount of data, which can be challenging to access due to legal or technical restrictions and may require private computing resources. Federated Learning (FL) is a solution designed to overcome these challenges and expand data access for deep learning applications. This paper takes a hardware-centric approach to explore how LLMs can be brought to modern edge computing systems. Our study fine-tunes the FLAN-T5 model family, ranging from 80M to 3B parameters, using FL for a text summarization task. We provide a micro-level hardware benchmark, compare the model FLOP utilization to a state-of-the-art data center GPU, and…
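The abstract compares model FLOP utilization (MFU) between edge devices and a data center GPU. As a rough illustration of what that metric measures, the sketch below computes MFU as achieved training FLOP/s divided by the hardware's peak FLOP/s. It is not the paper's measurement code. The function name, the rule of thumb of about 6 FLOPs per parameter per token for a forward and backward pass, and all numbers in the usage example are assumptions made here for illustration. That rule of thumb is commonly quoted for dense decoder-only transformers and is only a coarse approximation for an encoder-decoder model such as FLAN-T5.

```python
def model_flop_utilization(
    n_params: float,
    tokens_per_second: float,
    peak_flops_per_second: float,
    flops_per_param_per_token: float = 6.0,
) -> float:
    """Return MFU as a fraction: achieved FLOP/s over peak FLOP/s.

    flops_per_param_per_token defaults to 6.0, an assumed rule of thumb
    (forward plus backward pass), not a value taken from the paper.
    """
    if peak_flops_per_second <= 0:
        raise ValueError("peak_flops_per_second must be positive")
    # Estimated FLOP/s the training job actually performs.
    achieved = flops_per_param_per_token * n_params * tokens_per_second
    return achieved / peak_flops_per_second


# Hypothetical numbers for illustration only: an 80M-parameter model
# processing 1,000 tokens/s on a device with 1 TFLOP/s peak throughput.
mfu = model_flop_utilization(
    n_params=80e6,
    tokens_per_second=1000.0,
    peak_flops_per_second=1e12,
)
print(f"MFU: {mfu:.2%}")
```

Because MFU normalizes throughput by the device's peak, it allows a small edge accelerator and a large data center GPU to be compared on how well each uses the compute it has, rather than on raw speed.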
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Ferroelectric and Negative Capacitance Devices · Topic Modeling
Methods: Flan-T5
