Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures
Yicheng Zhang, Zhen Qin, Zhaomin Wu, Jian Hou, Shuiguang Deng

TL;DR
This paper introduces FedAMoLE, a federated learning framework that personalizes large language model fine-tuning by enabling data-driven heterogeneous architectures, improving client performance while maintaining efficiency.
Contribution
FedAMoLE is the first federated learning approach to support heterogeneous model architectures tailored to client data distributions, enhancing personalization and performance.
Findings
FedAMoLE improves client performance by an average of 5.97%.
It maintains practical memory, communication, and computation overhead.
Experiments across seven scenarios validate its effectiveness.
Abstract
Large language models (LLMs) are increasingly powering web-based applications, whose effectiveness relies on fine-tuning with large-scale instruction data. However, such data often contains valuable or sensitive information that limits its public sharing among business organizations. Federated learning (FL) enables collaborative fine-tuning of LLMs without accessing raw data. Existing approaches to federated LLM fine-tuning usually adopt a uniform model architecture, making it challenging to fit highly heterogeneous client-side data in varying domains and tasks, e.g., hospitals and financial institutions conducting federated fine-tuning may require different LLM architectures due to the distinct nature of their domains and tasks. To address this, we propose FedAMoLE, a lightweight personalized FL framework that enables data-driven heterogeneous model architectures. It features a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Data Storage Technologies
MethodsALIGN
