HAFLQ: Heterogeneous Adaptive Federated LoRA Fine-tuned LLM with Quantization

Yang Su, Na Yan, Yansha Deng, Mischa Dohler, and Robert Schober

arXiv:2411.06581 · cs.LG · May 19, 2025

TL;DR

HAFLQ is a federated fine-tuning framework for LLMs that combines adaptive quantization, importance-based parameter management, and efficient aggregation to reduce resource usage and improve accuracy in heterogeneous environments.

Contribution

It introduces novel adaptive quantization, parameter truncation, bandwidth-aware quantization, and matrix aggregation strategies for scalable federated LLM fine-tuning.

Findings

01. Reduces memory usage by 31%
02. Lowers communication cost by 49%
03. Improves accuracy by 50%

Abstract

Federated fine-tuning of pre-trained Large Language Models (LLMs) enables task-specific adaptation across diverse datasets while preserving privacy. However, challenges such as high computational and memory demands, heterogeneous client resources, bandwidth constraints, and ineffective global aggregation hinder its efficiency. To address these issues, we propose HAFLQ (Heterogeneous Adaptive Federated Low-Rank Adaptation Fine-tuned LLM with Quantization), a novel framework for efficient and scalable federated fine-tuning of LLMs in heterogeneous environments. To reduce memory and computation demands, we propose a salience-driven adaptive LLM quantization framework that evaluates the importance of transformer blocks using a salience metric and applies adaptive block-wise quantization accordingly. To handle heterogeneous computational capabilities, we propose an importance-based parameter…
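The salience-driven, block-wise quantization idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the salience scores are supplied as plain numbers (the paper's salience metric is not given in this summary), the two bit-widths and the `keep_fraction` split are hypothetical parameters, and the quantizer is an ordinary symmetric uniform quantizer rather than the authors' scheme. The function names `assign_bit_widths` and `quantize_block` are likewise hypothetical.

```python
from typing import List


def assign_bit_widths(
    salience: List[float],
    high_bits: int = 8,         # assumed precision for salient blocks
    low_bits: int = 4,          # assumed precision for the remaining blocks
    keep_fraction: float = 0.5  # assumed share of blocks kept at high precision
) -> List[int]:
    """Give the most salient blocks `high_bits` and all others `low_bits`."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    n_high = round(len(salience) * keep_fraction)
    # Block indices ordered from most to least salient.
    ranked = sorted(range(len(salience)), key=lambda i: salience[i], reverse=True)
    high = set(ranked[:n_high])
    return [high_bits if i in high else low_bits for i in range(len(salience))]


def quantize_block(weights: List[float], bits: int) -> List[float]:
    """Symmetric uniform quantization followed by dequantization.

    Weights are snapped to a signed integer grid with 2**(bits - 1) - 1
    positive levels, then mapped back to floats.
    """
    levels = 2 ** (bits - 1) - 1
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0 or levels == 0:
        return list(weights)
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]


if __name__ == "__main__":
    # Hypothetical salience scores for four transformer blocks.
    salience = [0.9, 0.1, 0.5, 0.3]
    blocks = [[0.3, -0.7, 0.05]] * 4
    bit_widths = assign_bit_widths(salience)
    quantized = [quantize_block(b, bits) for b, bits in zip(blocks, bit_widths)]
    print(bit_widths)
    print(quantized)
```

In this sketch, blocks judged more important keep a finer quantization grid, while less important blocks absorb most of the compression. A real system would compute salience from the model and data and would quantize tensors rather than Python lists.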

Taxonomy

Topics: IoT and Edge/Fog Computing · Energy Efficient Wireless Sensor Networks · Wireless Sensor Networks for Data Analysis