GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning

Sifan Zhou; Shuo Wang; Zhihang Yuan; Mingjia Shi; Yuzhang Shang; Dawei Yang

arXiv:2502.12913·cs.LG·May 30, 2025

TL;DR

GSQ-Tuning introduces a fully integer-based fine-tuning method for LLMs that eliminates floating-point operations, significantly reducing memory, power, and hardware requirements, enabling practical on-device adaptation.

Contribution

The paper proposes the Group-Shared Exponents Integer format for fully integer LLM fine-tuning, a novel approach that improves efficiency and hardware compatibility over existing methods.

Findings

1. Achieves accuracy comparable to BF16 fine-tuning.
2. Reduces memory usage by 1.85x.
3. Cuts power consumption by 5x and chip area by 11x compared to FP8.

Abstract

Large Language Model (LLM) fine-tuning has achieved remarkable results. However, traditional LLM fine-tuning approaches face significant challenges: they require large amounts of floating-point (FP) computation, raise privacy concerns when handling sensitive data, and are impractical for resource-constrained edge devices. While Parameter-Efficient Fine-Tuning (PEFT) techniques reduce the number of trainable parameters, their reliance on floating-point arithmetic creates fundamental incompatibilities with edge hardware. In this work, we introduce GSQ-Tuning, a novel framework for on-device LLM fine-tuning that eliminates the need for floating-point operations in both inference and training. At its core is the Group-Shared Exponents Integer format, which efficiently represents model parameters in integer format using shared exponents among parameter groups. When combined with LoRA-like…
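To make the idea of "shared exponents among parameter groups" concrete, here is a minimal sketch of one way such a format could work. It is not the paper's implementation: the function names `gse_quantize` and `gse_dequantize`, the parameters `group_size` and `bits`, and the rule for choosing each group's exponent (the smallest power of two that fits the group's largest magnitude into the signed integer range) are all assumptions made for illustration.

```python
import math


def gse_quantize(values, group_size=4, bits=8):
    """Quantize floats to integer mantissas plus one shared power-of-two
    exponent per group. Hypothetical sketch, not the authors' code."""
    qmax = 2 ** (bits - 1) - 1  # symmetric signed range, e.g. 127 for 8 bits
    mantissas, exponents = [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Assumed rule: smallest exponent e with max_abs / 2**e <= qmax.
        exp = math.ceil(math.log2(max_abs / qmax)) if max_abs > 0 else 0
        scale = 2.0 ** exp
        for v in group:
            q = round(v / scale)
            # Clamp to guard against rounding at the edge of the range.
            mantissas.append(max(-qmax, min(qmax, q)))
        exponents.append(exp)
    return mantissas, exponents


def gse_dequantize(mantissas, exponents, group_size=4):
    """Reconstruct approximate floats: mantissa times 2**(group exponent)."""
    return [q * 2.0 ** exponents[i // group_size]
            for i, q in enumerate(mantissas)]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 2.0, 100.0, -50.0, 3.0, 0.0]
    m, e = gse_quantize(weights, group_size=4, bits=8)
    print("mantissas:", m)
    print("exponents:", e)
    print("restored: ", gse_dequantize(m, e, group_size=4))
```

Under these assumptions, each group stores only small integers plus a single exponent, and because the scale is a power of two, rescaling can in principle be done with shifts rather than floating-point multiplies. Smaller groups track local magnitude more closely at the cost of storing more exponents.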

Taxonomy

Topics: Advancements in Photolithography Techniques