Balancing Fidelity and Plasticity: Aligning Mixed-Precision Fine-Tuning with Linguistic Hierarchies

Changhai Zhou; Shiyang Zhang; Yuhua Zhou; Qian Qiao; Jun Gao; Shichao Weng; Weizhong Zhang; Cheng Jin

arXiv:2505.03802 · cs.LG · January 6, 2026

TL;DR

This paper introduces QR-Adaptor, a unified framework that jointly optimizes quantization and adapter parameters in LLM fine-tuning, effectively balancing model fidelity and plasticity within resource constraints.

Contribution

The paper proposes a multi-objective optimization approach that aligns per-layer quantization and adapter ranks with linguistic hierarchies, improving resource efficiency and task performance.

Findings

01

Achieves near-16-bit performance within a 4-bit memory budget.

02

Establishes a new Pareto frontier for resource-efficient fine-tuning.

03

Demonstrates the importance of aligning quantization with linguistic hierarchies.
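
To make the "Pareto frontier" finding concrete, the following is a minimal sketch of how a frontier can be extracted from a set of candidate configurations. It is not the paper's search procedure: the function name `pareto_front` and the (memory, accuracy) pairs are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A candidate configuration summarized as (memory_gb, accuracy).
# All numbers used with this sketch are made up for illustration.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point beats on both memory and accuracy."""
    front: List[Point] = []
    best_acc = float("-inf")
    # Walk candidates from smallest to largest memory; on memory ties,
    # visit the most accurate one first.
    for mem, acc in sorted(points, key=lambda p: (p[0], -p[1])):
        # A candidate survives only if it is strictly more accurate than
        # every candidate that uses the same or less memory.
        if acc > best_acc:
            front.append((mem, acc))
            best_acc = acc
    return front


candidates = [(4.0, 0.60), (5.0, 0.58), (6.0, 0.65), (4.0, 0.55)]
front = pareto_front(candidates)
```

Here `(5.0, 0.58)` and `(4.0, 0.55)` are dropped because `(4.0, 0.60)` is at least as cheap and more accurate. A method "establishes a new frontier" when its configurations survive this filter against the baselines' configurations.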

Abstract

Deploying and fine-tuning Large Language Models (LLMs) on resource-constrained edge devices requires navigating a strict trade-off between memory footprint and task performance. While Quantization-Aware Fine-tuning has emerged as a viable solution, existing paradigms typically decouple quantization and adapter optimization. This separation overlooks a fundamental theoretical constraint we identify as the Fidelity-Plasticity Trade-off: a layer's capacity to adapt to new tasks (Plasticity) is inherently constrained by the information capacity of its frozen weights (Fidelity). Aggressively quantizing semantically critical layers creates an information bottleneck that no amount of adapter rank can recover, while high precision in robust syntactic layers wastes valuable memory. To address this, we introduce QR-Adaptor, a unified framework that jointly optimizes per-layer…
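
The memory side of this trade-off can be illustrated with simple arithmetic. The sketch below estimates the storage for one linear layer given a bit-width for its frozen weights and a low-rank adapter rank. It is an assumption-laden approximation, not the paper's cost model: the function name `layer_memory_bytes` is hypothetical, the adapter is assumed to be a LoRA-style pair of matrices stored in 16 bits, and quantization metadata (scales, zero points) is ignored.

```python
def layer_memory_bytes(d_in: int, d_out: int, bits: int, rank: int,
                       adapter_bits: int = 16) -> int:
    """Approximate bytes for one linear layer plus a low-rank adapter.

    Assumptions (not from the paper): the frozen weight matrix is stored
    at `bits` per element, the adapter is two matrices of shapes
    (d_in, rank) and (rank, d_out) stored at `adapter_bits` per element,
    and quantization scales / zero points are not counted.
    """
    base = d_in * d_out * bits // 8
    adapter = rank * (d_in + d_out) * adapter_bits // 8
    return base + adapter


# Two ways to configure the same 4096 x 4096 layer:
high_fidelity = layer_memory_bytes(4096, 4096, bits=4, rank=16)
high_plasticity = layer_memory_bytes(4096, 4096, bits=2, rank=64)
```

Under these assumptions the 4-bit, rank-16 setting costs 8,650,752 bytes and the 2-bit, rank-64 setting costs 5,242,880 bytes. The frozen weights dominate the budget, which is why the choice of which layers receive more bits matters more for memory than the choice of adapter rank.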

Taxonomy

Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Natural Language Processing Techniques