Balancing Fidelity and Plasticity: Aligning Mixed-Precision Fine-Tuning with Linguistic Hierarchies
Changhai Zhou, Shiyang Zhang, Yuhua Zhou, Qian Qiao, Jun Gao, Shichao Weng, Weizhong Zhang, Cheng Jin

TL;DR
This paper introduces QR-Adaptor, a unified framework that jointly optimizes quantization and adapter parameters in LLM fine-tuning, effectively balancing model fidelity and plasticity within resource constraints.
Contribution
It proposes a novel multi-objective optimization approach that aligns quantization and adapter ranks with linguistic hierarchies, improving resource efficiency and performance.
Findings
Achieves near-16-bit performance within a 4-bit memory budget.
Establishes a new Pareto frontier for resource-efficient fine-tuning.
Demonstrates the importance of aligning quantization with linguistic hierarchies.
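To make the "Pareto frontier" finding concrete, here is a minimal sketch of how a frontier can be extracted from a set of candidate configurations, each scored by memory use and task accuracy. The function name `pareto_front` and the numbers in the usage note are illustrative assumptions; they are not taken from the paper, and this is not the paper's search procedure.

```python
from typing import List, Tuple


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the (memory, accuracy) points that no other point dominates.

    A point dominates another if it uses no more memory and reaches at least
    the same accuracy, and is strictly better on at least one of the two.
    """
    front = []
    best_acc = float("-inf")
    # Sort by memory ascending; for equal memory, put higher accuracy first.
    for mem, acc in sorted(points, key=lambda p: (p[0], -p[1])):
        # Keep a point only if it beats every cheaper (or equally cheap) point.
        if acc > best_acc:
            front.append((mem, acc))
            best_acc = acc
    return front
```

With hypothetical candidates `[(4.0, 0.60), (5.0, 0.66), (5.0, 0.62), (8.0, 0.65), (16.0, 0.70)]`, the point `(8.0, 0.65)` is dropped because `(5.0, 0.66)` is both cheaper and more accurate.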
Abstract
Deploying and fine-tuning Large Language Models (LLMs) on resource-constrained edge devices requires navigating a strict trade-off between memory footprint and task performance. While Quantization-Aware Fine-tuning has emerged as a viable solution, existing paradigms typically decouple quantization and adapter optimization. This separation overlooks a fundamental theoretical constraint we identify as the Fidelity-Plasticity Trade-off: a layer's capacity to adapt to new tasks (Plasticity) is inherently constrained by the information capacity of its frozen weights (Fidelity). Aggressively quantizing semantically critical layers creates an information bottleneck that no amount of adapter rank can recover, while high precision in robust syntactic layers wastes valuable memory. To address this, we introduce QR-Adaptor, a unified framework that jointly optimizes per-layer…
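The memory side of the trade-off described above can be illustrated with a rough accounting sketch. It assumes each layer is a single weight matrix quantized to a chosen bit-width, paired with a LoRA-style low-rank adapter stored in 16 bits; the names `LayerConfig`, `layer_bytes`, and `model_bytes` are hypothetical, quantization metadata such as scales and zero points is ignored, and none of this is claimed to be the QR-Adaptor implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LayerConfig:
    d_in: int   # input dimension of the frozen weight matrix
    d_out: int  # output dimension of the frozen weight matrix
    bits: int   # bit-width of the quantized frozen weights (assumed uniform per layer)
    rank: int   # rank of the low-rank adapter attached to this layer


def layer_bytes(cfg: LayerConfig, adapter_bits: int = 16) -> float:
    """Approximate storage for one layer: quantized base weights plus adapter."""
    base = cfg.d_in * cfg.d_out * cfg.bits / 8
    # Assumed LoRA-style adapter: two factors of shape (d_in, rank) and (rank, d_out).
    adapter = cfg.rank * (cfg.d_in + cfg.d_out) * adapter_bits / 8
    return base + adapter


def model_bytes(layers: Iterable[LayerConfig]) -> float:
    """Total approximate storage over all layers."""
    return sum(layer_bytes(c) for c in layers)
```

Under these assumptions, moving a 4096 x 4096 layer from 4 to 8 bits costs about 8 MiB, while raising its adapter rank from 4 to 16 costs about 192 KiB, which is why the bit-width allocation dominates the budget and the rank is the cheaper knob.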
Taxonomy
Topics: Advanced Neural Network Applications · Big Data and Digital Economy · Natural Language Processing Techniques
