A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models

Jian Gu; Aldeida Aleti; Chunyang Chen; Hongyu Zhang

arXiv:2406.11753·cs.CL·June 3, 2025

TL;DR

This paper introduces a semantic-aware, layer-freezing method for more computation-efficient fine-tuning of language models by identifying which layers to fine-tune based on their contribution to loss reduction.

Contribution

It pioneers a layer-level approach to reduce fine-tuning costs by analyzing semantic deviations and estimating layer gains, complementing existing parameter-efficient methods.

Findings

01. Outperforms existing baselines in both efficiency and effectiveness

02. Effective across various language models and datasets

03. Offers a practical approach that is orthogonal to, and can complement, existing parameter-efficient finetuning methods

Abstract

Finetuning language models (LMs) is crucial for adapting the models to downstream data and tasks. However, full finetuning is usually costly. Existing work, such as parameter-efficient finetuning (PEFT), often focuses on "how to finetune" but neglects the issue of "where to finetune". As a pioneering work on reducing the cost of backpropagation (at the layer level) by answering where to finetune, we conduct a semantic analysis of the LM inference process. We first propose using transition traces of the latent representation to compute deviations (or loss). Then, using a derived formula of scaling law, we estimate the gain of each layer in reducing deviation (or loss). Further, we narrow down the scope for finetuning and study the cost-benefit balance of LM finetuning. We perform extensive experiments across well-known LMs and datasets. The results show that our…
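To make the "where to finetune" idea concrete, here is a minimal Python sketch of gain-based layer freezing. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not give the deviation measure or the scaling-law formula, so the per-layer deviations below are made-up numbers, the gain is taken simply as the drop in deviation across a layer, and the `coverage` threshold rule is a hypothetical stand-in for the paper's cost-benefit analysis. The function names are likewise invented for this example.

```python
from typing import List


def layer_gains(deviations: List[float]) -> List[float]:
    """Gain of layer i, taken here as the drop in deviation across it.

    deviations[0] is the (hypothetical) deviation at the embedding output;
    deviations[i + 1] is the deviation after layer i.
    """
    return [deviations[i] - deviations[i + 1] for i in range(len(deviations) - 1)]


def first_trainable_layer(gains: List[float], coverage: float = 0.8) -> int:
    """Return the index where the trainable suffix of layers starts.

    Assumed heuristic: walk from the top layer downwards and stop once the
    accumulated gain reaches `coverage` of the total gain. Everything below
    the returned index is frozen, so backpropagation can stop there.
    """
    total = sum(gains)
    if total <= 0:
        return 0  # no usable signal: fall back to training every layer
    running = 0.0
    for start in range(len(gains) - 1, -1, -1):
        running += gains[start]
        if running >= coverage * total:
            return start
    return 0


def frozen_mask(num_layers: int, start: int) -> List[bool]:
    """True for layers that stay frozen (all layers below `start`)."""
    return [i < start for i in range(num_layers)]


if __name__ == "__main__":
    # Made-up deviations for a 5-layer model (6 measurement points).
    deviations = [10.0, 9.5, 9.0, 7.0, 4.0, 1.0]
    gains = layer_gains(deviations)
    start = first_trainable_layer(gains, coverage=0.8)
    print("gains:", gains)
    print("first trainable layer:", start)
    print("frozen mask:", frozen_mask(len(gains), start))
```

Freezing a contiguous block of bottom layers is what saves backpropagation cost in this sketch: gradients do not need to be propagated below the lowest trainable layer. In a real training framework the mask would be applied by disabling gradient computation for the frozen layers' parameters.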

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques