Progtuning: Progressive Fine-tuning Framework for Transformer-based Language Models
Xiaoshuang Ji, Zhendong Zhao, Xiaojun Chen, Xin Zhao, and Zeyao Liu

TL;DR
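Progtuning is a progressive fine-tuning framework for Transformer-based language models. It gradually reduces the number of Transformer blocks that are updated, based on each block's contribution, cutting the number of updated parameters by about 25% while maintaining competitive performance and improving resource efficiency.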
Contribution
Progtuning introduces a progressive learning approach that allocates updates unevenly across Transformer blocks, improving efficiency over methods that update the same number of parameters throughout.
Findings
Reduces parameter updates by approximately 25%.
Maintains competitive performance across tasks.
Adapts readily to existing parameter-efficient fine-tuning methods.
Abstract
Fine-tuning is a promising technique for leveraging Transformer-based language models in downstream tasks. As model sizes continue to grow, updating all model parameters becomes increasingly costly. Parameter-efficient fine-tuning methods address this issue by selectively updating a small subset of parameters. However, fine-tuning and most existing parameter-efficient fine-tuning methods update the same number of parameters throughout training, ignoring the unequal contribution of different Transformer blocks and leading to an extremely inefficient allocation of computing resources. In this paper, we propose Progtuning, a novel fine-tuning framework combined with progressive learning for Transformer-based language models. Specifically, Progtuning progressively reduces the number of updated Transformer blocks based on their contribution. Remarkably, Progtuning optimizes…
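To make the idea of progressively shrinking the set of updated blocks concrete, here is a minimal sketch of one possible schedule. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which blocks stop being updated first or how stages are sized, so the sketch assumes that lower blocks (index 0 upward) are dropped first, in equal steps. The function names `progressive_schedule` and `update_fraction` are hypothetical.

```python
from typing import List


def progressive_schedule(num_blocks: int, num_stages: int) -> List[List[int]]:
    """Return, for each training stage, the indices of the blocks that are updated.

    Assumptions (not taken from the abstract): the first stage updates every
    block, and each later stage stops updating the next `step` lowest blocks.
    """
    if num_blocks < 1 or not 1 <= num_stages <= num_blocks:
        raise ValueError("need 1 <= num_stages <= num_blocks")
    step = num_blocks // num_stages  # blocks dropped per stage
    schedule = []
    for stage in range(num_stages):
        first_trainable = stage * step
        schedule.append(list(range(first_trainable, num_blocks)))
    return schedule


def update_fraction(schedule: List[List[int]], num_blocks: int) -> float:
    """Fraction of block updates performed relative to updating all blocks
    in every stage (assumes equal-length stages and equal-size blocks)."""
    updated = sum(len(stage_blocks) for stage_blocks in schedule)
    return updated / (len(schedule) * num_blocks)


if __name__ == "__main__":
    schedule = progressive_schedule(num_blocks=12, num_stages=4)
    print([len(s) for s in schedule])     # [12, 9, 6, 3]
    print(update_fraction(schedule, 12))  # 0.625
```

In this toy schedule, 62.5% of the block updates remain, a saving of 37.5%. That number is a property of the assumed schedule only; the roughly 25% reduction reported for Progtuning depends on the paper's actual schedule and on how parameters are distributed across the model, neither of which is given in the abstract.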
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Layer Normalization · Dropout · Absolute Position Encodings · Dense Connections · Byte Pair Encoding · Softmax · Label Smoothing · Transformer
