Linear Chain Transformation: Expanding Optimization Dynamics for Fine-Tuning Large Language Models

Yulong Wang; Chang Zuo; Yin Xuan; Hong Li; Ni Wei

arXiv:2411.00039·cs.CL·November 4, 2024


Open Access

TL;DR

This paper introduces Linear Chain Transformation (LinChain), a fine-tuning method for large language models that inserts a sequence of linear transformations into the parameter update process. The authors report that this enriches optimization dynamics and yields better performance and generalization.

Contribution

LinChain is a new approach that incorporates multiple linear transformations into fine-tuning, expanding optimization paths and improving task-specific learning in LLMs.

Findings

01. Significantly improves LLM fine-tuning performance

02. Enhances model generalization and task adaptation

03. Maintains inference efficiency

Abstract

Fine-tuning large language models (LLMs) has become essential for adapting pretrained models to specific downstream tasks. In this paper, we propose Linear Chain Transformation (LinChain), a novel approach that introduces a sequence of linear transformations during fine-tuning to enrich optimization dynamics. By incorporating multiple linear transformations into the parameter update process, LinChain expands the effective rank of updates and enhances the model's ability to learn complex task-specific representations. We demonstrate that this method significantly improves the performance of LLM fine-tuning over state-of-the-art methods by providing more flexible optimization paths during training, while maintaining the inference efficiency of the resulting model. Our experiments on various benchmark tasks show that LinChain leads to better generalization, fewer learnable parameters, and…
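The abstract does not say exactly where the chain of linear transformations sits, so the following is only a minimal sketch under stated assumptions: the update is taken to be a LoRA-style low-rank product with small square matrices placed between the two factors, i.e. `delta_W = B @ T_k @ ... @ T_1 @ A`. The names `A`, `B`, `chain`, `W0`, and all helper functions are hypothetical and do not come from the paper. The sketch illustrates one point the abstract does make: because every added step is linear, the chain can be collapsed into a single weight matrix after training, so the inference cost matches that of the original layer.

```python
import random

# Plain-Python matrix helpers (matrices are lists of rows).
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def chained_update(B, chain, A):
    """Assumed form of the update: B @ T_k @ ... @ T_1 @ A."""
    M = A
    for T in chain:          # apply T_1 first, T_k last
        M = matmul(T, M)
    return matmul(B, M)

def forward_training(x, W0, B, chain, A):
    """Training-time path: frozen W0 plus the unmerged chain."""
    h = matmul(x, transpose(A))
    for T in chain:
        h = matmul(h, transpose(T))
    return add(matmul(x, transpose(W0)), matmul(h, transpose(B)))

def merge(W0, B, chain, A):
    """Collapse the chain into one matrix for inference."""
    return add(W0, chained_update(B, chain, A))

rng = random.Random(0)
d_in, d_out, r, n = 6, 5, 2, 3            # illustrative sizes only
W0 = rand_matrix(d_out, d_in, rng)        # frozen pretrained weight
A = rand_matrix(r, d_in, rng)             # down-projection (assumed)
B = rand_matrix(d_out, r, rng)            # up-projection (assumed)
chain = [rand_matrix(r, r, rng) for _ in range(2)]  # intermediate linear maps
x = rand_matrix(n, d_in, rng)

y_train = forward_training(x, W0, B, chain, A)
y_merged = matmul(x, transpose(merge(W0, B, chain, A)))
max_gap = max(abs(a - b) for ra, rb in zip(y_train, y_merged) for a, b in zip(ra, rb))
```

In this sketch `max_gap` measures the difference between the unmerged training path and the merged single-matrix path; the two agree up to floating-point rounding, which is why a chain of this kind adds no layers at inference time.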


Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling