Refine Large Language Model Fine-tuning via Instruction Vector
Gangwei Jiang, Zhaoyi Li, Defu Lian, Ying Wei

TL;DR
This paper introduces the Instruction Vector framework to analyze and mitigate knowledge forgetting in large language models during fine-tuning, by capturing instruction-related representations and guiding training to preserve original capabilities.
Contribution
The paper proposes the Instruction Vector (IV) framework and an IV-guided training method to understand and reduce catastrophic forgetting in LLM fine-tuning.
Findings
IVs effectively capture instruction-following capabilities
Fine-tuning adds reasoning patterns rather than erasing skills
IV-guided training improves retention of original capabilities
Abstract
Fine-tuning large language models (LLMs) can cause them to lose their general capabilities. However, the intrinsic mechanisms behind such forgetting remain unexplored. In this paper, we first examine this phenomenon with a focus on knowledge understanding and instruction following, and identify the latter as the main contributor to forgetting during fine-tuning. Consequently, we propose the Instruction Vector (IV) framework to capture model representations highly related to specific instruction-following capabilities, thereby making it possible to understand model-intrinsic forgetting. By analyzing IV dynamics before and after training, we suggest that fine-tuning mostly adds specialized reasoning patterns rather than erasing previous skills, and that this addition may appear as forgetting. Building on this insight, we develop IV-guided training, which aims to preserve original computation…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
