Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models

Sonam Gupta; Yatin Nandwani; Asaf Yehudai; Mayank Mishra; Gaurav Pandey; Dinesh Raghu; Sachindra Joshi

arXiv:2409.04787·cs.CL·September 10, 2024

TL;DR

This paper proposes Selective Self-Rehearsal (SSR), a fine-tuning method for large language models that maintains high task performance while significantly improving their ability to generalize to new data, reducing overfitting.

Contribution

SSR fine-tunes on the model's own correct responses, which preserves generalization better than standard supervised fine-tuning while achieving comparable performance on the target task.

Findings

01. SSR reduces performance drop to around 2% on benchmarks.

02. SSR outperforms standard fine-tuning in generalization.

03. Experiments on unanswerable query detection demonstrate effectiveness.

Abstract

Fine-tuning Large Language Models (LLMs) on specific datasets is a common practice to improve performance on target tasks. However, this performance gain often comes with overfitting, where the model becomes too specialized in either the task or the characteristics of the training data, resulting in a loss of generalization. This paper introduces Selective Self-Rehearsal (SSR), a fine-tuning approach that achieves performance comparable to standard supervised fine-tuning (SFT) while improving generalization. SSR leverages the fact that there can be multiple valid responses to a query. By utilizing the model's correct responses, SSR reduces model specialization during the fine-tuning stage. SSR first identifies the correct model responses from the training set by deploying an appropriate LLM as a judge. Then, it fine-tunes the model using the correct model responses and the gold…
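The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Example`, `build_ssr_targets`, `generate`, and `judge` are hypothetical, and because the abstract is truncated above, the fallback to the gold response whenever the judge rejects the model's response is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    """One training sample: a query and its gold (reference) response."""
    query: str
    gold: str


def build_ssr_targets(
    examples: List[Example],
    generate: Callable[[str], str],
    judge: Callable[[str, str, str], bool],
) -> List[Tuple[str, str, str]]:
    """Choose a fine-tuning target for each training sample.

    generate(query) -> the base model's own response (hypothetical helper).
    judge(query, candidate, gold) -> True if an LLM judge accepts the
    candidate as correct (hypothetical helper).

    Returns (query, target, source) tuples, where source is "model" or "gold".
    """
    targets = []
    for ex in examples:
        candidate = generate(ex.query)
        if judge(ex.query, candidate, ex.gold):
            # Rehearse the model's own correct response.
            targets.append((ex.query, candidate, "model"))
        else:
            # ASSUMPTION: use the gold response when the model's response
            # is judged incorrect; the truncated abstract does not confirm this.
            targets.append((ex.query, ex.gold, "gold"))
    return targets
```

The resulting pairs would then be passed to an ordinary supervised fine-tuning loop. Keeping `generate` and `judge` as injected callables lets the selection logic be tested with simple stubs before wiring in real models.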

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Sparse Evolutionary Training · Shrink and Fine-Tune