How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition

Guanting Dong; Hongyi Yuan; Keming Lu; Chengpeng Li; Mingfeng Xue; Dayiheng Liu; Wei Wang; Zheng Yuan; Chang Zhou; Jingren Zhou

arXiv:2310.05492·cs.CL·June 10, 2024·5 cites

TL;DR

This paper investigates how supervised fine-tuning data composition affects large language models' abilities across math, code, and general tasks, revealing different scaling behaviors and proposing a new multi-skill learning strategy.

Contribution

It provides a detailed analysis of data composition effects on multiple abilities and introduces the Dual-stage Mixed Fine-tuning strategy to mitigate forgetting.
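
A minimal sketch of how a two-stage data schedule of this kind could be assembled. It assumes, since the summary above does not spell it out, that the first stage trains on specialized data (math and code) and the second stage trains on general data mixed with a small fraction `k` of the specialized data. The function name `dual_stage_schedule`, its arguments, and the toy examples are hypothetical and are not taken from the authors' repositories.

```python
import random


def dual_stage_schedule(specialized, general, k, seed=0):
    """Build the training sets for a two-stage mixed fine-tuning run.

    Assumption (not stated in the summary): stage 1 uses all specialized
    examples; stage 2 uses all general examples plus a fraction k of the
    specialized examples, intended to limit forgetting of stage-1 skills.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must be in [0, 1]")
    rng = random.Random(seed)

    # Stage 1: the full specialized set (e.g. math and code examples).
    stage1 = list(specialized)

    # Stage 2: general data plus a random subset of specialized data.
    n_replay = round(k * len(stage1))
    replay = rng.sample(stage1, n_replay)
    stage2 = list(general) + replay
    rng.shuffle(stage2)
    return stage1, stage2


# Toy usage with placeholder example identifiers.
specialized = [f"math-{i}" for i in range(8)]
general = [f"general-{i}" for i in range(4)]
stage1, stage2 = dual_stage_schedule(specialized, general, k=0.25)
```

Setting `k=0` reduces the schedule to plain sequential training, and `k=1` mixes all specialized data into the second stage, so `k` is the knob that trades off retaining specialized skills against conflicts with general data.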

Findings

01

Mathematical reasoning and code generation improve with more data.

02

General abilities plateau after about a thousand samples.

03

Mixing data from different skills helps when data is limited but can cause performance conflicts when data is abundant.

Abstract

Large language models (LLMs) with enormous numbers of pre-training tokens and parameters exhibit diverse emergent abilities, including math reasoning, code generation, and instruction following. These abilities are further enhanced by supervised fine-tuning (SFT). While the open-source community has explored ad-hoc SFT for enhancing individual capabilities, proprietary LLMs exhibit versatility across various skills. Therefore, understanding how SFT facilitates multiple abilities is paramount. In this study, we specifically focus on the interplay of data composition between mathematical reasoning, code generation, and general human-aligning abilities during SFT. We propose four intriguing research questions to explore the association between model performance and various factors, including data amount, composition ratio, model size, and SFT strategies. Our experiments reveal that distinct…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science

Methods: Shrink and Fine-Tune · Focus