60 Data Points are Sufficient to Fine-Tune LLMs for Question-Answering

Junjie Ye; Yuming Yang; Qi Zhang; Tao Gui; Xuanjing Huang; Peng Wang; Zhongchao Shi; Jianping Fan

arXiv:2409.15825 · cs.CL · January 22, 2025

TL;DR

This paper shows that as few as 60 data points are enough to fine-tune large language models for question answering, and that both the choice of fine-tuning data and the specific model being tuned affect the result.

Contribution

The paper provides empirical evidence that as few as 60 data points suffice to fine-tune LLMs for QA, and analyzes how the memory level of the fine-tuning data (how much of it the pre-trained model has already memorized) influences performance across different models.

Findings

01. 60 data points suffice for activating pre-trained knowledge
02. Dataset memory levels significantly affect model performance
03. Optimal fine-tuning data varies across different LLMs

Abstract

Large language models (LLMs) encode extensive world knowledge through pre-training on massive datasets, which can then be fine-tuned for the question-answering (QA) task. However, effective strategies for fine-tuning LLMs for the QA task remain largely unexplored. To address this gap, we categorize supervised fine-tuning (SFT) data based on the extent of knowledge memorized by the pretrained LLMs and conduct a series of empirical analyses. Our experiments, involving four LLMs from three different model families, focus on three key factors: the amount of data required for SFT, the impact of different SFT datasets on model performance, and how data requirements vary across LLMs. The results show that as few as 60 data points during the SFT stage can activate the knowledge encoded during pre-training, enabling LLMs to perform the QA task. Additionally, SFT with data of varying memory…
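The idea of grouping SFT data by how well the pre-trained model already knows it can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the paper's procedure: the scoring rule (fraction of sampled model answers that contain the gold answer), the choice of five levels, the example fields `answer` and `sampled_answers`, and the function names `memory_score`, `memory_level`, and `select_sft_subset` are all hypothetical.

```python
import random
from collections import defaultdict


def memory_score(sampled_answers, gold):
    """Fraction of sampled model answers that contain the gold answer.

    Assumed scoring rule: case-insensitive substring match.
    """
    if not sampled_answers:
        return 0.0
    gold_norm = gold.strip().lower()
    hits = sum(gold_norm in a.strip().lower() for a in sampled_answers)
    return hits / len(sampled_answers)


def memory_level(score, num_levels=5):
    """Map a score in [0, 1] to an integer bucket 0..num_levels-1.

    The number of levels is an assumption for this sketch.
    """
    return min(int(score * num_levels), num_levels - 1)


def select_sft_subset(examples, level, k=60, seed=0):
    """Randomly pick up to k examples from a single memory level."""
    buckets = defaultdict(list)
    for ex in examples:
        score = memory_score(ex["sampled_answers"], ex["answer"])
        buckets[memory_level(score)].append(ex)
    pool = buckets[level]
    rng = random.Random(seed)
    return rng.sample(pool, min(k, len(pool)))
```

In use, each example would carry the gold answer plus a handful of answers sampled from the pre-trained model before any fine-tuning; `select_sft_subset(examples, level=4)` would then return up to 60 examples the model most reliably answers already. Which level works best is an empirical question, and per the abstract it can differ between models.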

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Speech and dialogue systems

Methods: Shrink and Fine-Tune · Focus