Labeling supervised fine-tuning data with the scaling law

Huanjun Kong

arXiv:2405.02817·cs.CL·August 19, 2024

TL;DR

This paper presents a method for creating high-quality supervised fine-tuning data for large language models using the scaling law, demonstrating improved performance on NLP tasks with limited resources.

Contribution

The paper introduces a data annotation approach based on the scaling law and releases a set of fine-tuning data and LoRA weights for LLMs as open-source resources.

Findings

01

The best fine-tuned model improved the F1 score by 29.07

02

Validated the effectiveness of scaling law-based data annotation

03

Demonstrated viability of LLM fine-tuning for downstream NLP tasks

Abstract

This paper introduces a multi-stage manual annotation process calibrated by the scaling law, offering a method for acquiring high-quality Supervised Fine-Tuning (SFT) data in resource-constrained environments, such as those with few GPUs, limited GPT access, or restricted funding. We preprocessed 58k entries of authentic chat data and manually annotated 2.3k questions. We then fine-tuned Qwen models ranging from 0.5B to 32B parameters. The best version improved the F1 score by 29.07, which confirms the viability of fine-tuning a Large Language Model (LLM) for downstream Natural Language Processing (NLP) tasks. Our contributions are: 1) we created SFT training data in alpaca format, along with a set of Low-Rank Adaptation (LoRA) weights, and 2) we developed a method for acquiring high-quality data by leveraging the scaling law principle. The script, raw data with alpaca format and…
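For readers unfamiliar with the alpaca format mentioned in the abstract, the following is a minimal sketch of what such SFT records look like. The three keys (`instruction`, `input`, `output`) are the standard alpaca fields; the helper name `to_alpaca_record` and the example texts are hypothetical and are not taken from the paper's released data.

```python
import json


def to_alpaca_record(instruction: str, output: str, input_text: str = "") -> dict:
    """Build one SFT record using the three standard alpaca keys."""
    return {"instruction": instruction, "input": input_text, "output": output}


# Hypothetical annotated examples (instruction, input, label).
# These are illustrative only and do not come from the paper's dataset.
examples = [
    ("Is the following chat message a question? Answer yes or no.",
     "How do I install the package?", "yes"),
    ("Is the following chat message a question? Answer yes or no.",
     "Thanks, that worked.", "no"),
]

records = [to_alpaca_record(instr, label, msg) for instr, msg, label in examples]

# Alpaca-style datasets are commonly stored as a JSON list of such records.
serialized = json.dumps(records, ensure_ascii=False, indent=2)
print(serialized)
```

A JSON list of this shape is what many fine-tuning toolkits accept as an alpaca-style dataset; the exact loader configuration depends on the toolkit and is not specified here.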

Taxonomy

Topics: Opinion Dynamics and Social Influence · Speech Recognition and Synthesis · Speech and dialogue systems

Methods: Attention Is All You Need · Sparse Evolutionary Training · Cosine Annealing · Softmax · Linear Layer · Attention Dropout · Dropout · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning