Labeling supervised fine-tuning data with the scaling law
Huanjun Kong

TL;DR
This paper presents a method for creating high-quality supervised fine-tuning data for large language models using the scaling law, demonstrating improved performance on NLP tasks with limited resources.
Contribution
It introduces a novel data annotation approach based on the scaling law and releases a set of fine-tuning data and LoRA weights for LLMs as open-source resources.
Findings
The best fine-tuned model improved the F1 score by 29.07
Validated the effectiveness of scaling law-based data annotation
Demonstrated viability of LLM fine-tuning for downstream NLP tasks
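For readers unfamiliar with the metric behind the 29.07 figure, the following is a minimal sketch of how an F1 score is computed from predictions and gold labels. It assumes a binary labeling task; this summary does not state whether the paper uses binary or averaged F1, or on what scale, and the function name and toy data are hypothetical.

```python
def f1_score(predictions, labels):
    """Binary F1: the harmonic mean of precision and recall."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p and y)          # true positives
    fp = sum(1 for p, y in pairs if p and not y)      # false positives
    fn = sum(1 for p, y in pairs if not p and y)      # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical, not taken from the paper).
labels = [True, True, False, True, False]
baseline = [True, False, True, False, False]
finetuned = [True, True, False, True, True]

baseline_f1 = f1_score(baseline, labels)
finetuned_f1 = f1_score(finetuned, labels)
print(f"baseline F1:  {baseline_f1:.4f}")
print(f"finetuned F1: {finetuned_f1:.4f}")
print(f"improvement:  {finetuned_f1 - baseline_f1:.4f}")
```

An improvement is then the difference between the fine-tuned and baseline scores on the same evaluation set.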
Abstract
This paper introduces a multi-stage manual annotation process calibrated by the scaling law, offering a way to acquire high-quality Supervised Fine-Tuning (SFT) data in resource-constrained environments: few GPUs, limited GPT access, and restricted funding. We preprocessed 58k items of authentic chat data and manually annotated 2.3k questions. We then fine-tuned Qwen models ranging from 0.5B to 32B parameters. The best version improved the F1 score by 29.07, which confirms the viability of fine-tuning a Large Language Model (LLM) for downstream Natural Language Processing (NLP) tasks. Our contributions are: 1) SFT training data in alpaca format, along with a set of Low-Rank Adaptation (LoRA) weights, and 2) a method for acquiring high-quality data that leverages the scaling-law principle. The script, raw data with alpaca format and…
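As an illustration of the alpaca format mentioned in the abstract, here is a minimal sketch of how annotated examples could be written out as SFT records. The `instruction`, `input`, and `output` keys follow the widely used Alpaca convention; the helper name, the task wording, and the sample messages are assumptions and are not taken from the paper's released data.

```python
import json


def to_alpaca_record(instruction, input_text, output):
    """Build one record with the three keys of the Alpaca convention."""
    return {"instruction": instruction, "input": input_text, "output": output}


# Hypothetical annotation task and samples, for illustration only.
INSTRUCTION = "Decide whether the following chat message is a question."

records = [
    to_alpaca_record(INSTRUCTION, "How do I install the package on Windows?", "yes"),
    to_alpaca_record(INSTRUCTION, "Thanks, that worked for me.", "no"),
]

# ensure_ascii=False keeps non-ASCII chat text readable in the output.
serialized = json.dumps(records, ensure_ascii=False, indent=2)
print(serialized)
```

Keeping every record to the same three keys means the file can be loaded by any tool that expects Alpaca-style data without a custom parser.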
Taxonomy
Topics: Opinion Dynamics and Social Influence · Speech Recognition and Synthesis · Speech and dialogue systems
Methods: Attention Is All You Need · Sparse Evolutionary Training · Cosine Annealing · Softmax · Linear Layer · Attention Dropout · Dropout · Linear Warmup With Cosine Annealing · Discriminative Fine-Tuning
