TL;DR
This paper introduces a complexity-aware fine-tuning method for large language models that selectively applies reasoning to complex data, improving efficiency and accuracy over standard approaches.
Contribution
It proposes a novel approach that uses entropy to identify complex data, enabling more efficient fine-tuning with less data and better performance.
Findings
Outperforms standard supervised fine-tuning in accuracy.
Uses 81% less data than traditional methods.
Effectively distinguishes data complexity with entropy.
Abstract
General-purpose Large Language Models (LLMs) are frequently fine-tuned through supervised fine-tuning (SFT) to enhance performance in specific domains. Better results can be achieved by distilling the chain-of-thought of a larger model at the cost of numerous expensive calls and a much greater amount of data. We propose a novel blueprint for efficient fine-tuning that uses reasoning only for complex data identified by entropy. Specifically, across three small open models () we split the training data into complexity categories by a single token answer entropy (ROC AUC ), fine-tune large language models (LLMs) via SFT and distillation, and show that our pipeline significantly outperforms the standard SFT approach ( vs average accuracy) and outperforms the distillation approach ( vs average accuracy) while using less data.
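The entropy-based split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `answer_probs` field, the three bucket labels, and the two thresholds are hypothetical, chosen only to show how the entropy of a model's single-token answer distribution could route examples to plain SFT or to reasoning distillation.

```python
import math


def answer_entropy(probs):
    """Shannon entropy (in nats) of a distribution over candidate answer tokens."""
    total = sum(probs)
    normalized = [p / total for p in probs]  # guard against unnormalized input
    return -sum(p * math.log(p) for p in normalized if p > 0)


def split_by_entropy(examples, low, high):
    """Bucket examples by answer entropy.

    `low` and `high` are hypothetical thresholds; in practice they would be
    tuned on held-out data rather than fixed by hand.
    """
    buckets = {"easy": [], "medium": [], "hard": []}
    for ex in examples:
        h = answer_entropy(ex["answer_probs"])
        if h < low:
            buckets["easy"].append(ex)
        elif h < high:
            buckets["medium"].append(ex)
        else:
            buckets["hard"].append(ex)
    return buckets


# Toy multiple-choice examples with assumed answer-token probabilities.
examples = [
    {"id": "q1", "answer_probs": [0.97, 0.01, 0.01, 0.01]},  # confident model
    {"id": "q2", "answer_probs": [0.70, 0.10, 0.10, 0.10]},  # somewhat unsure
    {"id": "q3", "answer_probs": [0.25, 0.25, 0.25, 0.25]},  # maximally unsure
]

buckets = split_by_entropy(examples, low=0.5, high=1.2)

# Assumed routing: only the high-entropy bucket receives chain-of-thought
# distillation; the rest is trained with plain SFT on answers alone.
training_plan = {
    "sft": buckets["easy"] + buckets["medium"],
    "distillation": buckets["hard"],
}
```

In this sketch only `q3` would require calls to a larger teacher model, which is the intuition behind the reported data savings: the expensive reasoning traces are requested only where the small model is uncertain.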
Taxonomy
Methods: Shrink and Fine-Tune
