Logits-Based Finetuning
Jingyao Li, Senqiao Yang, Sitong Wu, Han Shi, Chuanyang Zheng, Hong Xu, Jiaya Jia

TL;DR
This paper introduces a logits-based fine-tuning method for large language models that combines teacher logits with ground-truth labels, improving accuracy and linguistic diversity over traditional supervised fine-tuning.
Contribution
The paper proposes a logits-based fine-tuning framework that builds enriched training targets by combining teacher logits with ground-truth labels, so that the targets preserve both correctness and linguistic diversity.
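As a rough illustration of what combining teacher logits with ground-truth labels could look like for a single token position, here is a minimal sketch. The summary does not give the paper's actual combination rule, so the linear interpolation used here, the function names (`enriched_target`, `soft_cross_entropy`), and the parameters `alpha` and `temperature` are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def enriched_target(teacher_logits, label, alpha=0.5, temperature=1.0):
    """Blend the teacher distribution with the one-hot ground-truth label.

    ASSUMPTION: linear interpolation with weight `alpha` on the label.
    The paper's real rule is not stated in this summary.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    return [
        alpha * (1.0 if i == label else 0.0) + (1.0 - alpha) * p
        for i, p in enumerate(teacher_probs)
    ]


def soft_cross_entropy(student_logits, target):
    """Cross-entropy between a soft target and the student's softmax."""
    m = max(student_logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in student_logits))
    return -sum(t * (x - log_z) for t, x in zip(target, student_logits))


# Toy vocabulary of 3 tokens; the teacher prefers token 0,
# but the ground-truth label is token 1.
target = enriched_target([2.0, 1.5, -1.0], label=1, alpha=0.5)
loss = soft_cross_entropy([0.3, 0.9, -0.2], target)
```

Under this assumed rule, setting `alpha=1.0` recovers the ordinary one-hot target of supervised fine-tuning, and any `alpha >= 0.5` keeps the ground-truth token as the most probable entry of the target while the remaining mass follows the teacher's distribution over alternatives.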
Findings
Achieved an 18% accuracy gain on Mawps.
Achieved a 22.7% accuracy gain on TabMWP.
Outperformed prior supervised fine-tuning models across nine benchmarks.
Abstract
In recent years, developing compact and efficient large language models (LLMs) has emerged as a thriving area of research. Traditional Supervised Fine-Tuning (SFT), which relies on singular ground truth labels, often fails to capture token-level dependencies and linguistic diversity. To address these limitations, we propose a logits-based fine-tuning framework that integrates the strengths of supervised learning and knowledge distillation. Our approach constructs enriched training targets by combining teacher logits with ground truth labels, preserving both correctness and linguistic diversity. This ensures more reliable and effective training. We constructed a large-scale 1.2M logits dataset and trained a series of science-focused models. Experimental results demonstrate that our method achieves significant improvements, with accuracy gains of 18% on Mawps and 22.7% on TabMWP. Across…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification
