Probability-Entropy Calibration: An Elastic Indicator for Adaptive Fine-tuning

Wenhao Yu; Shaohang Wei; Jiahong Liu; Yifan Li; Minda Hu; Aiwei Liu; Hao Zhang; Irwin King

arXiv:2602.01745·cs.LG·February 3, 2026

TL;DR

This paper introduces a probability-entropy calibration method called RankTuner that improves fine-tuning of language models by better identifying truly under-learned tokens, leading to enhanced reasoning and code generation performance.

Contribution

The paper proposes the Relative Rank Indicator for adaptive token reweighting, combining probability and entropy to improve fine-tuning effectiveness.

Findings

01 Consistent improvements on mathematical reasoning benchmarks
02 Enhanced transfer performance on out-of-distribution reasoning
03 Better code generation results than baseline reweighting methods

Abstract

Token-level reweighting is a simple yet effective mechanism for controlling supervised fine-tuning, but common indicators are largely one-dimensional: the ground-truth probability reflects downstream alignment, while token entropy reflects intrinsic uncertainty induced by the pre-training prior. Ignoring entropy can misidentify noisy or easily replaceable tokens as learning-critical, while ignoring probability fails to reflect target-specific alignment. RankTuner introduces a probability-entropy calibration signal, the Relative Rank Indicator, which compares the rank of the ground-truth token with its expected rank under the prediction distribution. The inverse indicator is used as a token-wise Relative Scale to reweight the fine-tuning objective, focusing updates on truly under-learned tokens without over-penalizing intrinsically uncertain positions. Experiments on multiple backbones…
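The rank comparison described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the indicator is taken to be the ratio of expected rank to ground-truth rank, the Relative Scale is its inverse (ground-truth rank divided by expected rank), and the function names `ranks_from_probs`, `relative_scale`, and `reweighted_nll` are hypothetical. The paper's actual definitions may differ.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ranks_from_probs(probs):
    """Rank 1 is the most probable token; ties are broken by index order."""
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    ranks = [0] * len(probs)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks


def relative_scale(logits, target):
    """ASSUMED form: ground-truth rank divided by expected rank.

    The expected rank is the mean rank under the model's own prediction
    distribution, so it grows when the distribution is flat (high entropy).
    """
    probs = softmax(logits)
    ranks = ranks_from_probs(probs)
    expected_rank = sum(p * r for p, r in zip(probs, ranks))
    return ranks[target] / expected_rank


def reweighted_nll(logits_seq, targets):
    """Mean negative log-likelihood with each token weighted by its scale."""
    total = 0.0
    for logits, t in zip(logits_seq, targets):
        probs = softmax(logits)
        total += relative_scale(logits, t) * -math.log(probs[t])
    return total / len(targets)


# Same ground-truth rank (2) under a confident and a flat distribution.
confident = relative_scale([10.0, 0.0, 0.0, 0.0], target=1)
uncertain = relative_scale([0.0, 0.0, 0.0, 0.0], target=1)
print(confident > uncertain)
```

Under this assumed form, a ground-truth token ranked second receives a larger weight when the model is confidently wrong than when the distribution is flat, which is in the spirit of the abstract's aim of not over-penalizing intrinsically uncertain positions. In an actual training loop the weight would presumably be treated as a constant rather than differentiated through, but the abstract does not say so.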

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Machine Learning and Data Classification