From Generic Correlation to Input-Specific Credit in On-Policy Self Distillation

Guobin Shen; Lei Huang; Xiang Cheng; Chenxiao Zhao; Jindong Li; Dongcheng Zhao; Xing Yu

arXiv:2605.11613 · cs.LG · May 13, 2026

TL;DR

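This paper analyzes the token-level reward in on-policy self-distillation of language models and shows that its trajectory sum equals the pointwise mutual information between the response and the feedback given the input. It then proposes CREDIT, a method that isolates input-specific credit and improves performance across benchmarks.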

Contribution

The paper gives a Bayesian filtering interpretation of the self-distillation reward and introduces CREDIT, a method that focuses the reward on input-specific information to improve model performance.

Findings

01. CREDIT isolates the input-specific component of the reward.

02. CREDIT achieves strong performance across multiple benchmarks.

03. The token-level reward is a Bayesian filtering increment whose trajectory sum equals the pointwise mutual information between the response and the feedback given the input.
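
The filtering-increment reading can be written out as a short worked equation. The notation is an assumption made for illustration and is not taken from the paper: $x$ is the input, $y = (y_1, \dots, y_T)$ the response, $f$ the feedback, and $\pi$ the model. The token reward is assumed to be the log-ratio between the feedback-conditioned teacher and the unconditioned student, and posterior compatibility is assumed to mean that $\pi(\cdot \mid x, f, y_{<t})$ equals the Bayesian posterior under the model's own distribution $p$.

$$
\begin{aligned}
r_t &= \log \pi(y_t \mid x, f, y_{<t}) - \log \pi(y_t \mid x, y_{<t}) \\
    &= \log p(f \mid x, y_{\le t}) - \log p(f \mid x, y_{<t}) && \text{(Bayes' rule)} \\
\sum_{t=1}^{T} r_t &= \log p(f \mid x, y) - \log p(f \mid x) && \text{(telescoping sum)} \\
    &= \log \frac{p(y, f \mid x)}{p(y \mid x)\, p(f \mid x)} = \mathrm{pMI}(y; f \mid x)
\end{aligned}
$$

Under these assumptions each token's reward is the change in the log-belief about the feedback after observing that token, and the increments sum to the pointwise mutual information.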

Abstract

On-policy self-distillation has emerged as a promising paradigm for post-training language models, in which the model conditions on environment feedback to serve as its own teacher, providing dense token-level rewards without external teacher models or step-level annotations. Despite its empirical success, what this reward actually measures and what kind of credit it assigns remain unclear. Under a posterior-compatibility interpretation of feedback conditioning, standard in the implicit-reward literature, we show that the self-distillation token reward is a Bayesian filtering increment whose trajectory sum is exactly the pointwise mutual information (pMI) between the response and the feedback given the input. This pMI can be raised by input-specific reasoning or by input-generic shortcuts, so we further decompose the teacher log-probability along the input axis. Based on this analysis, we…
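
The identity stated in the abstract can be checked numerically on a toy distribution. The sketch below is an illustration under stated assumptions, not the paper's implementation: the joint table `JOINT`, the helper names, and the reward form (teacher log-probability minus student log-probability, with the teacher taken to be the exact Bayesian posterior) are all hypothetical. It fixes a single input, so the conditioning on the input is implicit, and it does not attempt the input-axis decomposition or CREDIT itself.

```python
import math
from itertools import product

# Hypothetical toy joint p(f, y1, y2 | x) for one fixed input x:
# binary feedback f and a two-token binary response (y1, y2).
VALUES = [0.10, 0.05, 0.15, 0.20, 0.20, 0.10, 0.05, 0.15]
JOINT = dict(zip(product([0, 1], repeat=3), VALUES))  # key: (f, y1, y2)


def prob(f=None, prefix=()):
    """Probability of a response prefix, jointly with feedback f if given."""
    total = 0.0
    for (fv, *ys), p in JOINT.items():
        if f is not None and fv != f:
            continue
        if tuple(ys[:len(prefix)]) == tuple(prefix):
            total += p
    return total


def token_reward(f, y, t):
    """Assumed reward: log teacher(y_t | f, y_<t) - log student(y_t | y_<t)."""
    teacher = prob(f, y[:t + 1]) / prob(f, y[:t])
    student = prob(None, y[:t + 1]) / prob(None, y[:t])
    return math.log(teacher) - math.log(student)


def filtering_increment(f, y, t):
    """Change in log-belief about f: log p(f | y_<=t) - log p(f | y_<t)."""
    after = prob(f, y[:t + 1]) / prob(None, y[:t + 1])
    before = prob(f, y[:t]) / prob(None, y[:t])
    return math.log(after) - math.log(before)


def pmi(f, y):
    """Pointwise mutual information log p(y, f) / (p(y) p(f)), given x."""
    return math.log(prob(f, y) / (prob(None, y) * prob(f, ())))


if __name__ == "__main__":
    for f, y1, y2 in JOINT:
        y = (y1, y2)
        rewards = [token_reward(f, y, t) for t in range(len(y))]
        print(f, y, [round(r, 4) for r in rewards], round(pmi(f, y), 4))
```

In this toy setting each per-token reward matches the corresponding filtering increment, and the rewards along a response sum to its pMI with the feedback. With a real language model the teacher is only approximately a posterior, so the equality would hold only to the extent that posterior compatibility does.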
