Distributional Clarity: The Hidden Driver of RL-Friendliness in Large Language Models
Shaoning Sun, Mingzhu Cai, Huang He, Bingjin Chen, Siqi Bao, Yujiu Yang, Hua Wu, Haifeng Wang

TL;DR
This paper identifies distributional clarity in probability space as a key factor in how much large language models benefit from reinforcement learning, and proposes a method to enhance that clarity, which improves model performance.
Contribution
It introduces the concept of distributional clarity, quantifies it using the Silhouette Coefficient, and develops a reweighting strategy to improve RL-friendliness in language models.
Findings
High distributional clarity correlates with better RL performance.
Low clarity is linked to logic errors and reasoning instability.
Reweighting low-clarity samples improves model performance on benchmarks.
Abstract
Language model families exhibit striking disparity in their capacity to benefit from reinforcement learning: under identical training, models like Qwen achieve substantial gains, while others like Llama yield limited improvements. Complementing data-centric approaches, we reveal that this disparity reflects a hidden structural property: distributional clarity in probability space. Through a three-stage analysis (from phenomenon to mechanism to interpretation), we uncover that RL-friendly models exhibit intra-class compactness and inter-class separation in their probability assignments to correct vs. incorrect responses. We quantify this clarity using the Silhouette Coefficient and demonstrate that (1) a high coefficient correlates strongly with RL performance; (2) a low coefficient is associated with severe logic errors and reasoning instability. To confirm this property, we introduce…
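The clarity measure described in the abstract can be illustrated with a minimal sketch of the standard Silhouette Coefficient applied to two classes of responses. This is not the authors' implementation: the function name `silhouette_two_class`, the use of one scalar score per response (for example a sequence-level log-probability), and the absolute-difference distance are all assumptions made here for illustration, since this summary does not specify the paper's exact features or distance.

```python
import numpy as np


def silhouette_two_class(values, is_correct):
    """Mean Silhouette Coefficient for scalar scores split into two classes.

    values:     one score per response (ASSUMPTION: e.g. a log-probability).
    is_correct: boolean flag per response (correct vs. incorrect).
    """
    x = np.asarray(values, dtype=float)
    y = np.asarray(is_correct, dtype=bool)
    if y.all() or not y.any():
        raise ValueError("need at least one correct and one incorrect response")

    # Pairwise absolute differences (ASSUMPTION: 1-D distance).
    dist = np.abs(x[:, None] - x[None, :])
    scores = np.zeros(len(x))
    for i in range(len(x)):
        same = y == y[i]
        n_same = int(same.sum())
        if n_same == 1:
            continue  # singleton class: silhouette is defined as 0
        # a: mean distance to the other members of the same class
        # (dist[i, i] is 0, so dividing by n_same - 1 excludes the point itself).
        a = dist[i, same].sum() / (n_same - 1)
        # b: mean distance to the members of the other class.
        b = dist[i, ~same].mean()
        denom = max(a, b)
        if denom > 0:
            scores[i] = (b - a) / denom
    return float(scores.mean())


# Hypothetical scores: well-separated classes vs. interleaved classes.
labels = [True, True, True, False, False, False]
clear = silhouette_two_class([-1.0, -1.2, -0.9, -5.0, -5.3, -4.8], labels)
mixed = silhouette_two_class([-1.0, -3.0, -5.0, -1.5, -3.5, -4.5], labels)
print(f"clear: {clear:.3f}  mixed: {mixed:.3f}")
```

Values near 1 indicate compact, well-separated classes; values near 0 or below indicate that scores for correct and incorrect responses overlap. The example scores are made up and carry no empirical meaning.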
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
