TL;DR
This paper investigates where output diversity collapses in post-trained language models. It finds that the collapse is driven mainly by training data composition and cannot be fixed at inference time alone.
Contribution
It disentangles the effects of training data and generation format on diversity collapse, providing a detailed analysis across multiple post-training methods and tasks.
Findings
Diversity collapse varies with data composition and training lineage.
Suppressing reasoning at inference lowers accuracy but does not restore diversity.
Diversity loss is embedded in model weights, not just generation format.
Abstract
Post-trained language models produce less varied outputs than their base counterparts. This output diversity collapse undermines inference-time scaling methods that rely on varied samples, and risks homogenizing model outputs on creative and value-laden tasks. Prior work attributes collapse to specific post-training methods, without separating the role of training data composition from the method, or the generation format from the model weights. We trace output diversity through three parallel post-training lineages of Olmo 3: Think (chain-of-thought distillation), Instruct (broad multi-source data), and RL-Zero, across 15 tasks and four text diversity metrics. We find that the location of collapse co-varies with data composition: the Think lineage loses most semantic diversity at supervised fine-tuning, and the effect of DPO is larger in Instruct than in Think. Suppressing…
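The text shown here does not name the four diversity metrics. As an illustration only, the sketch below computes two common lexical diversity measures over a set of samples drawn for one prompt: distinct-n and mean pairwise Jaccard distance. Both metric choices, the whitespace tokenization, the function names, and the toy samples are assumptions for this example, not the paper's setup.

```python
from itertools import combinations


def distinct_n(samples, n=2):
    """Unique n-grams divided by total n-grams, pooled over all samples."""
    ngrams = []
    for text in samples:
        tokens = text.lower().split()  # assumed tokenization: whitespace
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


def mean_pairwise_jaccard_distance(samples):
    """Average (1 - Jaccard similarity) over all pairs of sample token sets."""
    token_sets = [set(text.lower().split()) for text in samples]
    distances = []
    for a, b in combinations(token_sets, 2):
        union = a | b
        distances.append(1.0 - len(a & b) / len(union) if union else 0.0)
    return sum(distances) / len(distances) if distances else 0.0


# Toy samples (hypothetical): one collapsed set, one varied set.
collapsed = ["the cat sat on the mat"] * 4
varied = [
    "the cat sat on the mat",
    "a dog ran across the yard",
    "birds sing at dawn",
    "rain fell all night long",
]

for name, samples in [("collapsed", collapsed), ("varied", varied)]:
    print(name, distinct_n(samples), mean_pairwise_jaccard_distance(samples))
```

Lexical measures like these capture only surface variation. The semantic diversity the abstract refers to would typically need embedding-based or answer-level comparisons instead.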
