Annotations Mitigate Post-Training Mode Collapse

Jacob Mitchell Springer; Madhu Advani; Lukas Aichberger; Arwen Bradley; Eran Malach; Omid Saremi; Sinead Williamson; Preetum Nakkiran; Etai Littwin; and Aditi Raghunathan

arXiv:2605.09995 · cs.CL · May 12, 2026

TL;DR

The paper introduces annotation-anchored training, a method that prevents semantic mode collapse in post-trained models by preserving the diversity of the pretraining distribution through annotated data, yielding more diverse and faithful instruction-following.

Contribution

The paper proposes annotation-anchored training, an approach that maintains pretraining diversity during post-training, reducing mode collapse while keeping the instruction-following behavior that post-training provides.

Findings

01. Models with annotation-anchored training achieve 6x less diversity collapse.

02. The method scales effectively, improving with larger models.

03. Annotation-anchored training preserves semantic richness during post-training.

Abstract

Post-training (via supervised fine-tuning) improves instruction-following, but often induces semantic mode collapse by biasing models toward low-entropy fine-tuning data at the expense of the high-entropy pretraining distribution. Crucially, we find this trade-off worsens with scale. To close this semantic diversity gap, we propose annotation-anchored training, a principled method that enables models to adopt the preference-following behaviors of post-training without sacrificing the inherent diversity of pretraining. Our approach is simple: we pretrain on documents paired with semantic annotations, inducing a rich annotation distribution that reflects the full breadth of pretraining data, and we preserve this distribution during post-training. This lets us sample diverse annotations at inference time and use them as anchors to guide generation, effectively transferring pretraining's…
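The two-stage idea described in the abstract (sample an annotation first, then generate conditioned on it) can be illustrated with a toy sketch. The abstract is truncated and gives no implementation details, so everything below is an assumption made for illustration: the annotation labels, the probabilities, and the helper names `sample_annotation`, `generate`, and `anchored_sample` are hypothetical, and `generate` is a stand-in string template rather than a real model. The sketch only shows why drawing the anchor from a broad annotation distribution keeps output diversity high even when direct generation would be concentrated on one mode.

```python
import math
import random

# Hypothetical toy distributions over semantic annotations.
# None of these labels or numbers come from the paper.
ANNOTATION_PRIOR = {"recipe": 0.25, "poem": 0.25, "news": 0.25, "tutorial": 0.25}
COLLAPSED_DIRECT = {"recipe": 0.91, "poem": 0.03, "news": 0.03, "tutorial": 0.03}


def entropy_bits(dist):
    """Shannon entropy (in bits) of a discrete distribution given as a dict."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def sample_annotation(rng, prior):
    """Stage 1: draw one annotation from the (preserved) annotation distribution."""
    labels = list(prior)
    weights = [prior[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def generate(annotation, prompt):
    """Stage 2 stand-in: a real system would call a model conditioned on the annotation."""
    return f"[{annotation}] response to: {prompt}"


def anchored_sample(rng, prompt, prior=ANNOTATION_PRIOR):
    """Sample an annotation, then generate a response anchored on it."""
    annotation = sample_annotation(rng, prior)
    return annotation, generate(annotation, prompt)


if __name__ == "__main__":
    rng = random.Random(0)
    print("anchored entropy (bits):", entropy_bits(ANNOTATION_PRIOR))
    print("collapsed entropy (bits):", round(entropy_bits(COLLAPSED_DIRECT), 3))
    for _ in range(3):
        print(anchored_sample(rng, "Write something about autumn."))
```

In this toy setting the semantic category of each response is determined entirely by the sampled annotation, so the entropy of the annotation distribution bounds the category-level diversity of the outputs. How the paper actually measures diversity or conditions generation is not stated in the text above.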
