Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training
Anas Barakat, Souradip Chakraborty, Khushbu Pahwa, Amrit Singh Bedi

TL;DR
This paper investigates why optimizing for pass@k in large language models can degrade pass@1, and attributes the degradation to gradient conflicts caused by prompt interference.
Contribution
The paper provides a theoretical analysis of the gradient-conflict mechanism by which pass@k optimization can degrade pass@1, supported by experiments on mathematical reasoning tasks.
Findings
Pass@k optimization can conflict with pass@1 due to gradient interference.
Prompt reweighting toward low-success prompts causes negative interference.
Theoretical characterization explains the trade-off observed in practice.
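The reweighting in the second finding can be illustrated with a standard identity. This is only a sketch under an assumption: the k samples for a prompt x are drawn independently, each passing with probability p_θ(x). The symbols x, θ, and p_θ are introduced here for illustration and are not taken from the paper, whose exact formulation may differ.

```latex
\[
\mathrm{pass@}k(x) = 1 - \bigl(1 - p_\theta(x)\bigr)^{k},
\qquad
\nabla_\theta\, \mathrm{pass@}k(x)
  = k\,\bigl(1 - p_\theta(x)\bigr)^{k-1}\, \nabla_\theta\, p_\theta(x).
\]
```

For k > 1 the factor k(1 - p_θ(x))^{k-1} is largest when p_θ(x) is small and vanishes as p_θ(x) approaches 1. Relative to the pass@1 gradient, the pass@k gradient therefore places more weight on low-success prompts.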
Abstract
Pass@k is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reasoning. It defines success if any of k independently sampled solutions passes a verifier. This multi-sample inference metric has motivated inference-aware fine-tuning methods that directly optimize pass@k. However, prior work reports a recurring trade-off: pass@k improves while pass@1 degrades under such methods. This trade-off is practically important because pass@1 often remains a hard operational constraint due to latency and cost budgets, imperfect verifier coverage, and the need for a reliable single-shot fallback. We study the origin of this trade-off and provide a theoretical characterization of when pass@k policy optimization can reduce pass@1 through gradient conflict induced by prompt interference. We show that…
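To make the metric concrete, here is a minimal Python sketch of two common ways to compute pass@k for a single prompt. The function names are hypothetical. The second function uses the widely used unbiased estimator 1 - C(n-c, k)/C(n, k), computed from n samples of which c pass; the abstract does not say whether the paper evaluates with this estimator.

```python
from math import comb


def pass_at_k_exact(p: float, k: int) -> float:
    """Probability that at least one of k independent samples passes,
    assuming each sample passes with probability p."""
    if not 0.0 <= p <= 1.0 or k < 1:
        raise ValueError("need 0 <= p <= 1 and k >= 1")
    return 1.0 - (1.0 - p) ** k


def pass_at_k_estimate(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples of which c passed:
    1 - C(n - c, k) / C(n, k)."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed,
    # in which case every size-k subset contains a passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    print(pass_at_k_exact(0.5, 2))       # 0.75
    print(pass_at_k_estimate(4, 1, 1))   # 0.25
```

With k = 1 both functions reduce to the single-sample success rate, which is the pass@1 quantity the abstract describes as a hard operational constraint.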
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms
