Adaptive Stopping for Multi-Turn LLM Reasoning
Xiaofan Zhou, Huy Nguyen, Bo Yu, Chenxi Liu, Lu Cheng

TL;DR
This paper introduces MiCP, a conformal prediction framework for multi-turn LLM reasoning that guarantees coverage while optimizing for fewer turns and lower inference costs.
Contribution
MiCP is the first conformal prediction method designed for multi-turn reasoning, enabling adaptive stopping with formal coverage guarantees.
Findings
MiCP achieves target coverage on QA benchmarks.
MiCP reduces the number of reasoning turns and inference costs.
MiCP maintains prediction set validity while improving efficiency.
Abstract
Large Language Models (LLMs) increasingly rely on multi-turn reasoning and interaction, such as adaptive retrieval-augmented generation (RAG) and ReAct-style agents, to answer difficult questions. These methods improve accuracy by iteratively retrieving information, reasoning, or acting, but introduce a key challenge: When should the model stop? Existing approaches rely on heuristic stopping rules or fixed turn budgets and provide no formal guarantees that the final prediction still contains the correct answer. This limitation is particularly problematic in high-stakes domains such as finance and healthcare, where unnecessary turns increase cost and latency, while stopping too early risks incorrect decisions. Conformal prediction (CP) provides formal coverage guarantees, but existing LLM-CP methods only apply to a single model output and cannot handle multi-turn pipelines with…
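For readers unfamiliar with the coverage guarantee mentioned in the abstract, the following is a minimal sketch of standard split conformal prediction for a single model output. It is not MiCP and does not reproduce the paper's multi-turn procedure. The function names, the nonconformity scores, and the candidate answers are all hypothetical and chosen only for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the split-CP threshold: the ceil((n+1)(1-alpha))-th smallest
    calibration nonconformity score (hypothetical helper, not from the paper)."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the set must include everything.
        return math.inf
    return sorted(cal_scores)[k - 1]


def prediction_set(candidate_scores, threshold):
    """Keep every candidate answer whose nonconformity score is at or below the threshold."""
    return [ans for ans, score in candidate_scores.items() if score <= threshold]


# Illustrative calibration scores (assumed: lower score = answer conforms better).
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.80]
q = conformal_threshold(cal_scores, alpha=0.2)

# Illustrative candidate answers for one test question.
answers = prediction_set({"Paris": 0.10, "Lyon": 0.58, "Nice": 0.90}, q)
```

Under the usual exchangeability assumption between calibration and test examples, a set built this way contains the correct answer with probability at least 1 - alpha. The abstract's point is that this single-output recipe does not directly extend to pipelines that may stop after any of several turns.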
