More Rounds, More Noise: Why Multi-Turn Review Fails to Improve Cross-Context Verification
Song Tae-Eun

TL;DR
Multi-turn cross-context review for LLM verification adds noise and false positives and lowers F1 relative to single-pass review; the paper attributes this to false positive pressure and conversation drift.
Contribution
This paper demonstrates that multi-turn review strategies do not improve, and can worsen, LLM verification performance compared to single-pass methods, highlighting the challenges of iterative review.
Findings
Single-pass CCR outperforms multi-turn variants in F1 score.
Multi-turn review increases false positives and reduces precision.
Re-review without prior context performs worst, indicating repetition harms accuracy.
Abstract
Cross-Context Review (CCR) improves LLM verification by separating production and review into independent sessions. A natural extension is multi-turn review: letting the reviewer ask follow-up questions, receive author responses, and review again. We call this Dynamic Cross-Context Review (D-CCR). In a controlled experiment with 30 artifacts and 150 injected errors, we tested four D-CCR variants against the single-pass CCR baseline. Single-pass CCR (F1 = 0.376) significantly outperformed all multi-turn variants, including D-CCR-2b with question-and-answer exchange (F1 = 0.303). Multi-turn review increased recall (+0.08) but generated 62% more false positives (8.5 vs. 5.2), collapsing precision from 0.30 to 0.20. Two mechanisms drive this degradation: (1) false positive pressure -- reviewers in later rounds fabricate findings when the artifact's real errors have…
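The trade-off described in the abstract (recall rises, but extra false positives pull precision and F1 down) follows from the standard definitions of these metrics. The sketch below illustrates that arithmetic only. The counts are hypothetical, chosen for illustration, and are not taken from the paper's data; the function name `precision_recall_f1` is likewise an assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F1 from raw counts.

    tp: real errors the reviewer flagged
    fp: flagged findings that are not real errors
    fn: real errors the reviewer missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one artifact with 5 injected errors.
# These numbers are illustrative assumptions, not the paper's results.
single_pass = precision_recall_f1(tp=2, fp=5, fn=3)
multi_turn = precision_recall_f1(tp=3, fp=12, fn=2)

for name, (p, r, f1) in [("single-pass", single_pass), ("multi-turn", multi_turn)]:
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

In this toy case the multi-turn reviewer finds one more real error, yet the added false positives lower precision enough that F1 still drops, which is the same direction of effect the abstract reports.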
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Deception detection and forensic psychology
