Quantifying Cross-Query Contradictions in Multi-Query LLM Reasoning

Rohit Kumar Salla; Ramya Manasa Amancherla; Manoj Saravanan

arXiv:2604.14525 · cs.AI · April 17, 2026

TL;DR

This paper introduces a benchmark and a solver-augmented method for measuring and improving logical consistency across multiple related queries posed to a large language model.

Contribution

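The paper contributes a benchmark of 390 multi-query reasoning instances with set-level metrics (Case Satisfiability Rate, Contradiction Density, Revision Cost) and a solver-augmented approach that extracts commitments, verifies global satisfiability, and performs counterexample-guided repair.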

Findings

1. The solver-augmented method substantially reduces cross-query contradictions, raising SetCons from 0.56 to 0.94.

2. Per-query accuracy is preserved while global coherence improves.

3. Results across four reasoning domains indicate that global coherence is critical for robust multi-query reasoning.

Abstract

Large language models frequently produce mutually inconsistent answers when reasoning over multiple related queries. We study case-file logical consistency: maintaining a globally satisfiable belief state across interdependent queries. We introduce a benchmark of 390 multi-query reasoning instances with entailment/contradiction/unknown labels and propose set-level metrics including Case Satisfiability Rate, Contradiction Density and Revision Cost. Our solver-augmented approach extracts commitments, verifies global satisfiability and performs counterexample-guided repair. Across four reasoning domains, our method substantially reduces cross-query contradictions (SetCons: 0.56 to 0.94) while preserving per-query accuracy, demonstrating that global coherence is critical for robust multi-query reasoning.
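
To make the set-level idea concrete, here is a minimal sketch of checking whether a case file's commitments are jointly satisfiable. Everything in it is an illustrative assumption rather than the paper's implementation: commitments are encoded as propositional clauses, a brute-force search stands in for a real solver, and the three metric functions are plausible readings of the metric names, since this page does not give the paper's formal definitions.

```python
from itertools import combinations, product

# Assumed encoding (hypothetical): a literal is (variable, polarity) and a
# commitment is a clause, i.e. a frozenset of literals joined by OR.

def satisfiable(commitments):
    """Brute-force check: does some truth assignment satisfy every clause?"""
    variables = sorted({var for clause in commitments for var, _ in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[v] == pol for v, pol in clause)
               for clause in commitments):
            return True
    return False

def case_satisfiability_rate(case_files):
    """Assumed reading: fraction of case files that are jointly satisfiable."""
    return sum(satisfiable(case) for case in case_files) / len(case_files)

def contradiction_density(commitments):
    """Assumed reading: fraction of commitment pairs that contradict each other."""
    pairs = list(combinations(commitments, 2))
    if not pairs:
        return 0.0
    return sum(not satisfiable(list(pair)) for pair in pairs) / len(pairs)

def revision_cost(commitments):
    """Assumed reading: fewest commitments to retract to restore satisfiability."""
    n = len(commitments)
    for k in range(n + 1):
        for dropped in combinations(range(n), k):
            kept = [c for i, c in enumerate(commitments) if i not in dropped]
            if satisfiable(kept):
                return k
    return n

# Three answers from one case file: "A", "A implies B", and "not B".
case = [
    frozenset({("A", True)}),
    frozenset({("A", False), ("B", True)}),
    frozenset({("B", False)}),
]
consistent_case = case[:2]

print(satisfiable(case))
print(contradiction_density(case))
print(revision_cost(case))
print(case_satisfiability_rate([case, consistent_case]))
```

In this example every pair of answers is compatible, yet the three together are not, so a purely pairwise check would miss the contradiction. That is the kind of failure a global satisfiability check is meant to catch; a practical system would presumably replace the exhaustive search with a SAT or SMT solver.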
