Co-Sight: Enhancing LLM-Based Agents via Conflict-Aware Meta-Verification and Trustworthy Reasoning with Structured Facts

Hongwei Zhang; Ji Lu; Shiqing Jiang; Chenxiang Zhu; Li Xie; Chen Zhong; Haoran Chen; Yurui Zhu; Yongsheng Du; Yanqin Gao; Lingjun Huang; Baoli Wang; Fang Tan; and Peng Zou

arXiv:2510.21557 · cs.AI · October 27, 2025

TL;DR

Co-Sight introduces a conflict-aware verification framework with structured facts to improve the reliability and transparency of long-horizon reasoning in LLM-based agents, achieving state-of-the-art accuracy and efficiency.

Contribution

The paper presents a conflict-aware meta-verification method, combined with reasoning over structured facts, to make LLM agent reasoning more trustworthy and efficient.

Findings

01 Achieves 84.4% accuracy on the GAIA dataset.

02 Attains 93.8% accuracy on Chinese-SimpleQA.

03 Demonstrates improved efficiency and reliability by focusing verification on points of disagreement among expert agents rather than on full reasoning chains.

Abstract

Long-horizon reasoning in LLM-based agents often fails not from generative weakness but from insufficient verification of intermediate reasoning. Co-Sight addresses this challenge by turning reasoning into a falsifiable and auditable process through two complementary mechanisms: Conflict-Aware Meta-Verification (CAMV) and Trustworthy Reasoning with Structured Facts (TRSF). CAMV reformulates verification as conflict identification and targeted falsification, allocating computation only to disagreement hotspots among expert agents rather than to full reasoning chains. This bounds verification cost to the number of inconsistencies and improves efficiency and reliability. TRSF continuously organizes, validates, and synchronizes evidence across agents through a structured facts module. By maintaining verified, traceable, and auditable knowledge, it ensures that all reasoning is grounded in…
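The conflict-focused idea behind CAMV can be illustrated with a minimal Python sketch. This is not the authors' implementation: the representation of agent outputs as dictionaries of claims, the function names `find_conflicts` and `resolve`, and the `verify` callback are all hypothetical, chosen only to show how verification effort can be spent on disagreements rather than on every step.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: each agent reports claim_id -> asserted value.
Claims = Dict[str, str]


def find_conflicts(agent_claims: List[Claims]) -> Dict[str, List[str]]:
    """Return the claim ids on which agents assert more than one distinct value."""
    values_by_claim: Dict[str, List[str]] = {}
    for claims in agent_claims:
        for claim_id, value in claims.items():
            values_by_claim.setdefault(claim_id, []).append(value)
    return {
        claim_id: values
        for claim_id, values in values_by_claim.items()
        if len(set(values)) > 1
    }


def resolve(
    agent_claims: List[Claims],
    verify: Callable[[str, str], bool],
) -> Tuple[Claims, int]:
    """Accept agreed claims as-is and call `verify` only on conflicting ones.

    Returns the merged claims and the number of verifier calls made.
    `verify` is an assumed external check (a tool call, a lookup, and so on).
    """
    conflicts = find_conflicts(agent_claims)
    merged: Claims = {}

    # Claims with no disagreement are taken without further checking.
    for claims in agent_claims:
        for claim_id, value in claims.items():
            if claim_id not in conflicts:
                merged[claim_id] = value

    # Targeted falsification: test candidates only at the disagreement points.
    calls = 0
    for claim_id, candidates in conflicts.items():
        for candidate in dict.fromkeys(candidates):  # distinct, order kept
            calls += 1
            if verify(claim_id, candidate):
                merged[claim_id] = candidate
                break
    return merged, calls


if __name__ == "__main__":
    agents = [
        {"capital": "Paris", "year": "1889"},
        {"capital": "Paris", "year": "1887"},
        {"capital": "Paris", "year": "1889"},
    ]
    # Toy verifier standing in for a real evidence check.
    merged, calls = resolve(agents, lambda cid, v: (cid, v) == ("year", "1889"))
    print(merged, calls)
```

In this sketch the number of verifier calls is bounded by the number of distinct candidate values at conflicting claims, which mirrors the abstract's point that verification cost scales with inconsistencies rather than with the length of the reasoning chain. Taking agreed claims without any check is a simplification of the sketch, not a claim about the paper's method.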
