Evidence-Bound Autonomous Research (EviBound): A Governance Framework for Eliminating False Claims

Ruiying Chen

arXiv:2511.05524 · cs.AI · November 11, 2025 · 2 cites

Open Access

TL;DR

EviBound is a governance framework that uses dual evidence-based gates to eliminate false claims in autonomous research, ensuring verified results with minimal overhead.

Contribution

The paper introduces a dual governance gate system that requires machine-checkable evidence both before and after execution, preventing autonomous research agents from reporting false claims.

Findings

01. EviBound achieves 0% hallucination in benchmark tasks.

02. A verification-only configuration reduces hallucination to 25%.

03. The baseline prompt-level approach yields 100% hallucination.

Abstract

LLM-based autonomous research agents report false claims: tasks marked "complete" despite missing artifacts, contradictory metrics, or failed executions. EviBound is an evidence-bound execution framework that eliminates false claims through dual governance gates requiring machine-checkable evidence. Two complementary gates enforce evidence requirements. The pre-execution Approval Gate validates acceptance criteria schemas before code runs, catching structural violations proactively. The post-execution Verification Gate validates artifacts via MLflow API queries (with recursive path checking) and optionally validates metrics when specified by acceptance criteria. Claims propagate only when backed by a queryable run ID, required artifacts, and FINISHED status. Bounded, confidence-gated retries (typically 1-2 attempts) recover from transient failures without unbounded loops. The…
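The two gates described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the `AcceptanceCriteria` record, its field names, and the functions `approval_gate`, `verification_gate`, and `run_with_retries` are all hypothetical. The `client` argument is assumed to expose `get_run(run_id)` and `list_artifacts(run_id, path)` in the shape that MLflow's `MlflowClient` provides. The retry loop is only bounded; the confidence gating mentioned in the abstract is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class AcceptanceCriteria:
    """Hypothetical acceptance-criteria record; field names are assumptions."""
    required_artifacts: list
    metric_thresholds: dict = field(default_factory=dict)  # name -> minimum value


def approval_gate(criteria):
    """Pre-execution check: reject structurally invalid criteria before code runs."""
    errors = []
    if not criteria.required_artifacts:
        errors.append("at least one required artifact must be listed")
    for name in criteria.required_artifacts:
        if not isinstance(name, str) or not name.strip():
            errors.append(f"invalid artifact name: {name!r}")
    for metric, minimum in criteria.metric_thresholds.items():
        if isinstance(minimum, bool) or not isinstance(minimum, (int, float)):
            errors.append(f"threshold for {metric!r} must be numeric")
    return errors


def list_artifact_paths(client, run_id, path=None):
    """Recursively collect artifact file paths under a run."""
    paths = []
    for entry in client.list_artifacts(run_id, path):
        if entry.is_dir:
            paths.extend(list_artifact_paths(client, run_id, entry.path))
        else:
            paths.append(entry.path)
    return paths


def verification_gate(client, run_id, criteria):
    """Post-execution check: a claim passes only with a queryable run,
    FINISHED status, all required artifacts, and any specified metrics."""
    try:
        run = client.get_run(run_id)
    except Exception:
        return False, ["run ID is not queryable"]
    reasons = []
    if run.info.status != "FINISHED":
        reasons.append(f"run status is {run.info.status}, not FINISHED")
    found = list_artifact_paths(client, run_id)
    for name in criteria.required_artifacts:
        # Match on the exact path or on a file nested in subdirectories.
        if not any(p == name or p.endswith("/" + name) for p in found):
            reasons.append(f"missing artifact: {name}")
    for metric, minimum in criteria.metric_thresholds.items():
        value = run.data.metrics.get(metric)
        if value is None or value < minimum:
            reasons.append(f"metric {metric} = {value} fails minimum {minimum}")
    return not reasons, reasons


def run_with_retries(execute, client, criteria, max_attempts=2):
    """Bounded retry loop: stop at the first verified run or after max_attempts."""
    reasons = approval_gate(criteria)
    if reasons:
        return None, reasons
    for _ in range(max_attempts):
        run_id = execute()
        ok, reasons = verification_gate(client, run_id, criteria)
        if ok:
            return run_id, []
    return None, reasons
```

In this sketch a claim is reported only when `run_with_retries` returns a run ID; otherwise the caller receives the list of reasons the evidence check failed, so a task is never marked complete on the agent's word alone.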

Taxonomy

Topics: Scientific Computing and Data Management · Research Data Management Practices · Biomedical Text Mining and Ontologies