Improved Evidence Extraction and Metrics for Document Inconsistency Detection with LLMs

Nelvin Tan; Yaowen Zhang; James Asikin Cheung; Fusheng Liu; Yu-Ching Shih; Dong Yang

arXiv:2601.02627·cs.CL·April 9, 2026

TL;DR

This paper studies how well LLMs extract evidence for document inconsistency detection. It introduces new evidence-extraction metrics and a redact-and-retry framework with constrained filtering, reports supporting experimental results, and releases a new semi-synthetic dataset.

Contribution

It presents novel evidence-extraction metrics and a redact-and-retry framework that significantly improve LLM-based inconsistency detection performance.

Findings

01

Evidence extraction performance is substantially improved with the proposed framework.

02

A new semi-synthetic dataset is released for evaluating evidence extraction methods.

03

The approach outperforms existing prompting techniques in accuracy.

Abstract

Large language models (LLMs) are becoming useful in many domains due to their impressive abilities that arise from large training datasets and large model sizes. However, research on LLM-based approaches to document inconsistency detection is relatively limited. We address this gap by investigating evidence extraction capabilities of LLMs for document inconsistency detection. To this end, we introduce new comprehensive evidence-extraction metrics and a redact-and-retry framework with constrained filtering that substantially improves evidence extraction performance over other prompting methods. We support our approach with strong experimental results and release a new semi-synthetic dataset for evaluating evidence extraction.
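The abstract names a redact-and-retry framework with constrained filtering but does not spell out the procedure. The sketch below is one plausible reading of that name, not the authors' implementation: an extractor is queried repeatedly, candidate spans that do not occur verbatim in the original document are discarded (the assumed "constrained filtering"), and accepted spans are masked before the next attempt (the assumed "redact-and-retry"). The names `redact_and_retry`, `toy_extract`, `REDACTION`, and `max_rounds` are hypothetical, and `toy_extract` is a regex stand-in for an LLM call.

```python
import re
from typing import Callable, List

REDACTION = "[REDACTED]"  # hypothetical mask token, not taken from the paper


def redact_and_retry(
    document: str,
    extract: Callable[[str], List[str]],
    max_rounds: int = 5,
) -> List[str]:
    """Collect evidence spans over several rounds (assumed procedure)."""
    evidence: List[str] = []
    working = document
    for _ in range(max_rounds):
        candidates = extract(working)
        # Assumed constrained filtering: keep only spans that occur
        # verbatim in the original document and were not already accepted.
        kept = [s for s in candidates if s and s in document and s not in evidence]
        if not kept:
            break  # nothing new was found, so stop retrying
        for span in kept:
            if span not in evidence:
                evidence.append(span)
                # Redact the accepted span so the next round must look elsewhere.
                working = working.replace(span, REDACTION)
    return evidence


def toy_extract(text: str) -> List[str]:
    """Stand-in for an LLM: returns at most one real span plus one invented span."""
    matches = re.findall(r"The total is \d+ USD\.", text)
    return matches[:1] + ["Invented sentence."]


doc = "The total is 100 USD. Shipping is free. The total is 120 USD."
found = redact_and_retry(doc, toy_extract)
print(found)
```

In this toy run the extractor surfaces only one conflicting statement per call, so the second statement is recovered only after the first has been redacted, while the invented span is rejected in every round because it never appears in the document.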
