Reducing Detail Hallucinations in Long-Context Regulatory Understanding via Targeted Preference Optimization
Yang Liu, Bin Chong, Yuhan Lin, Chongyang Zhang, Hao Zheng, Ziyi Zhang, Jiayu Liang, Ran Ran, Qian Li, Kefu Xu

TL;DR
This paper addresses detail hallucinations in large language models when processing long regulatory texts by formalizing error types, creating a benchmark, and proposing a targeted preference optimization method that significantly reduces errors.
Contribution
It introduces a detailed error taxonomy, a comprehensive benchmark, and a novel preference optimization framework called DetailDPO for improving factual accuracy in LLMs.
Findings
DetailDPO reduces detail error rate by 42-61% across models and contexts.
The benchmark includes 172 real and 150 synthetic documents with 13,000 annotated preference pairs.
The method improves accuracy across all five error types and transfers well to financial and medical domains.
Abstract
Large language models (LLMs) frequently produce detail hallucinations when processing long regulatory documents, including subtle errors in threshold values, units, scopes, obligation levels, and conditions that preserve surface plausibility while corrupting safety-critical parameters. We formalize this phenomenon through a fine-grained Detail Error Taxonomy of five error types and introduce DetailBench, a benchmark built from 172 real regulatory documents and 150 synthetic documents spanning three jurisdictions, with human-annotated detail-level ground truth comprising 13,000 preference pairs. We propose DetailDPO, a targeted preference optimization framework that constructs contrastive pairs differing in exactly one detail dimension, concentrating DPO gradient signal on detail-bearing tokens. We provide theoretical analysis showing why minimal…
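To make the idea of a contrastive pair "differing in exactly one detail dimension" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical template with a single `{detail}` slot, hypothetical dimension labels derived from the five error types listed in the abstract, and the standard per-pair DPO loss; the names `PreferencePair`, `make_minimal_pair`, and `dpo_loss` are illustrative only.

```python
import math
from dataclasses import dataclass

# Hypothetical labels for the five detail dimensions named in the abstract.
DETAIL_DIMENSIONS = ("threshold", "unit", "scope", "obligation", "condition")


@dataclass(frozen=True)
class PreferencePair:
    chosen: str     # response carrying the correct detail
    rejected: str   # same response with exactly one detail corrupted
    dimension: str  # which detail dimension was perturbed


def make_minimal_pair(template: str, dimension: str,
                      correct: str, corrupted: str) -> PreferencePair:
    """Fill the single {detail} slot with the correct and the corrupted value."""
    if dimension not in DETAIL_DIMENSIONS:
        raise ValueError(f"unknown detail dimension: {dimension}")
    if correct == corrupted:
        raise ValueError("corrupted value must differ from the correct one")
    return PreferencePair(
        chosen=template.format(detail=correct),
        rejected=template.format(detail=corrupted),
        dimension=dimension,
    )


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # Numerically stable softplus(-x), which equals -log(sigmoid(x)).
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Usage: a pair that differs only in the threshold value.
pair = make_minimal_pair(
    "Discharge must not exceed {detail} mg/L.", "threshold", "50", "500"
)
```

Because the two responses share every token up to the perturbed detail, the log-probabilities of that shared prefix are identical on both sides and cancel in the margin, which is one plausible reading of how such pairs focus the training signal on the detail itself. The sequence log-probabilities passed to `dpo_loss` are assumed to come from a policy model and a frozen reference model, neither of which is shown here.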
