Reducing Hallucinations in LLM-based Scientific Literature Analysis Using Peer Context Outlier Detection
Daniel Xie, Maxwell J. Jacobson, Adil Wazeer, Haiyan Wang, Xinghang Zhang, Yexiang Xue

TL;DR
This paper presents P-COD, a novel peer context outlier detection method that leverages relationships between scientific papers to reduce hallucinations in LLM-based literature analysis, achieving up to 98% precision.
Contribution
Introduction of P-COD, a new approach that uses peer relationships to improve data extraction accuracy and reduce hallucinations in scientific literature summarization.
Findings
Achieved up to 98% precision in outlier detection across 6 scientific domains.
Reduced hallucinations and improved trust in automated literature analysis.
Enabled focus on ambiguous cases, streamlining data workflows.
Abstract
Reducing hallucinations in Large Language Models (LLMs) is essential for improving the accuracy of data extraction from large text corpora. Current methods, such as prompt engineering and chain-of-thought prompting, focus on individual documents but fail to consider relationships across a corpus. This paper introduces Peer Context Outlier Detection (P-COD), a novel approach that uses the relationships between documents to improve extraction accuracy. Our application domain is scientific literature summarization, where papers with similar experimental settings should draw similar conclusions. By comparing extracted data to validated peer information within the corpus, we adjust confidence scores and flag low-confidence results for expert review. High-confidence results, supported by peer validation, are considered reliable. Our experiments demonstrate up to 98% precision in outlier…
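The peer-comparison idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Extraction` record, the numeric encoding of experiment settings, the nearest-neighbor peer selection, the median/MAD deviation measure, the 50/50 confidence blend, and the 0.5 review threshold are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from statistics import median
from typing import List, Tuple


@dataclass
class Extraction:
    """Hypothetical record for one value extracted from one paper."""
    paper_id: str
    settings: Tuple[float, ...]  # assumed numeric encoding of experiment settings
    value: float                 # the extracted quantity
    confidence: float            # initial confidence in [0, 1]


def nearest_peers(target: Extraction, validated: List[Extraction], k: int = 3) -> List[Extraction]:
    """Pick the k validated extractions whose settings are closest to the target's."""
    return sorted(validated, key=lambda p: math.dist(p.settings, target.settings))[:k]


def peer_adjusted_confidence(target: Extraction, validated: List[Extraction],
                             k: int = 3, tolerance: float = 3.0) -> float:
    """Blend the initial confidence with how well the value agrees with its peers."""
    values = [p.value for p in nearest_peers(target, validated, k)]
    center = median(values)
    mad = median([abs(v - center) for v in values])
    # Guard against a zero spread when peers agree exactly (assumed 5% floor).
    scale = max(mad, 0.05 * abs(center), 1e-9)
    deviation = abs(target.value - center) / scale
    # Agreement falls linearly from 1 to 0 as the deviation approaches `tolerance`.
    agreement = max(0.0, 1.0 - deviation / tolerance)
    return 0.5 * target.confidence + 0.5 * agreement


def review(target: Extraction, validated: List[Extraction],
           threshold: float = 0.5) -> Tuple[float, bool]:
    """Return (adjusted confidence, needs_expert_review)."""
    score = peer_adjusted_confidence(target, validated)
    return score, score < threshold
```

In this sketch, an extraction that agrees with peers reporting similar settings keeps or gains confidence, while one that deviates strongly drops below the threshold and is routed to an expert. A real system would need a domain-appropriate notion of "similar experiment settings" and of agreement between conclusions, which the abstract does not specify.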
