Detecting AI Hallucinations in Finance: An Information-Theoretic Method Cuts Hallucination Rate by 92%
Mainak Singha

TL;DR
This paper introduces ECLIPSE, an information-theoretic framework that significantly reduces hallucinations in large language models for finance by detecting unsupported answers through entropy and evidence utilization analysis.
Contribution
ECLIPSE is an entropy-capacity method that detects hallucinations in LLMs by combining semantic entropy estimation with evidence utilization metrics. Its objective is proven strictly convex, with a unique stable optimum, under mild conditions.
Findings
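ECLIPSE achieves a ROC AUC of 0.89 and an average precision of 0.90 on a controlled financial QA dataset (GPT-3.5-turbo, n=200 balanced samples with synthetic hallucinations).
It substantially outperforms a semantic entropy-only baseline, which reaches an AUC of only 0.50.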
Effectiveness depends on calibrated token-level uncertainties, confirmed by ablation studies.
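For readers unfamiliar with the metric, the sketch below shows one standard way to compute ROC AUC: the fraction of (hallucinated, supported) pairs in which the hallucinated answer receives the higher detector score, with ties counted as half. It illustrates why an AUC of 0.50 corresponds to chance-level ranking. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not taken from the paper.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of ROC AUC.

    labels: 1 for a hallucinated answer, 0 for a supported one (assumed encoding).
    scores: detector scores, where higher means "more likely hallucinated".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count a win when the positive outranks the negative, half a win on ties.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (made up): a detector that assigns every answer the same score
# cannot rank hallucinations above supported answers, so its AUC is 0.5.
constant_detector = roc_auc([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
perfect_detector = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
```

Under this reading, the entropy-only baseline's 0.50 means its scores carry no ranking information on this dataset, while 0.89 means a hallucinated answer outranks a supported one in roughly 89% of pairs.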
Abstract
Large language models (LLMs) produce fluent but unsupported answers (hallucinations), limiting safe deployment in high-stakes domains. We propose ECLIPSE, a framework that treats hallucination as a mismatch between a model's semantic entropy and the capacity of available evidence. We combine entropy estimation via multi-sample clustering with a novel perplexity decomposition that measures how models use retrieved evidence. We prove that under mild conditions, the resulting entropy-capacity objective is strictly convex with a unique stable optimum. We evaluate on a controlled financial question answering dataset with GPT-3.5-turbo (n=200 balanced samples with synthetic hallucinations), where ECLIPSE achieves ROC AUC of 0.89 and average precision of 0.90, substantially outperforming a semantic entropy-only baseline (AUC 0.50). A controlled ablation with Claude-3-Haiku, which lacks…
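The "entropy estimation via multi-sample clustering" step can be illustrated with a minimal sketch: sample several answers to the same question, group them into meaning clusters, and take the Shannon entropy of the cluster frequencies. This is a generic semantic-entropy estimate, not necessarily the paper's exact estimator, which this summary does not show. The helper `cluster_by_normalized_text` is a hypothetical stand-in; practical systems usually cluster by semantic equivalence rather than by string matching.

```python
import math
from collections import Counter


def semantic_entropy(cluster_ids):
    """Shannon entropy (in nats) of the empirical distribution over clusters."""
    n = len(cluster_ids)
    if n == 0:
        raise ValueError("need at least one sampled answer")
    counts = Counter(cluster_ids)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def cluster_by_normalized_text(answers):
    """Toy clustering (assumption): answers that are identical after
    lowercasing and collapsing whitespace share a cluster id."""
    ids = {}
    return [ids.setdefault(" ".join(a.lower().split()), len(ids)) for a in answers]


# Made-up sampled answers to one financial question.
consistent = ["Revenue was $4.2B", "revenue was  $4.2b", "Revenue was $4.2B"]
scattered = ["Revenue was $4.2B", "Revenue was $3.9B", "Revenue was $5.1B"]

low = semantic_entropy(cluster_by_normalized_text(consistent))   # one cluster
high = semantic_entropy(cluster_by_normalized_text(scattered))   # three clusters
```

When all samples fall into one cluster the entropy is zero, and it grows as the samples disagree. Per the abstract, ECLIPSE does not use this quantity alone but compares it against the capacity of the available evidence.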
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks
