FAITH: A Framework for Assessing Intrinsic Tabular Hallucinations in Finance
Mengao Zhang, Jiayu Fu, Tanya Warrier, Yuwen Wang, Tianhui Tan, Ke-wei Huang

TL;DR
This paper introduces FAITH, a framework for evaluating intrinsic hallucinations in financial LLMs. It targets real-world financial tabular data, where accurate extraction and calculation are essential for reliable financial decision-making.
Contribution
The paper presents a novel, scalable evaluation framework for hallucinations in financial language models, including an automated masking-based dataset creation method and an evaluation dataset derived from S&P 500 annual reports.
Findings
State-of-the-art LLMs exhibit significant hallucination patterns in financial tabular data.
The masking-based dataset creation method effectively captures intrinsic hallucinations.
Evaluation results highlight areas for improving model reliability in finance.
Abstract
Hallucination remains a critical challenge for deploying Large Language Models (LLMs) in finance. Accurate extraction and precise calculation from tabular data are essential for reliable financial analysis, since even minor numerical errors can undermine decision-making and regulatory compliance. Financial applications have unique requirements, often relying on context-dependent, numerical, and proprietary tabular data that existing hallucination benchmarks rarely capture. In this study, we develop a rigorous and scalable framework for evaluating intrinsic hallucinations in financial LLMs, conceptualized as a context-aware masked span prediction task over real-world financial documents. Our main contributions are: (1) a novel, automated dataset creation paradigm using a masking strategy; (2) a new hallucination evaluation dataset derived from S&P 500 annual reports; and (3) a…
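The masked span prediction idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names MASK, make_examples, and is_hallucination, the number regex, and the 1% relative tolerance are all assumptions made for illustration. The sketch hides each numeric span in a sentence, pairs it with the table context, and flags a model answer as a hallucination when it deviates from the hidden value.

```python
import re
from dataclasses import dataclass
from typing import List

MASK = "[MASK]"  # hypothetical mask token, chosen for illustration
# Matches simple numeric spans such as "$1,250.5", "25%", or "2022".
NUMBER = re.compile(r"\$?\d[\d,]*(?:\.\d+)?%?")


@dataclass
class MaskedExample:
    context: str   # table text followed by the sentence with one span masked
    answer: float  # numeric value hidden behind the mask


def parse_number(span: str) -> float:
    """Convert a matched span to a float, dropping '$', ',' and '%'."""
    return float(span.replace("$", "").replace(",", "").replace("%", ""))


def make_examples(table_text: str, sentence: str) -> List[MaskedExample]:
    """Create one masked example per numeric span found in the sentence."""
    examples = []
    for m in NUMBER.finditer(sentence):
        masked = sentence[:m.start()] + MASK + sentence[m.end():]
        examples.append(
            MaskedExample(table_text + "\n" + masked, parse_number(m.group()))
        )
    return examples


def is_hallucination(prediction: str, answer: float, rel_tol: float = 0.01) -> bool:
    """Flag a prediction with no number, or one outside the assumed tolerance."""
    m = NUMBER.search(prediction)
    if m is None:
        return True
    value = parse_number(m.group())
    return abs(value - answer) > rel_tol * max(abs(answer), 1e-9)


# Toy data invented for this sketch; not taken from the paper's dataset.
table = "Revenue (USD millions) | 2022: 1,000.4 | 2023: 1,250.5"
sentence = "Revenue rose to $1,250.5 million, up 25% from 2022."
examples = make_examples(table, sentence)
for ex in examples:
    print(ex.context.splitlines()[-1], "->", ex.answer)
```

A real pipeline would need a more careful span selector (units, scale words such as "million", negative values in parentheses) and a scoring rule suited to the task; the regex and tolerance here are deliberately simple.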
