FAITH: A Framework for Assessing Intrinsic Tabular Hallucinations in Finance

Mengao Zhang; Jiayu Fu; Tanya Warrier; Yuwen Wang; Tianhui Tan; Ke-wei Huang

arXiv:2508.05201 · cs.LG · October 27, 2025


TL;DR

This paper introduces FAITH, a comprehensive framework for evaluating intrinsic hallucinations in financial LLMs, focusing on real-world tabular data to improve reliability in financial decision-making.

Contribution

The paper presents a scalable, automated framework for evaluating hallucinations in financial language models, built on a masking-based dataset creation method and a new evaluation dataset derived from S&P 500 annual reports.

Findings

01 State-of-the-art LLMs exhibit significant hallucination patterns in financial tabular data.

02 The masking-based dataset creation method effectively captures intrinsic hallucinations.

03 Evaluation results highlight areas for improving model reliability in finance.

Abstract

Hallucination remains a critical challenge for deploying Large Language Models (LLMs) in finance. Accurate extraction and precise calculation from tabular data are essential for reliable financial analysis, since even minor numerical errors can undermine decision-making and regulatory compliance. Financial applications have unique requirements, often relying on context-dependent, numerical, and proprietary tabular data that existing hallucination benchmarks rarely capture. In this study, we develop a rigorous and scalable framework for evaluating intrinsic hallucinations in financial LLMs, conceptualized as a context-aware masked span prediction task over real-world financial documents. Our main contributions are: (1) a novel, automated dataset creation paradigm using a masking strategy; (2) a new hallucination evaluation dataset derived from S&P 500 annual reports; and (3) a…
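To make the masked span prediction idea concrete, the following is a minimal sketch, not the authors' pipeline. It assumes that each numeric value in a sentence is masked in turn, that the model sees the table plus the masked sentence, and that a prediction counts as faithful when it matches the hidden value within a small relative tolerance. The names `make_examples`, `is_faithful`, the `[MASK]` token, and the tolerance are all hypothetical choices made for illustration.

```python
import math
import re
from dataclasses import dataclass

# Assumption: spans are plain non-negative numbers such as "1,200" or "20.0".
NUMBER = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")
MASK = "[MASK]"  # hypothetical mask token


@dataclass
class MaskedExample:
    context: str   # table text followed by the sentence with one span masked
    answer: float  # ground-truth value hidden behind the mask


def parse_number(text: str) -> float:
    """Convert a matched span such as '1,200' into a float."""
    return float(text.replace(",", ""))


def make_examples(table_text: str, sentence: str) -> list[MaskedExample]:
    """Create one example per numeric span by masking that span only."""
    examples = []
    for match in NUMBER.finditer(sentence):
        masked = sentence[:match.start()] + MASK + sentence[match.end():]
        examples.append(
            MaskedExample(
                context=f"{table_text}\n\n{masked}",
                answer=parse_number(match.group()),
            )
        )
    return examples


def is_faithful(prediction: str, answer: float, rel_tol: float = 1e-3) -> bool:
    """Compare the first number in the model output against the hidden value."""
    match = NUMBER.search(prediction)
    if match is None:
        return False  # no numeric answer is treated as a miss
    return math.isclose(parse_number(match.group()), answer, rel_tol=rel_tol)
```

For a sentence such as "Revenue grew 20.0% to 1,200 million." this produces two examples, one hiding `20.0` and one hiding `1,200`. The first requires a calculation from the table and the second a direct lookup, which mirrors the extraction and calculation distinction drawn in the abstract. Negative values, units, and scale words such as "million" are deliberately left out of this sketch.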
