FinCriticalED: A Visual Benchmark for Financial Fact-Level OCR

Yueru He; Xueqing Peng; Yupeng Cao; Yan Wang; Lingfei Qian; Haohang Li; Yi Han; Shuyao Wang; Ruoyu Xiang; Fan Zhang; Zhuohan Xie; Mingquan Lin; Prayag Tiwari; Jimin Huang; Guojun Xiong; Sophia Ananiadou

arXiv:2511.14998 · cs.CV · April 8, 2026


TL;DR

FinCriticalED is a new visual benchmark designed to evaluate whether OCR and vision-language models accurately preserve critical financial evidence beyond lexical similarity, highlighting gaps in current systems' reliability.

Contribution

The paper introduces FinCriticalED, a fact-centric benchmark with expert-annotated financial facts, and develops a protocol for assessing model fidelity in financial OCR tasks.

Findings

1. Numerical values and monetary units are most vulnerable to errors.

2. Critical errors are concentrated in visually complex, mixed-layout documents.

3. There is a significant gap between lexical accuracy and factual reliability in current models.
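To make the gap between lexical accuracy and factual reliability concrete, the following is a minimal sketch, not the benchmark's evaluation protocol. It assumes a hypothetical reference sentence, a hypothetical OCR output with one misread digit, and a simplified fact check based on exact substring matching; the function names `lexical_similarity` and `fact_preserved` are illustrative only.

```python
from difflib import SequenceMatcher


def lexical_similarity(reference: str, hypothesis: str) -> float:
    """Character-level similarity in [0, 1] (a stand-in for surface OCR metrics)."""
    return SequenceMatcher(None, reference, hypothesis).ratio()


def fact_preserved(fact: str, hypothesis: str) -> bool:
    """Simplified assumption: a fact counts as preserved if it appears verbatim."""
    return fact in hypothesis


# Hypothetical example: the OCR output misreads a single digit (1 -> 7).
reference = "Net revenue was $4,512 million for fiscal 2023."
hypothesis = "Net revenue was $4,572 million for fiscal 2023."

# Hypothetical annotated facts: a numeric value, a monetary unit, a temporal field.
facts = ["4,512", "million", "2023"]

similarity = lexical_similarity(reference, hypothesis)
fact_accuracy = sum(fact_preserved(f, hypothesis) for f in facts) / len(facts)

print(f"lexical similarity: {similarity:.3f}")
print(f"fact-level accuracy: {fact_accuracy:.3f}")
```

In this toy case the surface similarity stays close to 1 while one of the three annotated facts is lost, which is the kind of discrepancy the findings describe. Real fact matching would likely need normalization (number formats, units, date styles) that this sketch omits.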

Abstract

Recent progress in multimodal large language models (MLLMs) has substantially improved document understanding, yet strong optical character recognition (OCR) performance on surface metrics does not guarantee faithful preservation of decision-critical evidence. This limitation is especially consequential in financial documents, where small visual errors can induce discrete shifts in meaning. To study this gap, we introduce FinCriticalED (Financial Critical Error Detection), a fact-centric visual benchmark for evaluating whether OCR and vision-language systems preserve financially critical evidence beyond lexical similarity. FinCriticalED contains 859 real-world financial document pages with 9,481 expert-annotated facts spanning five critical field types: numeric, temporal, monetary unit, reporting entity, and financial concept. We formulate the task as structured OCR with fact-level…


Code & Models

Repositories

https://the-finai.github.io/FinCriticalED

Datasets

TheFinAI/FinCriticalED (dataset, 98 dl)
