Document-Level Numerical Reasoning across Single and Multiple Tables in Financial Reports

Yi-Cheng Wang; Wei-An Wang; Chu-Song Chen

arXiv:2604.03664 · cs.CL · April 7, 2026

TL;DR

This paper introduces FinLongDocQA, a dataset for numerical reasoning in long financial reports, and proposes FinLongDocAgent, a multi-agent retrieval-augmented method to improve question answering accuracy.

Contribution

The paper presents a new dataset for cross-table numerical reasoning in financial reports and a novel multi-agent RAG approach to enhance reasoning accuracy over long documents.
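To make "cross-table" concrete: an indicator such as return on assets (net income divided by total assets) needs one value from the income statement and another from the balance sheet. The sketch below is illustrative only. The figures are invented, and the dictionary layout and the function name `return_on_assets` are assumptions, not drawn from FinLongDocQA.

```python
# Hypothetical, made-up figures (in millions); not taken from FinLongDocQA.
income_statement = {"Revenue": 1200.0, "Net income": 150.0}
balance_sheet = {"Total assets": 2000.0, "Total liabilities": 1250.0}


def return_on_assets(income, balance):
    """ROA = net income / total assets; the two operands live in different tables."""
    return income["Net income"] / balance["Total assets"]


roa = return_on_assets(income_statement, balance_sheet)
print(f"ROA: {roa:.2%}")  # ROA: 7.50%
```

A single-table question, by contrast, would need only one of the two dictionaries, for example net margin from `income_statement` alone.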

Findings

01. LLMs struggle with long financial reports, which often exceed 129k tokens.

02. Iterative retrieval and verification significantly improve numerical QA accuracy.

03. The proposed FinLongDocAgent outperforms baseline methods in experiments.
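The general idea of iterating between retrieval and verification can be sketched with a toy example. This is not the FinLongDocAgent implementation: the keyword-overlap retriever, the regex extractor, the chunk texts, and every function name here are assumptions made for illustration. The loop widens retrieval until a simple check confirms that both operands were actually found.

```python
# Toy sketch of a retrieve-then-verify loop. All names and data are
# hypothetical; this is not the FinLongDocAgent implementation.
import re

CHUNKS = [
    "Income statement: Net income 150; Revenue 1200",
    "Balance sheet: Total assets 2000; Total liabilities 1250",
    "Notes: Headcount 5400",
]


def retrieve(query_terms, chunks, k):
    """Rank chunks by how many query terms they contain; return up to k matches."""
    scored = [(sum(t.lower() in c.lower() for t in query_terms), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def extract(field, evidence):
    """Return the number that follows a field name in the evidence, or None."""
    for chunk in evidence:
        match = re.search(re.escape(field) + r"\s+([\d.]+)", chunk)
        if match:
            return float(match.group(1))
    return None


def answer_ratio(numerator, denominator, chunks, max_rounds=3):
    """Retrieve more chunks each round until both operands are verified."""
    for k in range(1, max_rounds + 1):
        evidence = retrieve([numerator, denominator], chunks, k)
        num = extract(numerator, evidence)
        den = extract(denominator, evidence)
        # Verification step: both values present and the denominator non-zero.
        if num is not None and den:
            return num / den, k
    return None, max_rounds


result, rounds = answer_ratio("Net income", "Total assets", CHUNKS)
print(result, rounds)
```

With these toy chunks the first round retrieves only the income statement, so the check fails on the missing denominator, and the second round adds the balance sheet. A real system would replace the keyword scorer and regex with stronger retrieval and verification components.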

Abstract

Despite the strong language understanding abilities of large language models (LLMs), they still struggle with reliable question answering (QA) over long, structured documents, particularly for numerical reasoning. Financial annual reports exemplify this difficulty: financial statement analysis often hinges on accurate arithmetic, and analysts derive key indicators by integrating evidence scattered across multiple tables and narrative text. However, existing benchmarks focus largely on single-table settings, leaving cross-table document-level numerical reasoning underexplored. To address this gap, we introduce FinLongDocQA, a dataset for both single-table and cross-table financial numerical reasoning in long-context reports. Evaluating both closed-source and open-source LLMs on FinLongDocQA reveals two bottlenecks: (1) annual reports often exceed 129k tokens, exacerbating the context rot…
