FinChart-Bench: Benchmarking Financial Chart Comprehension in Vision-Language Models

Dong Shu; Haoyang Yuan; Yuchen Wang; Yanguang Liu; Huopu Zhang; Haiyan Zhao; Mengnan Du

arXiv:2507.14823 · cs.CV · July 22, 2025

TL;DR

FinChart-Bench is a benchmark of 1,200 real-world financial chart images and 7,016 questions for evaluating how well vision-language models understand financial charts. The evaluation exposes weaknesses in spatial reasoning and instruction following.

Contribution

This paper introduces FinChart-Bench, the first benchmark focused specifically on real-world financial charts, with True/False, Multiple Choice, and Question Answering annotations for every chart and an evaluation of 25 vision-language models.

Findings

01

The performance gap between open-source and closed-source models is narrowing.

02

Within model families, upgraded versions can perform worse than their predecessors.

03

Many models struggle with spatial reasoning and instruction following.

Abstract

Large vision-language models (LVLMs) have made significant progress in chart understanding. However, financial charts, characterized by complex temporal structures and domain-specific terminology, remain notably underexplored. We introduce FinChart-Bench, the first benchmark specifically focused on real-world financial charts. FinChart-Bench comprises 1,200 financial chart images collected from 2015 to 2024, each annotated with True/False (TF), Multiple Choice (MC), and Question Answering (QA) questions, totaling 7,016 questions. We conduct a comprehensive evaluation of 25 state-of-the-art LVLMs on FinChart-Bench. Our evaluation reveals critical insights: (1) the performance gap between open-source and closed-source models is narrowing, (2) performance degradation occurs in upgraded models within families, (3) many models struggle with instruction following, (4) both advanced models…
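Because each chart carries three question types (TF, MC, QA), results are most informative when reported per type. The snippet below is a minimal sketch of that bookkeeping, not the authors' evaluation code: the record keys `type`, `gold`, and `pred`, the `normalize` helper, the exact-match scoring rule, and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so 'True ' and 'true' compare equal."""
    return " ".join(answer.strip().lower().split())


def accuracy_by_type(records):
    """Return {question_type: accuracy}.

    Each record is a dict with hypothetical keys:
      'type' - question type label, e.g. 'TF', 'MC', or 'QA'
      'gold' - reference answer string
      'pred' - model answer string
    Scoring is exact match after normalization (an assumption).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        total[record["type"]] += 1
        if normalize(record["pred"]) == normalize(record["gold"]):
            correct[record["type"]] += 1
    return {qtype: correct[qtype] / total[qtype] for qtype in total}


# Made-up toy records, not taken from the benchmark.
records = [
    {"type": "TF", "gold": "True", "pred": "true"},
    {"type": "TF", "gold": "False", "pred": "True"},
    {"type": "MC", "gold": "B", "pred": " b "},
    {"type": "QA", "gold": "2019", "pred": "2018"},
]
scores = accuracy_by_type(records)
```

Exact match is a reasonable default for TF and MC answers; free-form QA answers would likely need a more tolerant comparison, which this sketch does not attempt.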

Code & Models

Datasets

Tizzzzy/FinChart-Bench
dataset · 215 downloads