FCMBench: The First Large-scale Financial Credit Multimodal Benchmark for Real-world Applications
Yehui Yang, Dalu Yang, Fangxin Shang, Wenshuo Zhou, Jie Ren, Yifan Liu, Haojun Fei, Qing Yang, Yanwu Xu, Tao Chen

TL;DR
FCMBench is a comprehensive, privacy-preserving multimodal benchmark for real-world financial credit applications. It evaluates the perception, reasoning, and robustness of vision-language models under real-world challenges.
Contribution
This paper introduces FCMBench, the first large-scale, privacy-compliant multimodal benchmark built specifically for financial credit applications, and uses it to evaluate state-of-the-art models extensively.
Findings
Gemini 3 Pro achieves an F1 score of 65.16
Kimi-K2.5 achieves an F1 score of 60.58
Model performance degrades under the robustness challenges
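The findings above are reported as F1 scores, but this summary does not say how the benchmark matches predictions to ground truth. As a minimal sketch only, the snippet below computes a set-based F1 over extracted (field, value) pairs. The function name `f1_score`, the field names, and the sample values are hypothetical illustrations, and the paper's actual scoring protocol may differ.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Set-based F1 in [0, 1]; returns 0.0 when there is no overlap."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical (field, value) pairs for one certificate image.
gold_fields = {
    ("name", "Li Wei"),
    ("id_number", "1234"),
    ("issue_date", "2020-01-01"),
}
predicted_fields = {
    ("name", "Li Wei"),
    ("id_number", "1234"),
    ("issue_date", "2020-01-07"),  # wrong value counts as a miss
}

score = f1_score(predicted_fields, gold_fields)
print(f"F1 = {100 * score:.2f}")
```

Here two of the three predicted pairs match, so precision and recall are both 2/3. Scaling by 100 puts the result on the same 0 to 100 scale as the scores listed above.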
Abstract
FCMBench is the first large-scale and privacy-compliant multimodal benchmark for real-world financial credit applications, covering tasks and robustness challenges drawn from domain-specific workflows and constraints. The current version of FCMBench covers 26 certificate types, with 5198 privacy-compliant images and 13806 paired VQA samples. It evaluates models on Perception and Reasoning tasks under real-world Robustness interferences, including 3 foundational perception tasks, 4 credit-specific reasoning tasks demanding decision-oriented visual evidence interpretation, and 10 real-world challenges for rigorous robustness stress testing. Moreover, FCMBench offers privacy-compliant realism with minimal leakage risk through in-house, scenario-aware captures of manually synthesized templates, without any publicly released images. We conduct extensive evaluations of 28 state-of-the-art…
Taxonomy
Topics: Financial Distress and Bankruptcy Prediction · Explainable Artificial Intelligence (XAI) · Imbalanced Data Classification Techniques
