FIRE: A Comprehensive Benchmark for Financial Intelligence and Reasoning Evaluation
Xiyuan Zhang, Huihang Wu, Jiayu Guo, Zhenlin Zhang, Yiwei Zhang, Liangyu Huo, Xiaoxiao Ma, Jiansong Wan, Xuewei Jiao, Yi Jing, Jian Xie

TL;DR
FIRE is a comprehensive benchmark that evaluates large language models' financial knowledge and reasoning abilities through theoretical exam questions and practical financial scenarios, with the aim of supporting future research.
Contribution
Introduces FIRE, a new benchmark combining theoretical financial exams and real-world scenarios to assess LLMs' financial intelligence and reasoning capabilities.
Findings
State-of-the-art LLMs show varying performance across financial tasks.
The benchmark reveals current limitations in LLMs' financial reasoning.
Our financial-domain model XuanYuan 4.0 performs strongly on the benchmark.
Abstract
We introduce FIRE, a comprehensive benchmark designed to evaluate both the theoretical financial knowledge of LLMs and their ability to handle practical business scenarios. For theoretical assessment, we curate a diverse set of examination questions drawn from widely recognized financial qualification exams, enabling evaluation of LLMs' deep understanding and application of financial knowledge. In addition, to assess the practical value of LLMs in real-world financial tasks, we propose a systematic evaluation matrix that categorizes complex financial domains and ensures coverage of essential subdomains and business activities. Based on this evaluation matrix, we collect 3,000 financial scenario questions, consisting of closed-form decision questions with reference answers and open-ended questions evaluated by predefined rubrics. We conduct comprehensive evaluations of state-of-the-art…
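The abstract distinguishes closed-form questions scored against reference answers from open-ended questions scored against predefined rubrics, organized by subdomain. The following is a minimal sketch of how such a two-track scoring scheme could be implemented. It is not the authors' code: the class names, the exact-match rule, the rubric weights, and the per-subdomain averaging are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ClosedQuestion:
    """Hypothetical closed-form item with a single reference answer."""
    subdomain: str
    reference: str


@dataclass
class OpenQuestion:
    """Hypothetical open-ended item; rubric maps criterion -> max points."""
    subdomain: str
    rubric: dict[str, float]


def score_closed(question: ClosedQuestion, answer: str) -> float:
    # Assumption: case-insensitive exact match after trimming whitespace.
    same = answer.strip().lower() == question.reference.strip().lower()
    return 1.0 if same else 0.0


def score_open(question: OpenQuestion, awarded: dict[str, float]) -> float:
    # Assumption: a grader (human or model) supplies points per criterion.
    # Points are clipped to [0, max] and normalized to the range [0, 1].
    total = sum(question.rubric.values())
    if total == 0:
        return 0.0
    earned = sum(
        min(max(awarded.get(criterion, 0.0), 0.0), max_points)
        for criterion, max_points in question.rubric.items()
    )
    return earned / total


def aggregate(scores: list[tuple[str, float]]) -> dict[str, float]:
    # Average the normalized scores within each subdomain.
    grouped: dict[str, list[float]] = {}
    for subdomain, score in scores:
        grouped.setdefault(subdomain, []).append(score)
    return {name: sum(vals) / len(vals) for name, vals in grouped.items()}
```

Normalizing both tracks to the same [0, 1] range is one way to let closed-form and open-ended results be averaged together per subdomain; the benchmark itself may report them separately.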
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Financial Distress and Bankruptcy Prediction · Auditing, Earnings Management, Governance
