FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use
Jiaxuan Lu, Kong Wang, Yemin Wang, Qingmei Tang, Hongwei Zeng, Xiang Chen, Jiahao Pi, Shujian Deng, Lingzhi Chen, Yi Fu, Kehua Yang, Xiao Sun

TL;DR
FinToolBench is a real-world, runnable benchmark for evaluating LLM agents on financial tool use. It emphasizes realistic tool interaction, compliance, and timeliness, with the aim of advancing trustworthy AI in finance.
Contribution
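The paper presents what the authors describe as the first real-world, executable benchmark for financial tool use, together with a multi-dimensional evaluation framework and a finance-aware tool retrieval baseline. It targets the gap between financial evaluations that focus on static text or document-based QA and general tool benchmarks that include few financial APIs.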
Findings
Established a realistic ecosystem with 760 financial tools.
Developed a multi-dimensional evaluation framework.
Proposed a finance-aware tool retrieval baseline.
Abstract
The integration of Large Language Models (LLMs) into the financial domain is driving a paradigm shift from passive information retrieval to dynamic, agentic interaction. While general-purpose tool learning has witnessed a surge in benchmarks, the financial sector, characterized by high stakes, strict compliance, and rapid data volatility, remains critically underserved. Existing financial evaluations predominantly focus on static textual analysis or document-based QA, ignoring the complex reality of tool execution. Conversely, general tool benchmarks lack the domain-specific rigor required for finance, often relying on toy environments or a negligible number of financial APIs. To bridge this gap, we introduce FinToolBench, the first real-world, runnable benchmark dedicated to evaluating financial tool learning agents. Unlike prior works limited to a handful of mock tools, FinToolBench…
Taxonomy
Topics: Stock Market Forecasting Methods · FinTech, Crowdfunding, Digital Finance · Financial Reporting and XBRL
