FinTextQA: A Dataset for Long-form Financial Question Answering
Jian Chen, Peilin Zhou, Yining Hua, Yingxin Loh, Kehui Chen, Ziyuan Li, Bing Zhu, Junwei Liang

TL;DR
FinTextQA introduces a new comprehensive dataset for long-form financial question answering and evaluates retrieval-augmented generation systems, highlighting their performance and robustness under noisy conditions.
Contribution
This paper presents FinTextQA, a novel dataset for financial LFQA, and benchmarks a RAG-based system, demonstrating its effectiveness and resilience to noise.
Findings
Baichuan2-7B performs comparably to GPT-3.5-turbo in accuracy.
Optimal system configuration includes Ada2, Automated Merged Retrieval, Bge-Reranker-Base, and Baichuan2-7B.
Models become less affected by noise when context length exceeds a certain threshold.
Abstract
Accurate evaluation of financial question answering (QA) systems necessitates a comprehensive dataset encompassing diverse question types and contexts. However, current financial QA datasets lack scope diversity and question complexity. This work introduces FinTextQA, a novel dataset for long-form question answering (LFQA) in finance. FinTextQA comprises 1,262 high-quality, source-attributed QA pairs extracted and selected from finance textbooks and government agency websites. Moreover, we developed a Retrieval-Augmented Generation (RAG)-based LFQA system, comprising an embedder, retriever, reranker, and generator. A multi-faceted evaluation approach, including human ranking, automatic metrics, and GPT-4 scoring, was employed to benchmark the performance of different LFQA system configurations under heightened noisy conditions. The results indicate that: (1) Among all compared…
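The four-stage pipeline named in the abstract (embedder, retriever, reranker, generator) can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the function names are hypothetical, a bag-of-words counter stands in for a real embedding model such as Ada2, a term-coverage score stands in for a learned reranker such as Bge-Reranker-Base, and the generator only assembles the prompt that an LLM such as Baichuan2-7B would receive.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedder (assumption): bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=3):
    """Retriever: keep the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def rerank(question, candidates, k=2):
    """Toy reranker (assumption): order by share of question terms covered."""
    q_terms = set(embed(question))

    def coverage(chunk):
        if not q_terms:
            return 0.0
        return len(q_terms & set(embed(chunk))) / len(q_terms)

    return sorted(candidates, key=coverage, reverse=True)[:k]


def generate(question, context):
    """Placeholder generator: builds the prompt an LLM would be given."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


def answer(question, chunks):
    """Run the stages in order: retrieve, rerank, then generate."""
    return generate(question, rerank(question, retrieve(question, chunks)))
```

Calling `answer(question, chunks)` with a list of text chunks returns a prompt that contains only the chunks surviving both the retrieval and reranking cuts. Irrelevant chunks mixed into the list play the role of the noise that the evaluation stresses.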
Taxonomy
Topics: Stock Market Forecasting Methods · Financial Reporting and XBRL · FinTech, Crowdfunding, Digital Finance
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Absolute Position Encodings · Byte Pair Encoding
