Synthetic Data-Driven Prompt Tuning for Financial QA over Tables and Documents

Yaoning Yu; Kai-Min Chang; Ye Yu; Kai Wei; Haojing Luo; Haohan Wang

arXiv:2511.06292 · cs.AI · November 17, 2025

TL;DR

This paper presents a self-improving, synthetic data-driven prompt tuning framework that enhances financial reasoning capabilities of large language models on tables and documents without external labels.

Contribution

It introduces a closed-loop system combining synthetic data generation, verification, and prompt optimization to improve financial question-answering performance.

Findings

01 Achieves higher accuracy on the DocMath-Eval benchmark

02 Improves robustness of financial reasoning prompts

03 Reduces reliance on manually labeled datasets

Abstract

Financial documents such as earnings reports and balance sheets often involve long tables and multi-page reports. Large language models have become a new tool for numerical reasoning over these documents and for understanding them. However, prompt quality can have a major effect on how well LLMs perform these financial reasoning tasks. Most current methods either tune prompts on fixed datasets of financial text or tabular data, which limits their ability to adapt to new question types or document structures, or rely on costly, manually labeled or curated datasets to build the prompts. We introduce a self-improving prompt framework driven by data-augmented optimization. In this closed-loop process, we generate synthetic financial tables and document excerpts, verify their correctness and robustness, and then update the prompt based on the results. Specifically, our framework combines a synthetic data…
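The closed loop described in the abstract (generate synthetic examples, verify them, update the prompt) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated before it names the framework's components, so every function name here (`generate_example`, `verify_example`, `accuracy`, `tune_prompt`, `toy_model`) is hypothetical, the "model" is a deterministic stub rather than an LLM, and the greedy keep-if-better update rule is an assumption chosen for simplicity.

```python
import random
from typing import Any, Callable, Dict, List

Example = Dict[str, Any]
Model = Callable[[str, Example], float]

def generate_example(rng: random.Random) -> Example:
    """Hypothetical generator: a two-cell table plus a percent-change question."""
    prior = rng.randint(100, 900)
    current = rng.randint(100, 900)
    table = {"revenue_prior": prior, "revenue_current": current}
    answer = round((current - prior) / prior * 100, 2)
    return {"table": table, "question": "Percent change in revenue?", "answer": answer}

def verify_example(ex: Example) -> bool:
    """Recompute the label from the table and reject inconsistent examples."""
    t = ex["table"]
    expected = round((t["revenue_current"] - t["revenue_prior"]) / t["revenue_prior"] * 100, 2)
    return abs(expected - ex["answer"]) < 1e-9

def accuracy(prompt: str, model: Model, data: List[Example]) -> float:
    """Fraction of examples the model answers within a 0.01 tolerance."""
    correct = sum(1 for ex in data if abs(model(prompt, ex) - ex["answer"]) < 0.01)
    return correct / len(data)

def tune_prompt(seed_prompt: str, candidate_edits: List[str], model: Model,
                rounds: int = 3, batch_size: int = 20, seed: int = 0) -> str:
    """Assumed update rule: keep an edit only if it strictly improves accuracy."""
    rng = random.Random(seed)
    best = seed_prompt
    for _ in range(rounds):
        # Fresh synthetic batch each round; only verified examples are used.
        batch = (generate_example(rng) for _ in range(batch_size))
        data = [ex for ex in batch if verify_example(ex)]
        best_score = accuracy(best, model, data)
        for edit in candidate_edits:
            candidate = best + " " + edit
            score = accuracy(candidate, model, data)
            if score > best_score:
                best, best_score = candidate, score
    return best

def toy_model(prompt: str, ex: Example) -> float:
    """Stub standing in for an LLM: uses the right denominator only when told to."""
    t = ex["table"]
    diff = t["revenue_current"] - t["revenue_prior"]
    told = "prior-period value" in prompt
    base = t["revenue_prior"] if told else t["revenue_current"]
    return round(diff / base * 100, 2)

tuned = tune_prompt(
    "Answer the question from the table.",
    ["Show your work.", "Divide by the prior-period value."],
    toy_model,
)
print(tuned)
```

In this sketch the instruction that fixes the stub's denominator error is kept, while the edit that does not change accuracy is discarded. A real system would replace `toy_model` with calls to an LLM and would likely use richer generators and verifiers than the single percent-change template assumed here.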

Taxonomy

Topics: Topic Modeling · Stock Market Forecasting Methods · Advanced Text Analysis Techniques