Utilizing Training Data to Improve LLM Reasoning for Tabular Understanding

Chufan Gao; Jintai Chen; Jimeng Sun

arXiv:2508.18676·cs.LG·August 27, 2025


TL;DR

This paper introduces LRTab, a novel prompting-based method that combines training data insights with retrieval techniques to enhance large language model reasoning for tabular data understanding, outperforming previous methods.

Contribution

LRTab integrates training data insights with retrieval to improve LLM reasoning in tabular tasks, balancing generalizability and dataset-specific learning.

Findings

01

LRTab outperforms previous baselines on the WikiTQ and TabFact datasets.

02

LRTab is interpretable and cost-efficient.

03

Retrieving relevant prompt conditions enhances reasoning accuracy.

Abstract

Automated tabular understanding and reasoning are essential tasks for data scientists. Recently, large language models (LLMs) have become increasingly prevalent in tabular reasoning tasks. Previous work focuses on (1) finetuning LLMs using labeled data or (2) training-free prompting of LLM agents using chain-of-thought (CoT). Finetuning offers dataset-specific learning at the cost of generalizability. Training-free prompting is highly generalizable but does not take full advantage of training data. In this paper, we propose a novel prompting-based reasoning approach, Learn then Retrieve: LRTab, which integrates the benefits of both by retrieving relevant information learned from training data. We first use prompting to obtain CoT responses over the training data. For incorrect CoTs, we prompt the LLM to predict Prompt Conditions to avoid the error, learning insights from the data. We…
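
The learn-then-retrieve loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The `Example` type, the `LLM` callable, the prompt wording, and every function name are hypothetical. The abstract is truncated before it describes inference, so the retrieval step is an assumption based on the summary's statement that relevant Prompt Conditions are retrieved; plain token overlap stands in for whatever retriever the paper actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    table: str      # table serialized as text (format is an assumption)
    question: str
    answer: str     # gold answer from the training set


# Hypothetical LLM interface: prompt string in, completion string out.
LLM = Callable[[str], str]


def learn_conditions(train: List[Example], llm: LLM) -> List[Tuple[str, str]]:
    """Build a bank of (question, condition) pairs from incorrect CoT answers."""
    bank = []
    for ex in train:
        cot = llm(
            f"Table:\n{ex.table}\nQuestion: {ex.question}\n"
            "Think step by step, then finish with 'Answer: <answer>'."
        )
        # Take whatever follows the last 'Answer:' as the prediction.
        predicted = cot.rsplit("Answer:", 1)[-1].strip()
        if predicted.lower() != ex.answer.strip().lower():
            # Only wrong answers trigger a request for a Prompt Condition.
            condition = llm(
                f"This reasoning reached a wrong answer:\n{cot}\n"
                f"Correct answer: {ex.answer}\n"
                "State one condition that would avoid this error."
            )
            bank.append((ex.question, condition.strip()))
    return bank


def _tokens(text: str) -> set:
    return set(text.lower().split())


def retrieve(bank: List[Tuple[str, str]], question: str, k: int = 2) -> List[str]:
    """Return the k conditions whose source questions best overlap the query.

    Jaccard overlap on whitespace tokens is a placeholder retriever.
    """
    query = _tokens(question)

    def score(item: Tuple[str, str]) -> float:
        other = _tokens(item[0])
        union = query | other
        return len(query & other) / len(union) if union else 0.0

    ranked = sorted(bank, key=score, reverse=True)
    return [condition for _, condition in ranked[:k]]


def answer_with_conditions(
    table: str, question: str, bank: List[Tuple[str, str]], llm: LLM, k: int = 2
) -> str:
    """Prepend retrieved conditions to the prompt for a new question."""
    conditions = "\n".join(f"- {c}" for c in retrieve(bank, question, k))
    return llm(
        f"Conditions to keep in mind:\n{conditions}\n"
        f"Table:\n{table}\nQuestion: {question}\n"
        "Think step by step, then finish with 'Answer: <answer>'."
    )
```

Because the bank stores plain-text conditions rather than model weights, it can be inspected and edited by hand, which is one way to read the page's claim that the method is interpretable.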
