TL;DR
This paper studies inference-time scaling for table reasoning. Two post-training strategies, distillation and reinforcement learning, enable that scaling and let a 7B-parameter model perform comparably to larger models on diverse table reasoning tasks.
Contribution
It presents the first study of inference-time scaling for table reasoning and develops two post-training strategies, distillation from frontier-model reasoning traces and reinforcement learning with verifiable rewards (RLVR), to improve small-model performance.
Findings
Table-R1-Zero matches or exceeds the performance of GPT-4.1 and DeepSeek-R1 while using only a 7B-parameter LLM.
Models demonstrate strong out-of-domain generalization.
Instruction tuning and architecture choices enhance reasoning skills.
Abstract
In this work, we present the first study to explore inference-time scaling on table reasoning tasks. We develop and evaluate two post-training strategies to enable inference-time scaling: distillation from frontier model reasoning traces and reinforcement learning with verifiable rewards (RLVR). For distillation, we introduce a large-scale dataset of reasoning traces generated by DeepSeek-R1, which we use to fine-tune LLMs into the Table-R1-SFT model. For RLVR, we propose task-specific verifiable reward functions and apply the GRPO algorithm to obtain the Table-R1-Zero model. We evaluate our Table-R1-series models across diverse table reasoning tasks, including short-form QA, fact verification, and free-form QA. Notably, the Table-R1-Zero model matches or exceeds the performance of GPT-4.1 and DeepSeek-R1, while using only a 7B-parameter LLM. It also demonstrates strong generalization…
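As a rough illustration of the RLVR recipe described in the abstract, the sketch below pairs a verifiable reward for short-form QA with the group-relative advantage normalization commonly associated with GRPO. It is not the paper's implementation: the `<answer>` tag format, the normalization rules, and all function names are assumptions made for this example.

```python
import re
import statistics
from typing import List, Optional


def normalize(text: str) -> str:
    """Lowercase, drop punctuation other than '.', and collapse whitespace."""
    text = re.sub(r"[^\w\s.]", "", text.lower())
    return " ".join(text.split())


def extract_answer(response: str) -> Optional[str]:
    """Return the text inside <answer>...</answer>, or None if it is absent.

    The tag format is hypothetical; the source does not specify one.
    """
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    return match.group(1).strip() if match else None


def exact_match_reward(response: str, gold: str) -> float:
    """Verifiable reward: 1.0 if the extracted answer matches gold, else 0.0."""
    predicted = extract_answer(response)
    if predicted is None:
        return 0.0  # malformed output earns no reward
    return 1.0 if normalize(predicted) == normalize(gold) else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Standardize rewards within one group of responses to the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled responses to one table question whose gold answer is "42".
    responses = [
        "<think>sum the column</think><answer>42</answer>",
        "<answer> 41 </answer>",
        "The answer is 42",  # missing tags, so it scores 0.0
        "<answer>42</answer>",
    ]
    rewards = [exact_match_reward(r, "42") for r in responses]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Because the reward is computed by a deterministic check against a gold answer, no learned reward model is needed. Fact verification and free-form QA would need different checks (for example, label matching or an overlap metric), which this sketch does not cover.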
Taxonomy
Methods: Attention Is All You Need · Linear Layer · Adam · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Multi-Head Attention · Byte Pair Encoding
