Table-R1: Region-based Reinforcement Learning for Table Understanding
Zhenhe Wu, Jian Yang, Zhongjiang He, Changzai Pan, Jie Zhang, Jiaheng Liu, Xianjie Wu, Yu Zhao, Shuangyong Song, Yongxiang Li, Zhoujun Li, Xueling Li

TL;DR
This paper introduces Table-R1, a reinforcement learning method that improves large language models' understanding of tables by integrating region evidence, leading to significant performance gains in table question answering.
Contribution
The paper presents a region-based reinforcement learning approach that combines Region-Enhanced Supervised Fine-Tuning (RE-SFT) with Table-Aware Group Relative Policy Optimization (TARPO) to improve table reasoning in LLMs.
Findings
Achieves an average performance improvement of 14.36 points on benchmark datasets.
Outperforms baseline models with ten times as many parameters.
Reduces response token consumption by 67.5% with TARPO.
Abstract
Tables present unique challenges for language models due to their structured row-column interactions, necessitating specialized approaches for effective comprehension. While large language models (LLMs) have demonstrated potential in table reasoning through prompting and techniques like chain-of-thought (CoT) and program-of-thought (PoT), optimizing their performance for table question answering remains underexplored. In this paper, we introduce region-based Table-R1, a novel reinforcement learning approach that enhances LLM table understanding by integrating region evidence into reasoning steps. Our method employs Region-Enhanced Supervised Fine-Tuning (RE-SFT) to guide models in identifying relevant table regions before generating answers, incorporating textual, symbolic, and program-based reasoning. Additionally, Table-Aware Group Relative Policy Optimization (TARPO) introduces a…
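For readers unfamiliar with the Group Relative Policy Optimization family that TARPO's name refers to, the group-relative advantage can be sketched roughly as follows. This is a minimal illustration of generic GRPO-style normalization, not the paper's TARPO, whose table-aware components are not described in the text above. The function name, the example rewards, the `eps` value, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group (GRPO-style).

    `rewards` holds one scalar reward per response sampled for the same
    prompt. The population standard deviation and `eps` are assumptions;
    implementations differ on both.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one table question:
# 1.0 for a correct answer, 0.0 for an incorrect one.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above their group's mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to provide a baseline.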
