InstructTable: Improving Table Structure Recognition Through Instructions
Boming Chen, Zining Wang, Zhentao Guo, Jianqiang Liu, Chen Duan, Yu Gu, Kai Zhou, Pengfei Yan

TL;DR
InstructTable introduces an instruction-guided training framework for table structure recognition, combining semantic instructions and synthetic data to improve accuracy on complex tables.
Contribution
The paper proposes an instruction-guided multi-stage training framework and a synthetic data generation method to improve table structure recognition (TSR) performance.
Findings
InstructTable achieves state-of-the-art results on multiple datasets.
The synthetic BCDSTab benchmark effectively evaluates complex table recognition.
Instruction-guided training improves understanding of complex table structures.
Abstract
Table structure recognition (TSR) holds widespread practical importance by parsing tabular images into structured representations, yet encounters significant challenges when processing complex layouts involving merged or empty cells. Traditional visual-centric models rely exclusively on visual information while lacking crucial semantic support, thereby impeding accurate structural recognition in complex scenarios. Vision-language models leverage contextual semantics to enhance comprehension; however, these approaches underemphasize the modeling of visual structural information. To address these limitations, this paper introduces InstructTable, an instruction-guided multi-stage training TSR framework. Meticulously designed table instruction pre-training directs attention toward fine-grained structural patterns, enhancing comprehension of complex tables. Complementary TSR fine-tuning…
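To make "structured representation" concrete, the sketch below shows one common way a recognized table with merged and empty cells can be expanded onto a rectangular grid. It is an illustration only, not the paper's method or output format: the `Cell` class, the `expand_to_grid` function, and the sample table are all hypothetical, and the row/column span convention is assumed to follow HTML-style `rowspan`/`colspan` semantics.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical logical cell: text plus HTML-style spans."""
    text: str
    rowspan: int = 1
    colspan: int = 1


def expand_to_grid(rows):
    """Expand rows of logical cells into a rectangular grid of cell texts.

    A merged cell fills every grid position it spans; positions that no
    cell covers are returned as None.
    """
    occupied = {}  # (row, col) -> owning Cell
    for r, row in enumerate(rows):
        c = 0
        for cell in row:
            # Skip positions already claimed by a rowspan from an earlier row.
            while (r, c) in occupied:
                c += 1
            for dr in range(cell.rowspan):
                for dc in range(cell.colspan):
                    occupied[(r + dr, c + dc)] = cell
            c += cell.colspan
    if not occupied:
        return []
    n_rows = max(r for r, _ in occupied) + 1
    n_cols = max(c for _, c in occupied) + 1
    return [
        [occupied[(r, c)].text if (r, c) in occupied else None
         for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Assumed sample: a two-row header with merged cells and one empty body cell.
sample = [
    [Cell("Region", rowspan=2), Cell("Sales", colspan=2)],
    [Cell("Q1"), Cell("Q2")],
    [Cell("North"), Cell(""), Cell("12")],
]
grid = expand_to_grid(sample)
```

In this sample the "Region" cell occupies the first column of both header rows, and the empty cell still holds its own grid position. Merged and empty cells of this kind are the layouts the abstract identifies as difficult.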
