How Do Language Models Understand Tables? A Mechanistic Analysis of Cell Location
Xuanliang Zhang, Dingzirui Wang, Keyan Xu, Qingfu Zhu, Wanxiang Che

TL;DR
This paper investigates how large language models understand tables by dissecting the atomic task of cell location. It identifies a three-stage pipeline (Semantic Binding, Coordinate Localization, Information Extraction) and examines how models encode column indices and generalize to multi-cell tasks.
Contribution
The study provides a mechanistic analysis of table understanding in LLMs: it decomposes cell location into a three-stage pipeline and shows how models encode column indices and generalize to multi-cell location tasks.
Findings
Models locate cells via ordinal counting of delimiters.
Column indices are encoded in a linear subspace enabling vector arithmetic.
Models generalize to multi-cell tasks by multiplexing attention heads.
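
To make the first finding concrete, here is a minimal sketch of what "ordinal counting of delimiters" means on a linearized table. It is plain Python that illustrates the counting idea only; it is not the model's internal computation and not the authors' code. The pipe-and-newline table format and the function name `locate_cell` are assumptions made for this example.

```python
def locate_cell(linearized, row, col, col_sep="|", row_sep="\n"):
    """Return the cell at (row, col), both 0-indexed, by counting delimiters.

    ASSUMPTION: the table is linearized with one row per line and cells
    separated by `col_sep`; the source does not specify a format.
    """
    rows_seen = 0   # number of row delimiters passed so far
    cols_seen = 0   # number of column delimiters passed in the current row
    chars = []
    for ch in linearized:
        if ch == row_sep:
            rows_seen += 1
            cols_seen = 0          # column count restarts on every new row
        elif ch == col_sep:
            cols_seen += 1
        elif rows_seen == row and cols_seen == col:
            chars.append(ch)       # inside the target cell: keep the character
    return "".join(chars).strip()


# Hypothetical example table (header row is row 0).
table = "name|age|city\nAda|36|London\nAlan|41|Manchester"
print(locate_cell(table, 1, 2))
print(locate_cell(table, 2, 0))
```

The coordinates are resolved purely from how many delimiters have been seen, never from the cell contents, which is the sense of "ordinal" used in the finding.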
Abstract
While Large Language Models (LLMs) are increasingly deployed for table-related tasks, the internal mechanisms enabling them to process linearized two-dimensional structured tables remain opaque. In this work, we investigate the process of table understanding by dissecting the atomic task of cell location. Through activation patching and complementary interpretability techniques, we delineate the table understanding mechanism into a sequential three-stage pipeline: Semantic Binding, Coordinate Localization, and Information Extraction. We demonstrate that models locate the target cell via an ordinal mechanism that counts discrete delimiters to resolve coordinates. Furthermore, column indices are encoded within a linear subspace that allows for precise steering of model focus through vector arithmetic. Finally, we reveal that models generalize to multi-cell location tasks by multiplexing…
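
The claim that column indices live in a linear subspace that supports vector arithmetic can be illustrated with a toy example. The sketch below uses synthetic vectors, not real model activations, and every name in it (`TRUE_DIRECTION`, `CONTENT`, `decode_column`, `steer`) is hypothetical. It assumes the simplest possible case: the column index is encoded as a multiple of a single direction, with cell content stored orthogonally to it.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(a, b, scale):
    """Return a + scale * b, elementwise."""
    return [x + scale * y for x, y in zip(a, b)]

def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# ASSUMPTION: synthetic stand-ins for hidden states. Column index k is
# encoded as k * TRUE_DIRECTION; CONTENT vectors are orthogonal to it.
TRUE_DIRECTION = [0.5, 0.5, -0.5, 0.5]
CONTENT = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 2.0, 2.0], [-3.0, 3.0, 1.0, 1.0]]
activations = {k: [axpy(c, TRUE_DIRECTION, k) for c in CONTENT] for k in range(4)}

# Estimate the per-column step as the average difference of consecutive class means.
means = [mean(activations[k]) for k in range(4)]
step = mean([axpy(means[k + 1], means[k], -1.0) for k in range(3)])

def decode_column(h, step):
    """Read the column index off a vector by projecting onto the step direction."""
    return round(dot(h, step) / dot(step, step))

def steer(h, step, source_col, target_col):
    """Shift a vector from source_col to target_col by adding multiples of step."""
    return axpy(h, step, target_col - source_col)

h = activations[1][2]
print(decode_column(h, step))                    # column read from the original vector
print(decode_column(steer(h, step, 1, 3), step))  # column read after steering
```

In a real model the direction would have to be estimated from recorded activations and the steered vector written back into the forward pass; this sketch covers only the arithmetic.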
Taxonomy
Topics: Cell Image Analysis Techniques · Ferroelectric and Negative Capacitance Devices · Topic Modeling
