Orthogonal Hierarchical Decomposition for Structure-Aware Table Understanding with Large Language Models
Bin Cao, Huixian Lu, Chenwen Ma, Ting Wang, Ruizhe Li, Jing Fan

TL;DR
This paper introduces Orthogonal Hierarchical Decomposition (OHD), a method for representing complex, hierarchically structured tables so that large language models can understand and reason over them more reliably. It outperforms existing approaches on the AITQA and HiTab benchmarks.
Contribution
The paper proposes the OHD framework with Orthogonal Tree Induction and a dual-pathway association protocol to explicitly capture hierarchical dependencies in complex tables for LLMs.
Findings
OHD outperforms existing methods on AITQA and HiTab benchmarks.
The orthogonal tree decomposition effectively captures hierarchical table structures.
Semantic alignment improves question answering accuracy.
Abstract
Complex tables with multi-level headers, merged cells, and heterogeneous layouts pose persistent challenges for LLMs in both understanding and reasoning. Existing approaches typically rely on table linearization or normalized grid modeling. However, these representations struggle to explicitly capture hierarchical structures and cross-dimensional dependencies, which can lead to misalignment between structural semantics and textual representations for non-standard tables. To address this issue, we propose an Orthogonal Hierarchical Decomposition (OHD) framework that constructs structure-preserving input representations of complex tables for LLMs. OHD introduces an Orthogonal Tree Induction (OTI) method based on spatial-semantic co-constraints, which decomposes irregular tables into a column tree and a row tree to capture vertical and horizontal hierarchical dependencies, respectively.…
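To make the column-tree and row-tree idea concrete, here is a minimal Python sketch. It is not the paper's OTI method: it assumes the header hierarchy has already been recovered as root-to-leaf label paths, and it ignores the spatial-semantic co-constraints the abstract mentions. The toy table, the function names `build_tree` and `leaves`, and the `cells` lookup are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A header tree as nested dicts: label -> subtree (empty dict marks a leaf).
Tree = Dict[str, "Tree"]


def build_tree(paths: List[Tuple[str, ...]]) -> Tree:
    """Merge root-to-leaf header paths into one nested-dict tree."""
    root: Tree = {}
    for path in paths:
        node = root
        for label in path:
            node = node.setdefault(label, {})
    return root


def leaves(tree: Tree, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return all root-to-leaf paths in insertion order."""
    out: List[Tuple[str, ...]] = []
    for label, child in tree.items():
        path = prefix + (label,)
        if child:
            out.extend(leaves(child, path))
        else:
            out.append(path)
    return out


# Hypothetical toy table with two-level column and row headers.
column_paths = [("Revenue", "2022"), ("Revenue", "2023"), ("Cost", "2023")]
row_paths = [("Europe", "Germany"), ("Europe", "France"), ("Asia", "Japan")]
values = [
    [10, 12, 7],   # Europe / Germany
    [8, 9, 6],     # Europe / France
    [15, 18, 11],  # Asia / Japan
]

# Two orthogonal trees: one for the vertical (column) hierarchy,
# one for the horizontal (row) hierarchy.
column_tree = build_tree(column_paths)
row_tree = build_tree(row_paths)

# Each data cell is addressed by a (row path, column path) pair.
cells = {
    (r, c): values[i][j]
    for i, r in enumerate(row_paths)
    for j, c in enumerate(column_paths)
}

print(cells[(("Europe", "France"), ("Revenue", "2023"))])
```

Addressing each cell by a pair of full header paths keeps the parent headers attached to every value, which is the kind of structural information that a flat linearization of the grid tends to lose.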
Taxonomy
Topics: Data Quality and Management · Data Visualization and Analytics · Machine Learning in Healthcare
