An Empirical Investigation of Robustness in Large Language Models under Tabular Distortions
Avik Dutta, Harshit Nigam, Hosein Hasanbeig, Arjun Radhakrishna, Sumit Gulwani

TL;DR
This paper examines the vulnerability of large language models to distortions in tabular data, revealing their limited ability to detect and correct such distortions without explicit prompts, and introduces a dataset to evaluate this issue.
Contribution
It provides an empirical analysis of LLM robustness to tabular distortions and introduces a curated dataset for systematic evaluation of error correction in table question answering.
Findings
LLMs' accuracy drops by at least 22% under tabular distortions
Models partially correct distortions only with explicit prompts
Even state-of-the-art models like GPT-5.2 are affected by distortions
Abstract
We investigate how large language models (LLMs) fail when tabular data in an otherwise canonical representation is subjected to semantic and structural distortions. Our findings reveal that LLMs lack an inherent ability to detect and correct subtle distortions in table representations. Only when provided with an explicit prior, via a system prompt, do models partially adjust their reasoning strategies and correct some distortions, though not consistently or completely. To study this phenomenon, we introduce a small, expert-curated dataset that explicitly evaluates LLMs on table question answering (TQA) tasks requiring an additional error-correction step prior to analysis. Our results reveal systematic differences in how LLMs ingest and interpret tabular information under distortion, with even SoTA models such as GPT-5.2 exhibiting an accuracy drop of at least 22% under distortion.…
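To make the idea of a structural distortion followed by an error-correction step concrete, here is a minimal sketch in Python. It is purely illustrative: the distortion shown (a header row demoted into the data), the toy table, and every function name are assumptions made for this example and are not taken from the paper's dataset or method.

```python
# Hypothetical illustration only: the distortion type, the table contents,
# and all function names below are assumptions, not the paper's dataset.

def to_text(header, rows):
    """Serialize a table into a simple pipe-delimited text form."""
    lines = [" | ".join(header)]
    lines += [" | ".join(str(cell) for cell in row) for row in rows]
    return "\n".join(lines)

def demote_header(header, rows):
    """Assumed structural distortion: the real header becomes the first
    data row and generic placeholder names take its place."""
    placeholder = [f"col{i}" for i in range(len(header))]
    return placeholder, [list(header)] + [list(r) for r in rows]

def restore_header(header, rows):
    """Error-correction step: if every header cell is a placeholder,
    promote the first data row back to the header."""
    if rows and all(h == f"col{i}" for i, h in enumerate(header)):
        return list(rows[0]), [list(r) for r in rows[1:]]
    return list(header), [list(r) for r in rows]

def column_sum(header, rows, name):
    """A toy table-question-answering query: sum a named numeric column."""
    idx = header.index(name)  # raises ValueError if the column name is missing
    return sum(int(r[idx]) for r in rows)

header = ["city", "sales"]
rows = [["Oslo", 10], ["Lima", 32]]

# Distort the table, then repair it before answering the question.
d_header, d_rows = demote_header(header, rows)
fixed_header, fixed_rows = restore_header(d_header, d_rows)
answer = column_sum(fixed_header, fixed_rows, "sales")
```

Running the query directly on the distorted table fails in this sketch, because the column name `sales` no longer appears in the header; the query only succeeds once the correction step has been applied first.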
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Computational and Text Analysis Methods
