An Empirical Investigation of Robustness in Large Language Models under Tabular Distortions

Avik Dutta; Harshit Nigam; Hosein Hasanbeig; Arjun Radhakrishna; Sumit Gulwani

arXiv:2601.05009 · cs.AI · January 9, 2026

TL;DR

This paper examines the vulnerability of large language models to distortions in tabular data, revealing their limited ability to detect and correct such distortions without explicit prompts, and introduces a dataset to evaluate this issue.

Contribution

It provides an empirical analysis of LLM robustness to tabular distortions and introduces a curated dataset for systematic evaluation of error correction in table question answering.

Findings

01. LLM accuracy drops by at least 22% under tabular distortions.

02. Models correct distortions only partially, and only when given an explicit prior via a system prompt.

03. Even state-of-the-art models such as GPT-5.2 are affected by distortions.

Abstract

We investigate how large language models (LLMs) fail when tabular data in an otherwise canonical representation is subjected to semantic and structural distortions. Our findings reveal that LLMs lack an inherent ability to detect and correct subtle distortions in table representations. Only when provided with an explicit prior, via a system prompt, do models partially adjust their reasoning strategies and correct some distortions, though not consistently or completely. To study this phenomenon, we introduce a small, expert-curated dataset that explicitly evaluates LLMs on table question answering (TQA) tasks requiring an additional error-correction step prior to analysis. Our results reveal systematic differences in how LLMs ingest and interpret tabular information under distortion, with even SoTA models such as GPT-5.2 exhibiting an accuracy drop of at least 22% under distortion.…
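To make the setup concrete, here is a minimal sketch of what distorting a canonical table before a TQA query might look like. The abstract does not say which distortions the dataset uses, so the two shown here (transposition as a structural distortion, rotated header labels as a semantic one) are hypothetical stand-ins, as are all function names and the toy table values. The abstract also does not state whether the 22% figure is absolute or relative; the helper below assumes absolute percentage points.

```python
from typing import List

Table = List[List[str]]  # first row is the header (canonical form assumed)


def transpose(table: Table) -> Table:
    """Hypothetical structural distortion: swap rows and columns."""
    return [list(col) for col in zip(*table)]


def shift_header(table: Table, offset: int = 1) -> Table:
    """Hypothetical semantic distortion: rotate the header labels so they
    no longer line up with the columns they describe."""
    header, *rows = table
    k = offset % len(header)
    rotated = header[-k:] + header[:-k] if k else header[:]
    return [rotated] + [row[:] for row in rows]


def serialize(table: Table) -> str:
    """Render the table as pipe-separated text, one row per line,
    as it might be pasted into a prompt."""
    return "\n".join(" | ".join(row) for row in table)


def accuracy_drop(clean_correct: int, distorted_correct: int, total: int) -> float:
    """Absolute accuracy drop in percentage points (an assumption about
    how the drop is measured)."""
    return 100.0 * (clean_correct - distorted_correct) / total


# Toy table; the values are illustrative only, not taken from the dataset.
table = [
    ["city", "population"],
    ["Oslo", "700000"],
    ["Bergen", "290000"],
]

if __name__ == "__main__":
    print(serialize(table))
    print()
    print(serialize(transpose(table)))
    print()
    print(serialize(shift_header(table)))
```

In this framing, a model would receive the serialized distorted table plus a question, and would have to notice and undo the distortion before answering. Comparing correct-answer counts on clean and distorted inputs with `accuracy_drop` gives one way to quantify the effect.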

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Computational and Text Analysis Methods