Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension Ability
Yujin Han, Lei Xu, Sirui Chen, Difan Zou, Chaochao Lu

TL;DR
This paper introduces a causal framework for evaluating whether large language models understand deep semantics or rely on surface cues. It finds that most models show some deep structure comprehension, with the degree varying by model type and size.
Contribution
The paper develops a novel causal mediation analysis approach with approximated direct and indirect effects to assess LLMs' deep and surface structure understanding, providing a more comprehensive evaluation method.
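As a rough illustration of how a direct and an indirect effect are separated in mediation analysis, the sketch below uses a toy linear model. It is a generic textbook decomposition under assumed functions, not the paper's approximated estimators: the names `surface`, `output`, and `mediation_effects` and all coefficients are hypothetical. Deep structure plays the role of the treatment and surface structure the role of the mediator.

```python
# Toy structural causal model (hypothetical, not taken from the paper):
#   deep structure d -> surface structure s -> output y, plus d -> y directly.

def surface(d):
    """Mediator: surface form induced by the deep structure (assumed linear)."""
    return 2.0 * d


def output(d, s):
    """Outcome: depends on the deep structure directly and on the surface form."""
    return 3.0 * d + 0.5 * s


def mediation_effects(d0, d1):
    """Return (direct, indirect, total) effects of changing d from d0 to d1."""
    baseline = output(d0, surface(d0))
    # Direct effect: change d in the outcome, hold the mediator at its d0 value.
    direct = output(d1, surface(d0)) - baseline
    # Indirect effect: keep d at d1 in the outcome, let only the mediator respond.
    indirect = output(d1, surface(d1)) - output(d1, surface(d0))
    # Total effect: change d everywhere.
    total = output(d1, surface(d1)) - baseline
    return direct, indirect, total


direct, indirect, total = mediation_effects(0.0, 1.0)
print(direct, indirect, total)
```

With this decomposition the direct and indirect effects always sum to the total effect, because the intermediate term `output(d1, surface(d0))` cancels. In practice an LLM's output is not a known function, so any such effects would have to be estimated from interventions on the input text.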
Findings
Most mainstream LLMs show deep structure comprehension, which increases along with task accuracy.
Closed-source LLMs rely more on deep structure, while open-source models are more surface-sensitive.
Deep structure comprehension increases with model scale.
Abstract
Large language models (LLMs) have shown remarkable capability in natural language tasks, yet debate persists on whether they truly comprehend deep structure (i.e., core semantics) or merely rely on surface structure (e.g., presentation format). Prior studies observe that LLMs' performance declines when intervening on surface structure, arguing their success relies on surface structure recognition. However, surface structure sensitivity does not prevent deep structure comprehension. Rigorously evaluating LLMs' capability requires analyzing both, yet deep structure is often overlooked. To this end, we assess LLMs' comprehension ability using causal mediation analysis, aiming to fully discover the capability of using both deep and surface structures. Specifically, we formulate the comprehension of deep structure as direct causal effect (DCE) and that of surface structure as indirect causal…
Taxonomy
Topics: Legal Education and Practice Innovations · Artificial Intelligence in Law
