When Models Ignore Definitions: Measuring Semantic Override Hallucinations in LLM Reasoning
Yogeswar Reddy Thota, Setareh Rafatirad, Homayoun Houman, Tooraj Nikoubin

TL;DR
This paper investigates how large language models often fail to correctly interpret locally redefined semantics in logic and circuit reasoning tasks, revealing a gap between their surface correctness and true understanding.
Contribution
The study introduces a benchmark for testing semantic override in LLMs and systematically analyzes their failures in local semantic redefinition scenarios.
Findings
LLMs frequently ignore local redefinitions in logic tasks
Models often make confident but incompatible assumptions
LLMs drop constraints even in simple formal reasoning settings
Abstract
Large language models (LLMs) demonstrate strong performance on standard digital logic and Boolean reasoning tasks, yet their reliability under locally redefined semantics remains poorly understood. In many formal settings, such as circuit specifications, examinations, and hardware documentation, operators and components are explicitly redefined within a narrow scope. Correct reasoning in these contexts requires models to temporarily suppress globally learned conventions in favor of prompt-local definitions. In this work, we study a systematic failure mode we term semantic override, in which an LLM reverts to its pretrained default interpretation of operators or gate behavior despite explicit redefinition in the prompt. We also identify a related class of errors, assumption injection, where models commit to unstated hardware semantics when critical details are underspecified, rather than…
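The semantic-override check described in the abstract can be illustrated with a small sketch. This is not the paper's benchmark or scoring code. The names DEFAULT_OPS, LOCAL_OPS, evaluate, and classify_answer are hypothetical, and the redefinition used here (AND locally behaving like OR) is an assumed example. The idea is to compute the answer under the prompt-local definition and under the conventional definition, then see which one a model's answer matches.

```python
from typing import Callable, Dict, Optional

BinaryOp = Callable[[int, int], int]

# Conventional gate semantics, standing in for a model's pretrained defaults.
DEFAULT_OPS: Dict[str, BinaryOp] = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

# Hypothetical prompt-local redefinition: "in this problem, AND behaves like OR".
LOCAL_OPS: Dict[str, BinaryOp] = {
    "AND": lambda a, b: a | b,
}


def evaluate(op: str, a: int, b: int,
             local_ops: Optional[Dict[str, BinaryOp]] = None) -> int:
    """Evaluate a gate, letting local definitions shadow the defaults."""
    ops = {**DEFAULT_OPS, **(local_ops or {})}
    return ops[op](a, b)


def classify_answer(op: str, a: int, b: int,
                    local_ops: Dict[str, BinaryOp],
                    model_answer: int) -> str:
    """Label a model's answer relative to local and default semantics."""
    expected = evaluate(op, a, b, local_ops)  # answer under the redefinition
    default = evaluate(op, a, b)              # answer under convention
    if model_answer == expected:
        return "correct"
    if model_answer == default:
        # The answer matches the conventional reading, not the local one.
        return "semantic_override"
    return "other_error"


if __name__ == "__main__":
    # Conventional AND(1, 0) is 0; under the local redefinition it is 1.
    print(classify_answer("AND", 1, 0, LOCAL_OPS, model_answer=0))
    print(classify_answer("AND", 1, 0, LOCAL_OPS, model_answer=1))
```

Only inputs where the local and default results differ are diagnostic. When both interpretations give the same value, the sketch labels a matching answer "correct" and cannot tell which definition the model used.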
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Formal Methods in Verification
