Code over Words: Overcoming Semantic Inertia via Code-Grounded Reasoning
Manjie Xu, Isabella Yin, Xinyi Tu, Chi Zhang, Yixin Zhu

TL;DR
This paper investigates why large language models struggle to override pre-trained beliefs when in-context rules change, and proposes a code-grounded approach that improves their ability to inhibit prior knowledge and reason under dynamic rules.
Contribution
It introduces Code-Grounded Vistas, a fine-tuning method that uses executable code representations to enhance models' capacity to override priors and reason with dynamic rules.
Findings
Larger models can perform worse than smaller ones when natural language reasoning requires suppressing pre-trained associations (inverse scaling).
Representing rules as executable code improves prior inhibition.
Code-grounded training outperforms inference-time search methods.
Abstract
LLMs struggle with Semantic Inertia: the inability to inhibit pre-trained priors (e.g., "Lava is Dangerous") when dynamic, in-context rules contradict them. We probe this phenomenon using Baba Is You, where physical laws are mutable text rules, enabling precise evaluation of models' ability to override learned priors when rules change. We quantitatively observe that larger models can exhibit inverse scaling: they perform worse than smaller models when natural language reasoning requires suppressing pre-trained associations (e.g., accepting "Lava is Safe"). Our analysis attributes this to natural language encoding, which entangles descriptive semantics and logical rules, leading to persistent hallucinations of familiar physics despite explicit contradictory rules. Here we show that representing dynamics as executable code, rather than descriptive text, reverses this trend and enables…
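To make the contrast between descriptive text and executable rules concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the `World` class, the `step_outcome` function, and the property names `dangerous` and `safe` are hypothetical, chosen only to mirror the "Lava is Safe" example above. The idea it tries to show is that once rules live in an explicit data structure, the outcome of an action depends on the active rule set and not on what the word "lava" usually implies.

```python
from dataclasses import dataclass, field


@dataclass
class World:
    """Hypothetical rule store: maps a noun to its currently active properties."""
    rules: dict = field(default_factory=dict)

    def set_rule(self, noun, prop):
        # Activate the rule "<noun> IS <prop>".
        self.rules.setdefault(noun, set()).add(prop)

    def clear_rule(self, noun, prop):
        # Deactivate the rule "<noun> IS <prop>" if it is active.
        self.rules.get(noun, set()).discard(prop)

    def has(self, noun, prop):
        # True only if the rule is currently active; names carry no meaning.
        return prop in self.rules.get(noun, set())


def step_outcome(world, noun):
    """Outcome of stepping onto a tile, decided only by the active rules."""
    return "die" if world.has(noun, "dangerous") else "ok"


world = World()
world.set_rule("lava", "dangerous")      # familiar prior: "Lava is Dangerous"
before = step_outcome(world, "lava")

world.clear_rule("lava", "dangerous")    # the rule text is rearranged in-game
world.set_rule("lava", "safe")           # counter-intuitive rule: "Lava is Safe"
after = step_outcome(world, "lava")

print(before, after)
```

Running the sketch prints `die ok`: the same noun yields different outcomes purely because the rule set changed, which is the kind of prior-independent evaluation the abstract associates with code representations.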
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Language and cultural evolution
