The Mouth is Not the Brain: Bridging Energy-Based World Models and Language Generation

Junichiro Niimi

arXiv:2601.17094 · cs.LG · April 1, 2026

TL;DR

This paper introduces a modular architecture that separates world modeling from language generation, demonstrating improved controllability and coherence in text output by connecting a domain-specific energy-based world model with a frozen language model.

Contribution

The paper proposes a framework that explicitly decouples world modeling from language generation, with the aim of better control and coherence in generated text.

Findings

01. World model conditioning reduces cross-entropy and increases semantic similarity.

02. Energy function effectively distinguishes plausible from implausible configurations.

03. Causal interventions on attributes influence generated text in a statistically consistent manner.
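
To make finding 02 concrete, here is a minimal sketch of how an energy function can rank configurations. It assumes the standard energy of a Boltzmann machine with one visible and two hidden layers; the paper's exact parameterization is not given on this page, so the function name `dbm_energy`, the layer sizes, and the toy weights are all hypothetical. Lower energy corresponds to a more plausible configuration.

```python
def dbm_energy(v, h1, h2, W1, W2, b_vis, b_h1, b_h2):
    """Standard two-hidden-layer Boltzmann machine energy (assumed form):
    E = -(v^T W1 h1 + h1^T W2 h2 + b_vis^T v + b_h1^T h1 + b_h2^T h2)
    """
    def dot(x, y):
        return sum(xi * yi for xi, yi in zip(x, y))

    def bilinear(x, W, y):
        # x^T W y, with W given as a list of rows
        return sum(x[i] * W[i][j] * y[j]
                   for i in range(len(x)) for j in range(len(y)))

    return -(bilinear(v, W1, h1) + bilinear(h1, W2, h2)
             + dot(b_vis, v) + dot(b_h1, h1) + dot(b_h2, h2))


# Toy parameters (hypothetical): the first visible unit agrees with the
# hidden state, the second one conflicts with it.
W1 = [[2.0], [-2.0]]   # visible (2 units) -> hidden layer 1 (1 unit)
W2 = [[1.0]]           # hidden layer 1 -> hidden layer 2 (1 unit)
zeros2, zeros1 = [0.0, 0.0], [0.0]

plausible = dbm_energy([1, 0], [1], [1], W1, W2, zeros2, zeros1, zeros1)
implausible = dbm_energy([0, 1], [1], [1], W1, W2, zeros2, zeros1, zeros1)
print(plausible, implausible)
```

With these toy weights the first configuration gets the lower energy, which is the sense in which an energy function separates plausible from implausible states.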

Abstract

Large Language Models (LLMs) generate fluent text, yet whether they truly understand the world or merely produce plausible texts about it remains contested. We propose an architectural principle, the mouth is not the brain, that explicitly separates world models from language models. Our architecture comprises three components: a Deep Boltzmann Machine (DBM) that captures domain structure as an energy-based world model, an adapter that projects latent belief states into embedding space, and a frozen GPT-2 that provides linguistic competence without domain knowledge. We instantiate this framework in the consumer review domain using Amazon smartphone reviews. Experiments demonstrate that (1) world model conditioning achieves lower cross-entropy loss and higher semantic similarity than architectural baselines including direct projection and full fine-tuning, while qualitative analysis reveals that soft prompt…
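
The three-component pipeline in the abstract can be illustrated with a small sketch of the adapter step. It assumes the adapter is a single linear projection whose output is used as a soft prompt prepended to the language model's token embeddings; the abstract (truncated here) mentions soft prompts but does not specify the adapter's form, so `make_adapter`, `condition_inputs`, and all dimensions below are hypothetical. The frozen language model is stood in for by plain lists of embedding vectors.

```python
import random


def make_adapter(latent_dim, n_prompt, embed_dim, seed=0):
    """Build a linear adapter (assumed form) mapping a latent belief state
    to n_prompt soft-prompt vectors of size embed_dim."""
    rng = random.Random(seed)
    out_dim = n_prompt * embed_dim
    weights = [[rng.gauss(0.0, 0.02) for _ in range(latent_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim

    def adapter(belief):
        # Linear projection, then reshape into n_prompt embedding vectors.
        flat = [sum(w * x for w, x in zip(row, belief)) + b
                for row, b in zip(weights, bias)]
        return [flat[i * embed_dim:(i + 1) * embed_dim]
                for i in range(n_prompt)]

    return adapter


def condition_inputs(soft_prompt, token_embeddings):
    """Prepend soft-prompt vectors to the token embeddings that a frozen
    language model would consume."""
    return soft_prompt + token_embeddings


# Toy sizes (hypothetical); a real language model uses far larger embeddings.
adapter = make_adapter(latent_dim=4, n_prompt=3, embed_dim=8)
belief_state = [0.9, 0.1, 0.5, 0.0]          # e.g. hidden-unit activations
soft_prompt = adapter(belief_state)
token_embeddings = [[0.0] * 8 for _ in range(5)]
inputs = condition_inputs(soft_prompt, token_embeddings)
print(len(inputs), len(inputs[0]))
```

In this arrangement only the adapter's weights would be trained, which is one way to keep the language model frozen while the world model supplies the domain knowledge.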
