Cognitively Inspired Energy-Based World Models
Alexi Gladstone, Ganesh Nanduru, Md Mofijul Islam, Aman Chadha, Jundong Li, Tariq Iqbal

TL;DR
This paper introduces Energy-Based World Models (EBWM) inspired by human cognition, enabling models to evaluate prediction plausibility and adaptively allocate processing time, improving scalability and reasoning in AI systems.
Contribution
The paper presents a novel energy-based modeling approach and a specialized transformer architecture that incorporate human-like prediction evaluation and adaptive reasoning capabilities.
Findings
EBWM scales better with data and GPU resources than traditional autoregressive models in computer vision.
Early results show promising scaling in NLP tasks.
EBWM enables models to assess the plausibility of future states effectively.
Abstract
One of the predominant methods for training world models is autoregressive prediction in the output space of the next element of a sequence. In Natural Language Processing (NLP), this takes the form of Large Language Models (LLMs) predicting the next token; in Computer Vision (CV), this takes the form of autoregressive models predicting the next frame/token/pixel. However, this approach differs from human cognition in several respects. First, human predictions about the future actively influence internal cognitive processes. Second, humans naturally evaluate the plausibility of predictions regarding future states. Third, building on this capability, humans assess when a prediction is sufficient and so allocate a dynamic amount of time to making it. This adaptive process is analogous to System 2 thinking in psychology. All these capabilities are fundamental to the success of…
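Two of the capabilities the abstract describes, scoring how plausible a predicted future state is and spending a variable amount of computation on a prediction, can be illustrated with a toy sketch. This is not the paper's architecture or training procedure: the quadratic energy, the stand-in dynamics `predict_mean`, the `refine` loop, and every parameter value are hypothetical choices made only for illustration. Lower energy here means a candidate future is more compatible with the context, and refinement stops as soon as the energy falls below a threshold.

```python
from typing import List, Tuple

Vector = List[float]


def predict_mean(context: Vector) -> Vector:
    """Hypothetical stand-in for a learned model of the expected next state."""
    # Toy dynamics (an assumption): the next state is the context scaled by 0.5.
    return [0.5 * c for c in context]


def energy(context: Vector, candidate: Vector) -> float:
    """Toy energy: squared distance between the candidate and the expected state.

    Lower values mean the candidate future is more plausible given the context.
    """
    target = predict_mean(context)
    return sum((y - t) ** 2 for y, t in zip(candidate, target))


def energy_grad(context: Vector, candidate: Vector) -> Vector:
    """Gradient of the toy energy with respect to the candidate."""
    target = predict_mean(context)
    return [2.0 * (y - t) for y, t in zip(candidate, target)]


def refine(
    context: Vector,
    candidate: Vector,
    step_size: float = 0.25,
    threshold: float = 1e-4,
    max_steps: int = 100,
) -> Tuple[Vector, int]:
    """Descend the energy until the candidate is plausible enough.

    Returns the refined candidate and the number of steps used, so the
    amount of computation depends on how poor the initial guess was.
    """
    current = list(candidate)
    steps = 0
    while energy(context, current) > threshold and steps < max_steps:
        grad = energy_grad(context, current)
        current = [y - step_size * g for y, g in zip(current, grad)]
        steps += 1
    return current, steps


if __name__ == "__main__":
    context = [2.0, -4.0]
    _, easy_steps = refine(context, [1.1, -2.0])   # guess close to plausible
    _, hard_steps = refine(context, [5.0, 3.0])    # guess far from plausible
    print("steps for near guess:", easy_steps)
    print("steps for far guess:", hard_steps)
```

In this sketch the same stopping rule yields fewer refinement steps for a guess that is already nearly plausible and more for a poor one, which is the sense in which the computation is allocated dynamically.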
Taxonomy
Topics: Cognitive Science and Mapping · Cognitive Science and Education Research
Methods: Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
