SPARTAN: A Sparse Transformer World Model Attending to What Matters

Anson Lei, Bernhard Schölkopf, Ingmar Posner

arXiv:2411.06890 · cs.LG · December 8, 2025

TL;DR

SPARTAN is a sparse, Transformer-based world model that learns context-dependent interaction structures between objects, improving interpretability, adaptability, and robustness in dynamic environments.

Contribution

The paper introduces SPARTAN, a novel sparse Transformer world model that effectively captures local causal interactions and adapts to environment changes.

Findings

01 Outperforms state-of-the-art in object-centric world modeling

02 Learns accurate local causal interaction graphs

03 Shows improved few-shot adaptation and robustness

Abstract

Capturing the interactions between entities in a structured way plays a central role in world models that flexibly adapt to changes in the environment. Recent works motivate the benefits of models that explicitly represent the structure of interactions and formulate the problem as discovering local causal structures. In this work, we demonstrate that reliably capturing these relationships in complex settings remains challenging. To remedy this shortcoming, we postulate that sparsity is a critical ingredient for the discovery of such local structures. To this end, we present the SPARse TrANsformer World model (SPARTAN), a Transformer-based world model that learns context-dependent interaction structures between entities in a scene. By applying sparsity regularisation on the attention patterns between object-factored tokens, SPARTAN learns sparse, context-dependent interaction graphs that…
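The idea of regularising attention between object-factored tokens towards sparsity can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the regulariser, so the code below assumes one plausible form, a sigmoid gate per token pair whose total mass is penalised. The names `gated_attention`, `sparsity_penalty`, `interaction_graph`, and the weight `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gated_attention(tokens, w_q, w_k, w_v, w_gate_q, w_gate_k):
    """Single-head attention over object tokens with a per-edge gate in [0, 1].

    ASSUMPTION: the sigmoid gate is an illustrative stand-in for a learned,
    context-dependent interaction mask; it is not taken from the paper.
    """
    d = w_q.shape[1]
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v

    # Standard scaled dot-product attention (numerically stable softmax).
    scores = q @ k.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)

    # Gates depend on the current tokens, so the structure is context-dependent.
    gate_logits = (tokens @ w_gate_q) @ (tokens @ w_gate_k).T
    gates = 1.0 / (1.0 + np.exp(-gate_logits))

    # Entry (i, j) scales how much token j contributes to the update of token i.
    out = (attn * gates) @ v
    return out, gates


def sparsity_penalty(gates):
    """Average gate mass per receiving token (a soft count of incoming edges)."""
    return gates.sum() / gates.shape[0]


def total_loss(pred, target, gates, lam=0.1):
    """Prediction error plus the sparsity term, weighted by the hypothetical `lam`."""
    mse = np.mean((pred - target) ** 2)
    return mse + lam * sparsity_penalty(gates)


def interaction_graph(gates, threshold=0.5):
    """Read off a binary interaction graph by thresholding the gates."""
    return gates > threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_objects, dim = 4, 8
    tokens = rng.normal(size=(n_objects, dim))
    params = [rng.normal(size=(dim, dim)) * 0.1 for _ in range(5)]
    out, gates = gated_attention(tokens, *params)
    print(out.shape, gates.shape)
    print(interaction_graph(gates).astype(int))
```

A gate is used here instead of an L1 penalty on the softmax weights because each softmax row always sums to one, so penalising its absolute sum would have no effect. Thresholding the gates gives one way to inspect which objects the model treats as interacting in a given state.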

Taxonomy

Topics: Natural Language Processing Techniques · Handwritten Text Recognition Techniques · Topic Modeling

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Dense Connections · Layer Normalization · Adam · Attention Dropout · Multi-Head Attention · Residual Connection