Structured World Representations in Maze-Solving Transformers

Michael Igorevich Ivanitskiy; Alex F. Spies; Tilman Räuker; Guillaume Corlouer; Chris Mathwin; Lucia Quirke; Can Rager; Rusheb Shah; Dan Valentine; Cecilia Diniz Behn; Katsumi Inoue; Samy Wu Fung

arXiv:2312.02566 · cs.LG · December 6, 2023 · 2 cites

TL;DR

This paper investigates small transformer models trained to solve mazes and finds structured internal representations of maze topology and valid paths, along with attention heads involved in path-following.

Contribution

The paper shows that small transformers represent maze topology and valid paths in a structured way: the entire maze can be linearly decoded from the residual stream of a single token, the learned token embeddings have spatial structure, and some attention heads are involved in path-following.

Findings

01. Residual stream encodes entire maze topology

02. Token embeddings exhibit spatial structure

03. Attention heads are involved in path-following

Abstract

Transformer models underpin many recent advances in practical machine learning applications, yet understanding their internal behavior continues to elude researchers. Given the size and complexity of these models, forming a comprehensive picture of their inner workings remains a significant challenge. To this end, we set out to understand small transformer models in a more tractable setting: that of solving mazes. In this work, we focus on the abstractions formed by these models and find evidence for the consistent emergence of structured internal representations of maze topology and valid paths. We demonstrate this by showing that the residual stream of only a single token can be linearly decoded to faithfully reconstruct the entire maze. We also find that the learned embeddings of individual tokens have spatial structure. Furthermore, we take steps towards deciphering the circuitry of…
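
The linear-decoding claim in the abstract can be illustrated with a small probe. The sketch below is not the authors' code and does not use their models or data: the activations are synthetic stand-ins built so that the maze's edge labels are linearly embedded in them, and every name and size (`fit_linear_probe`, `probe_predict`, `d_model`, `n_edges`, the noise scale) is a hypothetical choice made for illustration. It only shows what "linearly decoded" means in practice: fit one linear map from a single token's residual vector to the maze's connection variables, then check reconstruction on held-out mazes.

```python
import numpy as np


def fit_linear_probe(acts, labels):
    """Fit a least-squares linear probe with a bias term: labels ~ [acts, 1] @ W."""
    X = np.hstack([acts, np.ones((acts.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, labels, rcond=None)
    return W


def probe_predict(acts, W):
    """Apply the probe and threshold at 0.5 to get binary edge predictions."""
    X = np.hstack([acts, np.ones((acts.shape[0], 1))])
    return (X @ W > 0.5).astype(float)


rng = np.random.default_rng(0)

# Hypothetical sizes: a 4x4 grid lattice has 12 horizontal + 12 vertical = 24 possible edges.
n_mazes, d_model, n_edges = 2000, 64, 24

# Synthetic "mazes": each row marks which lattice edges are open (1) or walled off (0).
# Random bits are a simplification; real mazes have additional structure.
edges = rng.integers(0, 2, size=(n_mazes, n_edges)).astype(float)

# Synthetic "residual stream" vectors: an assumed linear embedding of the edges plus noise.
mixing = rng.normal(size=(n_edges, d_model))
acts = edges @ mixing + 0.1 * rng.normal(size=(n_mazes, d_model))

# Fit on 1500 mazes, evaluate on the remaining 500.
train, test = slice(0, 1500), slice(1500, None)
W = fit_linear_probe(acts[train], edges[train])
accuracy = (probe_predict(acts[test], W) == edges[test]).mean()
print(f"held-out edge accuracy: {accuracy:.3f}")
```

With real model activations in place of `acts`, high held-out accuracy would suggest that the maze's connectivity is linearly accessible at that token position; low accuracy would not rule out a nonlinear encoding.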

Code & Models

Repositories

understanding-search/structured-representations-maze-transformers (Official)

Taxonomy

Topics: Advanced Memory and Neural Computing · Neural dynamics and brain function · Cell Image Analysis Techniques

Methods: Sparse Evolutionary Training · Focus