Convergent World Representations and Divergent Tasks
Core Francisco Park

TL;DR
This paper studies how a neural network's representations of a controlled world depend on the tasks it is trained on. Multi-task training drives those representations to converge, while some tasks hinder adaptation and generalization.
Contribution
It introduces a framework to study world representations, demonstrating how multi-task learning leads to convergent geometries and identifying divergent tasks that impair adaptation.
Findings
Multi-task training aligns world representations across tasks.
Divergent tasks can harm the integration of new entities.
Convergent representations facilitate better generalization.
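One way to make "aligned representations" in the findings above concrete is a similarity score between two models' activations on the same entities. The sketch below uses linear CKA, which is an assumption chosen for illustration: the summary does not say which alignment metric the paper uses. The arrays `reps_a`, `reps_b`, and `reps_c` are hypothetical stand-ins for per-city activations, not data from the paper.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between two representation matrices of shape (n_samples, dim)."""
    # Center each feature so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)

# Hypothetical activations for 200 entities from "model A".
reps_a = rng.normal(size=(200, 16))

# "Model B": the same geometry under a random rotation (illustrative only).
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
reps_b = reps_a @ q

# "Model C": unrelated random activations.
reps_c = rng.normal(size=(200, 16))

cka_rotated = linear_cka(reps_a, reps_b)  # same geometry up to rotation
cka_random = linear_cka(reps_a, reps_c)   # unrelated geometry
print(f"rotated copy: {cka_rotated:.3f}, unrelated: {cka_random:.3f}")
```

Linear CKA is invariant to rotations of the feature space, so two models that encode the same geometry in different bases still score close to 1, while unrelated representations score much lower.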
Abstract
While neural representations are central to modern deep learning, the conditions governing their geometry and their roles in downstream adaptability remain poorly understood. We develop a framework clearly separating the underlying world, the data generation process and the resulting model representations to study these questions in a controlled setup. 5,075 city coordinates define the world and 7 geometric tasks generate the training data for autoregressive training. We find that different tasks give rise to qualitatively and quantitatively distinct world representation geometries. However, multi-task training drives convergence of world representations: models trained on non-overlapping tasks develop aligned geometric representations, providing controlled evidence for the Multitask Scaling Hypothesis of the Platonic Representation Hypothesis. To study adaptation, we pretrain models on…
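The separation the abstract describes between the world (coordinates) and the data generation process (tasks over those coordinates) can be sketched roughly as follows. Everything here is an assumption for illustration: the coordinates are random rather than real city locations, the "distance" task is a guess at one plausible geometric task, and the text format and function names (`make_world`, `distance_example`, `sample_dataset`) are hypothetical.

```python
import numpy as np

def make_world(num_cities, seed=0):
    """Hypothetical stand-in world: random 2D coordinates, one row per city id."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-100.0, 100.0, size=(num_cities, 2))

def distance_example(world, i, j):
    """Serialize one distance query as a text sequence (illustrative format)."""
    d = np.linalg.norm(world[i] - world[j])
    return f"dist(c_{i},c_{j})={d:.0f}"

def sample_dataset(world, num_examples, seed=0):
    """Draw random city pairs and emit one training sequence per pair."""
    rng = np.random.default_rng(seed)
    pairs = rng.integers(0, len(world), size=(num_examples, 2))
    return [distance_example(world, int(i), int(j)) for i, j in pairs]

# The world size matches the 5,075 cities mentioned in the abstract.
world = make_world(5075)
dataset = sample_dataset(world, 4)
for sequence in dataset:
    print(sequence)
```

In this sketch the training sequences contain only city identifiers and task outputs, never the coordinates themselves, so any geometry a model ends up with would have to be inferred from the task.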
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face Recognition and Perception · Domain Adaptation and Few-Shot Learning
