Zero-shot World Models Are Developmentally Efficient Learners

Khai Loong Aw; Klemen Kotar; Wanhee Lee; Seungwoo Kim; Khaled Jedoui; Rahul Venkatesh; Lilian Naing Chen; Michael C. Frank; Daniel L.K. Yamins

arXiv:2604.10333·cs.AI·April 14, 2026

TL;DR

This paper introduces the Zero-shot Visual World Model (ZWM), a computational framework inspired by child development that learns physical scene understanding from the first-person experience of a single child and generalizes zero-shot to multiple tasks.

Contribution

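The paper proposes ZWM as a computational hypothesis for early physical understanding in children. The model rests on three principles: a sparse temporally-factored predictor that decouples appearance from dynamics, zero-shot estimation through approximate causal inference, and composition of inferences to build more complex abilities.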

Findings

01. ZWM learns from a single child's experience to perform multiple physical understanding tasks.

02. ZWM recapitulates behavioral signatures of child development.

03. ZWM builds brain-like internal representations.

Abstract

Young children demonstrate early abilities to understand their physical world, estimating depth, motion, object coherence, interactions, and many other aspects of physical scene understanding. Children are both data-efficient and flexible cognitive systems, creating competence despite extremely limited training data, while generalizing to myriad untrained tasks -- a major challenge even for today's best AI systems. Here we introduce a novel computational hypothesis for these abilities, the Zero-shot Visual World Model (ZWM). ZWM is based on three principles: a sparse temporally-factored predictor that decouples appearance from dynamics; zero-shot estimation through approximate causal inference; and composition of inferences to build more complex abilities. We show that ZWM can be learned from the first-person experience of a single child, rapidly generating competence across multiple…

Code & Models

Repositories

awwkl/ZWM (GitHub)
