ChronoDreamer: Action-Conditioned World Model as an Online Simulator for Robotic Planning

Zhenhao Zhou; Dan Negrut

arXiv:2512.18619 · cs.AI · December 23, 2025

TL;DR

ChronoDreamer is an action-conditioned world model that predicts future states in robotic manipulation, enabling safe planning by evaluating collision likelihood with a vision-language model.

Contribution

It introduces a spatial-temporal transformer world model with contact encoding, paired with a vision-language-model collision evaluator for robotic planning.

Findings

01 Accurately predicts contact-rich interactions in simulation.

02 Effectively distinguishes safe and unsafe trajectories.

03 Preserves spatial coherence during motion.

Abstract

We present ChronoDreamer, an action-conditioned world model for contact-rich robotic manipulation. Given a history of egocentric RGB frames, contact maps, actions, and joint states, ChronoDreamer predicts future video frames, contact distributions, and joint angles via a spatial-temporal transformer trained with MaskGIT-style masked prediction. Contact is encoded as depth-weighted Gaussian splat images that render 3D forces into a camera-aligned format suitable for vision backbones. At inference, predicted rollouts are evaluated by a vision-language model that reasons about collision likelihood, enabling rejection sampling of unsafe actions before execution. We train and evaluate on DreamerBench, a simulation dataset generated with Project Chrono that provides synchronized RGB, contact splat, proprioception, and physics annotations across rigid and deformable object scenarios.…
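
The contact encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `render_contact_splat`, the pinhole projection, the isotropic pixel-space Gaussian, and the weighting rule (force magnitude divided by depth) are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def render_contact_splat(points_cam, forces, K, image_size, sigma_px=3.0, eps=1e-6):
    """Render 3D contact forces into a single-channel, camera-aligned image.

    Hypothetical sketch; the paper's exact formulation may differ.
    points_cam: (N, 3) contact points in the camera frame, z pointing forward.
    forces:     (N, 3) contact force vectors.
    K:          (3, 3) pinhole camera intrinsics.
    image_size: (height, width) of the output image in pixels.
    """
    height, width = image_size
    image = np.zeros((height, width), dtype=np.float64)
    ys, xs = np.mgrid[0:height, 0:width]  # pixel row and column grids

    for point, force in zip(points_cam, forces):
        depth = point[2]
        if depth <= eps:
            continue  # skip contacts at or behind the camera plane

        # Pinhole projection of the contact point to pixel coordinates.
        u = K[0, 0] * point[0] / depth + K[0, 2]
        v = K[1, 1] * point[1] / depth + K[1, 2]

        # Assumed depth weighting: nearer and stronger contacts appear brighter.
        weight = np.linalg.norm(force) / depth

        # Accumulate an isotropic Gaussian centred on the projected contact.
        image += weight * np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2.0 * sigma_px ** 2))

    return image
```

Under these assumptions, the resulting array can be stacked with or fed alongside RGB frames, which is what makes an image-like contact format convenient for a vision backbone.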

Taxonomy

Topics: Robot Manipulation and Learning · Social Robot Interaction and HRI · Human Pose and Action Recognition