PhyCo: Learning Controllable Physical Priors for Generative Motion

Sriram Narayanan; Ziyu Jiang; Srinivasa Narasimhan; Manmohan Chandraker

arXiv:2604.28169 · cs.CV · May 1, 2026

TL;DR

PhyCo adds controllable, physically consistent generation to video diffusion models by combining a large-scale simulation dataset, physics-supervised fine-tuning, and vision-language-model-guided reward optimization.

Contribution

PhyCo introduces a scalable method that combines simulation data, physics supervision, and vision-language model feedback to produce physically realistic, controllable videos without requiring geometry reconstruction.

Findings

01 PhyCo outperforms baselines on the Physics-IQ benchmark.

02 Human studies show improved physical realism and control.

03 The approach generalizes beyond synthetic training environments.

Abstract

Modern video diffusion models excel at appearance synthesis but still struggle with physical consistency: objects drift, collisions lack realistic rebound, and material responses seldom match their underlying properties. We present PhyCo, a framework that introduces continuous, interpretable, and physically grounded control into video generation. Our approach integrates three key components: (i) a large-scale dataset of over 100K photorealistic simulation videos where friction, restitution, deformation, and force are systematically varied across diverse scenarios; (ii) physics-supervised fine-tuning of a pretrained diffusion model using a ControlNet conditioned on pixel-aligned physical property maps; and (iii) VLM-guided reward optimization, where a fine-tuned vision-language model evaluates generated videos with targeted physics queries and provides differentiable feedback. This…
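The abstract's "pixel-aligned physical property maps" can be pictured as per-pixel channels that carry each object's physical parameters. The following is a minimal sketch of that idea only, not the authors' implementation: the function name `build_property_maps`, the channel order in `PROPERTIES`, the instance-mask input, and the value ranges are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical channel order; the paper's actual encoding is not given in this listing.
PROPERTIES = ("friction", "restitution", "deformation", "force")


def build_property_maps(instance_mask, object_props, background=0.0):
    """Rasterize per-object scalar properties into a (C, H, W) float32 array.

    instance_mask: (H, W) integer array, 0 = background, k > 0 = object id.
    object_props:  dict mapping object id -> dict of property name -> value.
    Properties missing for an object keep the background value.
    """
    height, width = instance_mask.shape
    maps = np.full((len(PROPERTIES), height, width), background, dtype=np.float32)
    for object_id, props in object_props.items():
        region = instance_mask == object_id  # boolean (H, W) mask for this object
        for channel, name in enumerate(PROPERTIES):
            if name in props:
                maps[channel, region] = props[name]
    return maps


# Toy example: two objects in a 4x6 frame.
mask = np.zeros((4, 6), dtype=np.int32)
mask[1:3, 1:3] = 1
mask[2:4, 4:6] = 2
props = {
    1: {"friction": 0.8, "restitution": 0.2},
    2: {"friction": 0.1, "restitution": 0.9, "force": 0.5},
}
maps = build_property_maps(mask, props)
```

A map like this shares the spatial layout of the video frame, which is what would let a ControlNet-style branch consume it alongside the image; how PhyCo actually normalizes or encodes these values is not stated here.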

Code & Models

Datasets

nnsriram97/phyco_kubric
dataset · 216 downloads
