TL;DR
TelePhysics is a real-time, physics-grounded scene generation framework that converts a single image into a controllable, physically consistent video with multi-object interactions, surpassing prior methods in physical fidelity and spatial coherence.
Contribution
TelePhysics introduces a training-free, scene-level 3D reconstruction approach that enables realistic, controllable video synthesis from a single image with improved physical consistency.
Findings
Outperforms prior methods in physical fidelity and spatial coherence.
Enables real-time interaction with complex multi-object scenes.
Provides richer control types for mechanics-based manipulation than existing video generation methods.
Abstract
Recent generative video models achieve impressive visual quality but remain constrained by limited physical consistency and controllability. Existing video generation methods provide minimal physical control, and single-image-to-3D conversion approaches often suffer from object interpenetration. Furthermore, physics-based scene-level 3D generation methods exhibit spatial misalignment, stylized artifacts, and inconsistencies with the input data, restricting their use in realistic interactive video synthesis. We propose TelePhysics, a training-free framework that converts a single image into a physically consistent and controllable video through holistic scene-level 3D reconstruction. By representing the full scene geometry in a unified spatial coordinate system, TelePhysics resolves object penetration and alignment ambiguity. Unlike prior methods, this formulation enables accurate…
