Playable Environments: Video Manipulation in Space and Time
Willi Menapace, Stéphane Lathuilière, Aliaksandr Siarohin, Christian Theobalt, Sergey Tulyakov, Vladislav Golyanik, Elisa Ricci

TL;DR
This paper introduces Playable Environments, a framework for interactive video generation and manipulation in space and time. Given a single image at inference time, users can move objects in 3D and control the camera viewpoint.
Contribution
It presents a new environment representation that allows interactive manipulation of objects and camera viewpoints in generated videos. Actions are learnt in an unsupervised manner, and each frame's environment state is decoded to image space with volumetric rendering.
Findings
Enables interactive 3D video manipulation from a single image
Supports diverse object appearances with style-based modulation
Introduces large-scale datasets with significant camera movements
Abstract
We present Playable Environments - a new representation for interactive video generation and manipulation in space and time. With a single image at inference time, our novel framework allows the user to move objects in 3D while generating a video by providing a sequence of desired actions. The actions are learnt in an unsupervised manner. The camera can be controlled to get the desired viewpoint. Our method builds an environment state for each frame, which can be manipulated by our proposed action module and decoded back to the image space with volumetric rendering. To support diverse appearances of objects, we extend neural radiance fields with style-based modulation. Our method trains on a collection of various monocular videos requiring only the estimated camera parameters and 2D object locations. To set a challenging benchmark, we introduce two large scale video datasets with…
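The abstract says the environment state is decoded back to image space with volumetric rendering. As a rough illustration, the sketch below shows the standard front-to-back compositing rule used by neural radiance fields in general. It is not taken from the paper: the function name `composite_ray` and its arguments are hypothetical, and the authors' renderer (including the style-based modulation) may differ.

```python
import math


def composite_ray(densities, colors, deltas):
    """Composite samples along one camera ray, front to back.

    densities: non-negative volume density at each sample (hypothetical input)
    colors:    one (r, g, b) tuple per sample
    deltas:    distance between consecutive samples along the ray
    Returns the accumulated RGB color and the accumulated opacity.
    """
    transmittance = 1.0  # fraction of light that still reaches the camera
    rgb = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        # Opacity contributed by this segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for k in range(3):
            rgb[k] += weight * color[k]
        # Light remaining after passing through this segment.
        transmittance *= 1.0 - alpha
    return rgb, 1.0 - transmittance
```

With zero density everywhere the ray stays empty and the opacity is zero; a very dense first sample hides everything behind it, so the result takes that sample's color.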
Taxonomy
Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
