Vid2Actor: Free-viewpoint Animatable Person Synthesis from Video in the Wild
Chung-Yi Weng, Brian Curless, Ira Kemelmacher-Shlizerman

TL;DR
This paper introduces Vid2Actor, a method that reconstructs an animatable model of a person from an in-the-wild video. The model can be rendered in novel body poses and from novel camera views, without explicit 3D mesh reconstruction or a pre-rigged model.
Contribution
The paper presents a volumetric 3D human representation, reconstructed by a deep network trained on the input video, that supports novel pose and view synthesis without ground-truth meshes or a pre-rigged model.
Findings
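Yields results on synthetic data and on real videos of diverse people performing unconstrained activities (e.g., dancing or playing tennis)
Demonstrates motion re-targeting and bullet-time rendering
Advances over GAN-based image-to-image translation by allowing image synthesis for any pose and camera view via the internal 3D representation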
Abstract
Given an "in-the-wild" video of a person, we reconstruct an animatable model of the person in the video. The output model can be rendered in any body pose to any camera view, via the learned controls, without explicit 3D mesh reconstruction. At the core of our method is a volumetric 3D human representation reconstructed with a deep network trained on input video, enabling novel pose/view synthesis. Our method is an advance over GAN-based image-to-image translation since it allows image synthesis for any pose and camera via the internal 3D representation, while at the same time it does not require a pre-rigged model or ground truth meshes for training, as in mesh-based learning. Experiments validate the design choices and yield results on synthetic data and on real videos of diverse people performing unconstrained activities (e.g. dancing or playing tennis). Finally, we demonstrate…
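As a rough illustration of what rendering from a volumetric representation involves, the sketch below composites color and opacity samples along a single camera ray, front to back. This is generic volume compositing under simple assumptions, not the paper's implementation; the function name `composite_ray` and the sample values are hypothetical.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, float, float]


def composite_ray(colors: Sequence[RGB], alphas: Sequence[float]) -> Tuple[RGB, float]:
    """Front-to-back alpha compositing of samples along one camera ray.

    colors: per-sample RGB values, ordered from nearest to farthest.
    alphas: per-sample opacities in [0, 1], in the same order.
    Returns the accumulated pixel color and the accumulated opacity.
    """
    if len(colors) != len(alphas):
        raise ValueError("colors and alphas must have the same length")

    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet blocked by nearer samples

    for color, alpha in zip(colors, alphas):
        weight = transmittance * alpha  # contribution of this sample
        for channel in range(3):
            pixel[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha

    return (pixel[0], pixel[1], pixel[2]), 1.0 - transmittance


# Hypothetical ray: a half-transparent red sample in front of an opaque green one.
color, opacity = composite_ray([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [0.5, 1.0])
```

In this toy case the nearer sample blocks half of the light, so the two samples contribute equally to the pixel. Changing the camera view amounts to casting different rays through the same volume, which is why a 3D representation supports free-viewpoint rendering.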
Taxonomy
Topics: Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging
