Unsupervised Volumetric Animation
Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Kyle Olszewski, Jian Ren, Hsin-Ying Lee, Menglei Chai, Sergey Tulyakov

TL;DR
This paper introduces an unsupervised method for 3D animation of deformable objects from single-view videos, enabling segmentation, keypoint estimation, and novel view synthesis without labeled data.
Contribution
The paper presents a 3D autodecoder framework paired with a keypoint estimator through a differentiable Perspective-n-Point (PnP) algorithm, enabling unsupervised learning of object geometry and part decomposition from videos and images.
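To make the role of PnP concrete: a PnP solver recovers a camera pose (rotation and translation) from 3D points and their 2D projections. The sketch below is a classical Direct Linear Transform (DLT) solver in plain NumPy, not the paper's differentiable implementation. The function names `project` and `pnp_dlt`, the identity camera intrinsics, and the requirement of at least six non-coplanar points are all assumptions made for illustration.

```python
import numpy as np


def project(points_3d, R, t):
    """Pinhole projection with identity intrinsics (an assumption of this sketch)."""
    cam = points_3d @ R.T + t          # points in camera coordinates, shape (n, 3)
    return cam[:, :2] / cam[:, 2:3]    # perspective divide, shape (n, 2)


def pnp_dlt(points_3d, points_2d):
    """Estimate pose (R, t) from >= 6 non-coplanar 3D-2D correspondences via DLT."""
    n = points_3d.shape[0]
    X = np.hstack([points_3d, np.ones((n, 1))])   # homogeneous 3D points, (n, 4)
    u, v = points_2d[:, 0:1], points_2d[:, 1:2]
    zeros = np.zeros_like(X)
    # Each correspondence gives two linear constraints on the 12 entries of P = s[R | t].
    A = np.vstack([
        np.hstack([X, zeros, -u * X]),
        np.hstack([zeros, X, -v * X]),
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    P = vt[-1].reshape(3, 4)
    # Resolve the sign ambiguity so that the points have positive depth.
    if np.mean(X @ P[2]) < 0:
        P = -P
    # Project the left 3x3 block onto the nearest rotation and recover the scale s.
    U, S, Vt = np.linalg.svd(P[:, :3])
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    t = P[:, 3] / S.mean()
    return R, t
```

In a pipeline like the one described, the 3D points would be learned keypoints and the 2D points would come from a keypoint estimator; a differentiable variant lets the pose error propagate gradients back to both.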
Findings
Effective 3D segmentation and keypoint estimation from videos.
Capable of novel view synthesis and animation.
Learns 3D geometry from still images.
Abstract
We propose a novel approach for unsupervised 3D animation of non-rigid deformable objects. Our method learns the 3D structure and dynamics of objects solely from single-view RGB videos, and can decompose them into semantically meaningful parts that can be tracked and animated. Using a 3D autodecoder framework, paired with a keypoint estimator via a differentiable PnP algorithm, our model learns the underlying object geometry and parts decomposition in an entirely unsupervised manner. This allows it to perform 3D segmentation, 3D keypoint estimation, novel view synthesis, and animation. We primarily evaluate the framework on two video datasets: VoxCeleb and TEDXPeople. In addition, on the Cats image dataset, we show it even learns compelling 3D geometry from still images. Finally, we show our model can obtain animatable 3D objects from a single or few images. Code…
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis
Methods: PnP
