BodyNet: Volumetric Inference of 3D Human Body Shapes
Gül Varol, Duygu Ceylan, Bryan Russell, Jimei Yang, Ersin Yumer, Ivan Laptev, Cordelia Schmid

TL;DR
BodyNet is a neural network that directly infers 3D human body shapes from a single image using volumetric and multi-view losses, achieving state-of-the-art results and enabling body-part segmentation.
Contribution
It introduces an end-to-end trainable volumetric inference method for 3D human shape estimation from images, bypassing traditional parametric models.
Findings
Achieves state-of-the-art results on SURREAL and Unite the People datasets.
Enables volumetric body-part segmentation.
Outperforms recent approaches in 3D human shape estimation.
Abstract
Human shape estimation is an important task for video editing, animation and fashion industry. Predicting 3D human body shape from natural images, however, is highly challenging due to factors such as variation in human bodies, clothing and viewpoint. Prior methods addressing this problem typically attempt to fit parametric body models with certain priors on pose and shape. In this work we argue for an alternative representation and propose BodyNet, a neural network for direct inference of volumetric body shape from a single image. BodyNet is an end-to-end trainable network that benefits from (i) a volumetric 3D loss, (ii) a multi-view re-projection loss, and (iii) intermediate supervision of 2D pose, 2D body part segmentation, and 3D pose. Each of them results in performance improvement as demonstrated by our experiments. To evaluate the method, we fit the SMPL model to our network…
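The volumetric 3D loss and the multi-view re-projection loss named in the abstract can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the volume is treated as a grid of per-voxel occupancy probabilities, both losses are taken to be binary cross-entropy, the silhouettes are obtained by an orthographic max-projection along one axis, and the function names and weights (w_vox, w_front, w_side) are hypothetical.

```python
import numpy as np


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 targets."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred)
                          + (1.0 - target) * np.log(1.0 - pred)))


def project_max(volume, axis):
    """Orthographic silhouette: a pixel is occupied if any voxel along `axis` is."""
    return volume.max(axis=axis)


def combined_loss(pred_vox, gt_vox, w_vox=1.0, w_front=1.0, w_side=1.0):
    """Weighted sum of a voxel loss and two re-projection losses.

    The weights are hypothetical placeholders, not values from the paper.
    """
    loss_vox = binary_cross_entropy(pred_vox, gt_vox)
    # Front view: collapse the depth axis (assumed to be the last axis).
    loss_front = binary_cross_entropy(project_max(pred_vox, 2),
                                      project_max(gt_vox, 2))
    # Side view: collapse the horizontal axis (assumed to be the first axis).
    loss_side = binary_cross_entropy(project_max(pred_vox, 0),
                                     project_max(gt_vox, 0))
    return w_vox * loss_vox + w_front * loss_front + w_side * loss_side


# Toy example: a small box-shaped "body" inside an 8x8x8 grid.
gt = np.zeros((8, 8, 8))
gt[2:6, 1:7, 3:5] = 1.0
uninformed = np.full_like(gt, 0.5)  # a prediction that knows nothing

print(combined_loss(gt, gt))          # near zero for a perfect prediction
print(combined_loss(uninformed, gt))  # clearly larger
```

In this sketch the re-projection terms penalize errors that are visible in the silhouettes, which complements the per-voxel term: a voxel error at the body boundary changes the projected outline, while an error deep inside the volume does not.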
Taxonomy
Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Advanced Vision and Imaging
