VolNet: Estimating Human Body Part Volumes from a Single RGB Image

Fabian Leinen; Vittorio Cozzolino; Torsten Schön

arXiv:2107.02259·cs.CV·July 7, 2021

TL;DR

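VolNet is a deep learning architecture that estimates total human body volume from a single RGB image and the subject's body height by combining 2D and 3D pose estimation, body part segmentation, and volume regression. It predicts the volume within a 10% tolerance in ~82% of cases, which the authors report as a considerable improvement over state-of-the-art solutions.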

Contribution

The paper introduces VolNet, a model that predicts 2D and 3D pose and body part segmentation as intermediate tasks before regressing body volume, along with SURREALvols, a large-scale synthetic dataset of photo-realistic human body images.

Findings

01. Predicts body volume within a 10% tolerance in ~82% of cases

02. Outperforms state-of-the-art solutions such as BodyNet

03. Introduces and uses SURREALvols, a large-scale synthetic dataset

Abstract

Human body volume estimation from a single RGB image is a challenging problem that has received minimal attention from the research community. We present VolNet, an architecture that leverages 2D and 3D pose estimation, body part segmentation, and volume regression from a single 2D RGB image, combined with the subject's body height, to estimate the total body volume. VolNet is designed to predict the 2D and 3D pose as well as the body part segmentation in intermediate tasks. We generated SURREALvols, a synthetic, large-scale dataset of photo-realistic images of human bodies with a wide range of body shapes and realistic poses. By using VolNet and combining multiple stacked hourglass networks together with ResNeXt, our model correctly predicted the volume in ~82% of cases with a 10% tolerance threshold. This is a considerable improvement compared to state-of-the-art solutions…
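The headline metric, the share of predictions that fall within a 10% tolerance, can be illustrated with a minimal sketch. The abstract does not spell out how the tolerance is computed, so this assumes it is a relative error measured against the ground-truth volume. The function name `fraction_within_tolerance` and the sample values (in liters) are hypothetical and not taken from the paper.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.10):
    """Return the share of predictions within `tolerance` of the true value.

    Assumption: the tolerance is a relative error with respect to the
    ground-truth volume; the paper's exact definition may differ.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")

    # Count predictions whose absolute error is at most tolerance * truth.
    hits = sum(
        1
        for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * abs(a)
    )
    return hits / len(actual)


# Hypothetical body volumes in liters (illustrative only).
predicted_volumes = [70.0, 80.0, 50.0, 95.0]
true_volumes = [72.0, 60.0, 51.0, 100.0]

accuracy = fraction_within_tolerance(predicted_volumes, true_volumes)
print(f"{accuracy:.0%} of predictions within 10% tolerance")
```

In this toy example three of the four predictions land within 10% of the ground truth, so the score is 0.75. Under the same assumed definition, the reported ~82% would correspond to a score of roughly 0.82 on the paper's test data.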

Taxonomy

Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Video Surveillance and Tracking Methods

Methods: Max Pooling · Residual Connection · Average Pooling · Grouped Convolution · Global Average Pooling · Kaiming Initialization · 1x1 Convolution · ResNeXt Block · Hourglass Module