A Neural Network for Detailed Human Depth Estimation from a Single Image
Sicong Tang, Feitong Tan, Kelvin Cheng, Zhaoyang Li, Siyu Zhu, Ping Tan

TL;DR
This paper introduces a neural network that estimates detailed human depth maps from a single RGB image, capturing fine geometry like cloth wrinkles for visualization purposes.
Contribution
It proposes a novel two-branch network architecture with a specialized training strategy and a new fusion layer to enhance depth detail estimation from monocular images.
Findings
Accurately captures cloth wrinkles and detailed geometry.
Outperforms existing methods on real-world images.
Provides publicly available code for reproducibility.
Abstract
This paper presents a neural network to estimate a detailed depth map of the foreground human in a single RGB image. The result captures geometry details such as cloth wrinkles, which are important in visualization applications. To achieve this goal, we separate the depth map into a smooth base shape and a residual detail shape and design a network with two branches to regress them respectively. We design a training strategy to ensure both base and detail shapes can be faithfully learned by the corresponding network branches. Furthermore, we introduce a novel network layer to fuse a rough depth map and surface normals to further improve the final result. Quantitative comparison with fused 'ground truth' captured by real depth cameras and qualitative examples on unconstrained Internet images demonstrate the strength of the proposed method. The code is available at…
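The base/detail split mentioned in the abstract can be illustrated with a small sketch. This is not the paper's network or its training procedure: it assumes, purely for illustration, that a smooth base shape is obtained by box-filtering a depth map and that the detail shape is whatever remains. The function names (`box_blur`, `decompose`, `recompose`) and the `radius` parameter are hypothetical.

```python
def box_blur(depth, radius):
    """Smooth a 2D depth map (list of rows) with a box filter.

    Windows are clipped at the image border, so edge pixels average
    over fewer neighbours. This filter is an illustrative assumption.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += depth[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def decompose(depth, radius=2):
    """Split a depth map into a smooth base and a residual detail map."""
    base = box_blur(depth, radius)
    detail = [[d - b for d, b in zip(d_row, b_row)]
              for d_row, b_row in zip(depth, base)]
    return base, detail


def recompose(base, detail):
    """Add the residual detail back onto the base to recover the depth."""
    return [[b + r for b, r in zip(b_row, r_row)]
            for b_row, r_row in zip(base, detail)]


if __name__ == "__main__":
    # Synthetic depth: a gentle ramp (coarse shape) plus a small ripple
    # standing in for wrinkle-scale detail.
    depth = [[2.0 + 0.01 * x + 0.002 * ((x + y) % 2) for x in range(8)]
             for y in range(8)]
    base, detail = decompose(depth, radius=1)
    restored = recompose(base, detail)
    max_err = max(abs(restored[y][x] - depth[y][x])
                  for y in range(8) for x in range(8))
    print("max reconstruction error:", max_err)
```

In this toy setting the residual has a much smaller range than the depth itself, which gives some intuition for why a separate branch for detail could be useful; how the paper actually defines and supervises the two shapes is not specified in the abstract.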
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Video Surveillance and Tracking Methods
