TL;DR
This paper presents an automatic, end-to-end method for reconstructing 3D dog pose and shape from monocular internet images. It learns a detailed 3D shape prior through expectation maximization (EM) and releases a new annotation dataset together with a parameterized dog model, SMBLD.
Contribution
The paper introduces an end-to-end approach that uses EM to learn a detailed 3D dog shape prior from 2D annotations of in-the-wild images, and it releases the resulting SMBLD model alongside the annotation dataset.
Findings
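Reconstructs 3D dog pose and shape from monocular internet images, with results demonstrated on the Stanford Dog dataset
Learns a detailed 3D shape prior through EM from 2D joint and silhouette annotations
Releases a new annotation dataset and a parameterized 3D dog model, SMBLD, which includes limb scaling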
Abstract
We introduce an automatic, end-to-end method for recovering the 3D pose and shape of dogs from monocular internet images. The large variation in shape between dog breeds, significant occlusion, and the low quality of internet images make this a challenging problem. We learn a richer prior over shapes than previous work, which helps regularize parameter estimation. We demonstrate results on the Stanford Dog dataset, an 'in the wild' dataset of 20,580 dog images for which we have collected 2D joint and silhouette annotations to split for training and evaluation. In order to capture the large shape variety of dogs, we show that the natural variation in the 2D dataset is enough to learn a detailed 3D prior through expectation maximization (EM). As a by-product of training, we generate a new parameterized model (including limb scaling) SMBLD which we release alongside our new annotation dataset…
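To make the idea of learning a shape prior through EM more concrete, here is a minimal sketch on a toy problem. It is not the paper's pipeline, which works from images, 2D joints, and silhouettes. It assumes a hypothetical linear-Gaussian setting: each latent shape vector x_i is drawn from an unknown Gaussian prior N(mu, Sigma), and we only observe a noisy linear projection y_i = A x_i + noise. The function name `em_gaussian_prior`, the matrix `A`, and `noise_var` are all illustrative assumptions.

```python
import numpy as np

def em_gaussian_prior(Y, A, noise_var, n_iters=50):
    """Toy EM for a Gaussian prior N(mu, Sigma) over latent shape vectors.

    Assumed model (illustrative only, not the paper's):
        x_i ~ N(mu, Sigma),   y_i = A @ x_i + eps,   eps ~ N(0, noise_var * I)
    Y has shape (n, k), A has shape (k, d).
    """
    n = Y.shape[0]
    d = A.shape[1]
    mu = np.zeros(d)          # initial guess for the prior mean
    Sigma = np.eye(d)         # initial guess for the prior covariance
    AtA = A.T @ A / noise_var
    AtY = Y @ A / noise_var   # row i holds A^T y_i / noise_var

    for _ in range(n_iters):
        # E-step: Gaussian posterior over each x_i under the current prior.
        Sigma_inv = np.linalg.inv(Sigma)
        S = np.linalg.inv(Sigma_inv + AtA)   # shared posterior covariance
        M = (Sigma_inv @ mu + AtY) @ S       # posterior means, one per row

        # M-step: refit the prior to the expected sufficient statistics.
        mu = M.mean(axis=0)
        centered = M - mu
        Sigma = centered.T @ centered / n + S

    return mu, Sigma

# Usage on synthetic data (all values are made up for illustration).
rng = np.random.default_rng(0)
true_mu = np.array([1.0, -2.0, 0.5])
true_Sigma = np.diag([1.0, 0.5, 0.25])
A = rng.normal(size=(6, 3))
X = rng.multivariate_normal(true_mu, true_Sigma, size=2000)
Y = X @ A.T + rng.normal(scale=0.1, size=(2000, 6))
mu_hat, Sigma_hat = em_gaussian_prior(Y, A, noise_var=0.01)
```

The E-step infers each latent shape under the current prior, and the M-step updates the prior from those inferences, so the prior is learned without ever observing the latent shapes directly. In this toy setting both steps have closed forms; a real system would presumably replace them with an optimization over model parameters.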
