Coarse-to-fine Animal Pose and Shape Estimation
Chen Li, Gim Hee Lee

TL;DR
This paper introduces a coarse-to-fine method combining parametric SMAL models and graph convolutional networks to improve 3D animal mesh reconstruction from single images, addressing limitations of existing approaches.
Contribution
It proposes a novel coarse-to-fine framework with a hierarchical GCN for mesh refinement, integrating local and global features for better shape detail capture.
Findings
Achieves state-of-the-art results on the StanfordExtra dataset.
Demonstrates good generalization on the Animal Pose and BADJA datasets.
Outperforms existing methods in mesh accuracy and detail.
Abstract
Most existing animal pose and shape estimation approaches reconstruct animal meshes with a parametric SMAL model. This is because the low-dimensional pose and shape parameters of the SMAL model make it easier for deep networks to learn the high-dimensional animal meshes. However, the SMAL model is learned from scans of toy animals with limited pose and shape variations, and thus may not be able to represent highly varying real animals well. This may result in poor fits of the estimated meshes to the 2D evidence, e.g., 2D keypoints or silhouettes. To mitigate this problem, we propose a coarse-to-fine approach to reconstruct a 3D animal mesh from a single image. The coarse estimation stage first estimates the pose, shape and translation parameters of the SMAL model. The estimated meshes are then used as a starting point by a graph convolutional network (GCN) to predict a per-vertex…
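The two-stage idea in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract is truncated after "per-vertex", so the sketch assumes the GCN output is a 3D offset added to each coarse vertex. It also assumes a single mean-aggregation graph convolution that uses only vertex coordinates as input features. The names `build_neighbourhoods`, `graph_conv` and `refine` are hypothetical, and the toy triangle stands in for a mesh produced by the coarse SMAL stage.

```python
def build_neighbourhoods(num_vertices, edges):
    """Return, for each vertex, the sorted list of itself plus its mesh neighbours."""
    neighbourhoods = [{i} for i in range(num_vertices)]
    for a, b in edges:
        neighbourhoods[a].add(b)
        neighbourhoods[b].add(a)
    return [sorted(n) for n in neighbourhoods]


def graph_conv(features, neighbourhoods, weight):
    """One graph convolution: average features over each neighbourhood, then apply a linear map."""
    dim_in, dim_out = len(weight), len(weight[0])
    out = []
    for nbrs in neighbourhoods:
        mean = [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(dim_in)]
        out.append([sum(mean[k] * weight[k][c] for k in range(dim_in))
                    for c in range(dim_out)])
    return out


def refine(coarse_vertices, edges, weight):
    """Refinement stage (assumption): add a predicted per-vertex 3D offset to the coarse mesh."""
    nbrs = build_neighbourhoods(len(coarse_vertices), edges)
    offsets = graph_conv(coarse_vertices, nbrs, weight)
    return [[v + d for v, d in zip(vert, off)]
            for vert, off in zip(coarse_vertices, offsets)]


# Toy "coarse mesh": one triangle standing in for the coarse-stage output.
coarse = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
edges = [(0, 1), (1, 2), (0, 2)]

# With all-zero weights the predicted offsets vanish and the coarse mesh is returned unchanged.
zero_weight = [[0.0] * 3 for _ in range(3)]
refined = refine(coarse, edges, zero_weight)
```

Predicting a residual on top of the coarse mesh means the refinement stage only has to model what the low-dimensional parametric model cannot express. A trained network would replace the hand-set weight matrix and would typically take image features as additional per-vertex inputs.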
Taxonomy
Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Advanced Vision and Imaging
Methods: Graph Convolutional Network
