Joint Voxel and Coordinate Regression for Accurate 3D Facial Landmark Localization
Hongwen Zhang, Qi Li, Zhenan Sun

TL;DR
This paper introduces JVCR, an end-to-end method combining voxel and coordinate regression for precise 3D facial landmark localization, outperforming existing approaches on benchmark datasets.
Contribution
The paper proposes a novel joint voxel and coordinate regression framework that effectively encodes 3D landmarks and learns structural constraints end-to-end.
Findings
Achieves state-of-the-art accuracy on 3DFAW and AFLW2000-3D datasets.
Utilizes a compact volumetric representation whose dimensionality is fixed regardless of the number of landmarks, avoiding the curse of dimensionality.
Employs a stacked hourglass network for multi-scale volumetric estimation.
Abstract
3D face shape is more expressive and viewpoint-consistent than its 2D counterpart. However, 3D facial landmark localization in a single image is challenging due to the ambiguous nature of landmarks under 3D perspective. Existing approaches typically adopt a suboptimal two-step strategy, performing 2D landmark localization followed by depth estimation. In this paper, we propose the Joint Voxel and Coordinate Regression (JVCR) method for 3D facial landmark localization, addressing it more effectively in an end-to-end fashion. First, a compact volumetric representation is proposed to encode the per-voxel likelihood of positions being the 3D landmarks. The dimensionality of such a representation is fixed regardless of the number of target landmarks, so that the curse of dimensionality could be avoided. Then, a stacked hourglass network is adopted to estimate the volumetric representation…
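To make the "compact volumetric representation" concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each landmark contributes a 3D Gaussian blob and that all landmarks are merged into one shared volume by element-wise maximum. The grid size, `sigma`, the max-merge rule, and the function names `compact_volume` and `num_voxels` are illustrative assumptions; the abstract only states that the representation encodes per-voxel likelihoods and has a fixed dimensionality.

```python
import math


def compact_volume(landmarks, size=(16, 16, 8), sigma=1.0):
    """Encode all landmarks into ONE W x H x D volume of likelihoods in [0, 1].

    landmarks: iterable of (x, y, z) positions in voxel coordinates.
    Returns a nested list indexed as volume[z][y][x].
    Gaussian shape, sigma and grid size are assumptions for illustration.
    """
    w, h, d = size
    volume = [[[0.0] * w for _ in range(h)] for _ in range(d)]
    for lx, ly, lz in landmarks:
        for z in range(d):
            for y in range(h):
                for x in range(w):
                    dist2 = (x - lx) ** 2 + (y - ly) ** 2 + (z - lz) ** 2
                    likelihood = math.exp(-dist2 / (2.0 * sigma ** 2))
                    # Assumed merge rule: keep the strongest response per voxel,
                    # so every landmark shares the same volume.
                    if likelihood > volume[z][y][x]:
                        volume[z][y][x] = likelihood
    return volume


def num_voxels(volume):
    """Total number of entries in the nested-list volume."""
    return len(volume) * len(volume[0]) * len(volume[0][0])


# The volume size depends only on the grid, not on how many landmarks are encoded.
few = compact_volume([(4, 5, 2)])
many = compact_volume([(4, 5, 2), (10, 3, 6), (8, 12, 4)])
print(num_voxels(few), num_voxels(many))
```

Under these assumptions, one landmark and three landmarks produce volumes of identical size, which is the property the abstract credits with avoiding the curse of dimensionality. A shared volume by itself does not record which peak belongs to which landmark; the coordinate regression part of JVCR presumably resolves this, although the truncated abstract does not say how.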
Taxonomy
Topics: Face recognition and analysis · 3D Shape Modeling and Analysis · Face and Expression Recognition
Methods: 3D Convolution · Convolution
