TL;DR
This paper introduces a real-time 6DoF (six degrees of freedom) face pose estimation method that requires neither face detection nor landmark localization. The authors report that it outperforms state-of-the-art face pose estimators while running in real time.
Contribution
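The authors present a Faster R-CNN-based model that directly regresses the 6DoF pose of every face in a photo, with no preliminary face detection or landmark localization. They also explain how pose is kept consistent between the input photo and arbitrary crops of it, and show that face poses can replace bounding box labels when training a detector.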
Findings
Runs in real time with high accuracy
Outperforms state-of-the-art face pose estimators
Surpasses models of similar complexity on face detection benchmarks
Abstract
We propose real-time, six degrees of freedom (6DoF), 3D face pose estimation without face detection or landmark localization. We observe that estimating the 6DoF rigid transformation of a face is a simpler problem than facial landmark detection, often used for 3D face alignment. In addition, 6DoF offers more information than face bounding box labels. We leverage these observations to make multiple contributions: (a) We describe an easily trained, efficient, Faster R-CNN-based model which regresses 6DoF pose for all faces in the photo, without preliminary face detection. (b) We explain how pose is converted and kept consistent between the input photo and arbitrary crops created while training and evaluating our model. (c) Finally, we show how face poses can replace detection bounding box training labels. Tests on AFLW2000-3D and BIWI show that our method runs in real time and…
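To make point (c) of the abstract more concrete, the sketch below shows one way a 6DoF pose could yield a 2D bounding box: project a set of 3D face points with the pose and take the extent of the projections. This is an illustration only, not the authors' implementation. The rotation-vector parameterization, the pinhole camera with a single focal length, and every name in the code (`rotvec_to_matrix`, `project_points`, `bbox_from_pose`) are assumptions made for the example.

```python
import numpy as np


def rotvec_to_matrix(rvec):
    """Rodrigues' formula: rotation vector (axis * angle) -> 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)  # no rotation
    k = rvec / theta  # unit rotation axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])  # cross-product matrix of the axis
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def project_points(points_3d, rvec, tvec, focal, center):
    """Apply the rigid transform, then a pinhole projection (assumed camera model)."""
    R = rotvec_to_matrix(rvec)
    cam = np.asarray(points_3d, dtype=float) @ R.T + np.asarray(tvec, dtype=float)
    x = focal * cam[:, 0] / cam[:, 2] + center[0]
    y = focal * cam[:, 1] / cam[:, 2] + center[1]
    return np.stack([x, y], axis=1)


def bbox_from_pose(points_3d, rvec, tvec, focal, center):
    """Bounding box (x_min, y_min, x_max, y_max) enclosing the projected 3D points."""
    pts = project_points(points_3d, rvec, tvec, focal, center)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return x_min, y_min, x_max, y_max


# Hypothetical stand-in for a 3D face shape: the eight corners of a cube.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)],
                dtype=float)
box = bbox_from_pose(cube, rvec=[0.0, 0.3, 0.0], tvec=[0.0, 0.0, 10.0],
                     focal=100.0, center=(50.0, 50.0))
```

Under these assumptions the box follows from the pose and the 3D points alone, which illustrates why a pose label carries at least as much information as a bounding box label.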
