Infinite 3D Landmarks: Improving Continuous 2D Facial Landmark Detection
Prashanth Chandran, Gaspard Zoss, Paulo Gotardo, Derek Bradley

TL;DR
This paper introduces architectural enhancements for 2D facial landmark detection, including unsupervised face normalization, 3D output modeling, and semantic correction, leading to improved accuracy and stability.
Contribution
The paper proposes methods for unsupervised face normalization, 3D landmark inference, and correction of dataset annotation inconsistencies, advancing the state of the art in facial landmark detection.
Findings
Improved landmark detection accuracy on standard benchmarks.
Enhanced temporal stability of landmark predictions.
Effective handling of dataset annotation inconsistencies.
Abstract
In this paper, we examine 3 important issues in the practical use of state-of-the-art facial landmark detectors and show how a combination of specific architectural modifications can directly improve their accuracy and temporal stability. First, many facial landmark detectors require face normalization as a preprocessing step, which is accomplished by a separately-trained neural network that crops and resizes the face in the input image. There is no guarantee that this pre-trained network performs the optimal face normalization for landmark detection. We instead analyze the use of a spatial transformer network that is trained alongside the landmark detector in an unsupervised manner, and jointly learn optimal face normalization and landmark detection. Second, we show that modifying the output head of the landmark predictor to infer landmarks in a canonical 3D space can further improve…
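The face-normalization idea in the abstract can be illustrated with a small sketch of the warp at the heart of a spatial transformer: an affine transform maps a normalized output grid onto the input image, the image is sampled bilinearly, and landmarks predicted in the normalized crop are carried back to image coordinates through the same transform. This is a plain-Python illustration only, not the authors' implementation. The function names, the parameter layout of `theta`, and the hand-picked `theta` values are all assumptions made for the example. In the paper's setting the transform would be predicted by a network trained jointly with the landmark detector, which is not modeled here.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layout: (a, b, tx, c, d, ty) maps crop coords (u, v) in [-1, 1]
# to input-image pixel coords (x, y) = (a*u + b*v + tx, c*u + d*v + ty).
Affine = Tuple[float, float, float, float, float, float]


def apply_affine(theta: Affine, u: float, v: float) -> Tuple[float, float]:
    """Map a point from normalized crop space to image pixel space."""
    a, b, tx, c, d, ty = theta
    return a * u + b * v + tx, c * u + d * v + ty


def bilinear_sample(image: Sequence[Sequence[float]], x: float, y: float) -> float:
    """Sample the image at continuous pixel coords; pixels outside count as 0."""
    h, w = len(image), len(image[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    value = 0.0
    for yi, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
        for xi, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
            if 0 <= xi < w and 0 <= yi < h:
                value += wx * wy * image[yi][xi]
    return value


def normalize_face(image: Sequence[Sequence[float]], theta: Affine,
                   out_size: int) -> List[List[float]]:
    """Resample the region selected by theta into an out_size x out_size crop."""
    crop = []
    for j in range(out_size):
        v = -1.0 + 2.0 * j / (out_size - 1)
        row = []
        for i in range(out_size):
            u = -1.0 + 2.0 * i / (out_size - 1)
            x, y = apply_affine(theta, u, v)
            row.append(bilinear_sample(image, x, y))
        crop.append(row)
    return crop


def landmarks_to_image(theta: Affine,
                       landmarks: Sequence[Tuple[float, float]]
                       ) -> List[Tuple[float, float]]:
    """Carry landmarks predicted in crop space back to image pixel coords."""
    return [apply_affine(theta, u, v) for u, v in landmarks]


if __name__ == "__main__":
    # Toy 8x8 "image" whose pixel value encodes its position.
    image = [[x + 10 * y for x in range(8)] for y in range(8)]
    # Hand-picked transform (an assumption): a window centered at pixel (4, 3).
    theta = (2.0, 0.0, 4.0, 0.0, 2.0, 3.0)
    crop = normalize_face(image, theta, out_size=5)
    print(crop[2][2])
    print(landmarks_to_image(theta, [(0.0, 0.0)]))
```

Because the warp and the sampling are both smooth functions of `theta`, a framework with automatic differentiation can backpropagate a landmark loss through them, which is what makes it possible to train the normalization without separate supervision.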
Taxonomy
Methods: Spatial Transformer
