RePFormer: Refinement Pyramid Transformer for Robust Facial Landmark Detection
Jinpeng Li, Haibo Jin, Shengcai Liao, Ling Shao, Pheng-Ann Heng

TL;DR
RePFormer treats facial landmark detection as refining landmark queries along a pyramid of feature memories, modeling relations among landmarks and between landmarks and cross-scale context to improve robustness in complex real-world scenarios.
Contribution
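The paper proposes the Refinement Pyramid Transformer (RePFormer), which pairs a pyramid transformer head (PTH) that models relations among landmarks and between landmarks and cross-scale context with a dynamic landmark refinement (DLR) module that turns landmark regression into an end-to-end refinement procedure.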
Findings
Outperforms existing methods on four benchmarks
Demonstrates high robustness in complex real-world scenarios
Effective end-to-end landmark refinement process
Abstract
This paper presents a Refinement Pyramid Transformer (RePFormer) for robust facial landmark detection. Most facial landmark detectors focus on learning representative image features. However, these CNN-based feature representations are not robust enough to handle complex real-world scenarios due to ignoring the internal structure of landmarks, as well as the relations between landmarks and context. In this work, we formulate the facial landmark detection task as refining landmark queries along pyramid memories. Specifically, a pyramid transformer head (PTH) is introduced to build both homologous relations among landmarks and heterologous relations between landmarks and cross-scale contexts. Besides, a dynamic landmark refinement (DLR) module is designed to decompose the landmark regression into an end-to-end refinement procedure, where the dynamically aggregated queries are transformed…
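The idea of "refining landmark queries along pyramid memories" can be illustrated with a small sketch. This is not the authors' implementation: the abstract is truncated before it describes the DLR module in full, so the code below only assumes that landmark queries first attend to each other (standing in for the homologous relations), then attend to one pyramid level (standing in for the heterologous relations), and that a linear head turns each updated query into a coordinate offset. The function names, the coarse-to-fine ordering, the residual updates, and the `offset_heads` parameter are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, memory):
    """Single-head scaled dot-product attention; returns one vector per query."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, m)) / math.sqrt(dim) for m in memory]
        weights = softmax(scores)
        outputs.append([sum(w * m[d] for w, m in zip(weights, memory)) for d in range(dim)])
    return outputs


def add(xs, ys):
    """Element-wise residual sum of two lists of vectors."""
    return [[a + b for a, b in zip(x, y)] for x, y in zip(xs, ys)]


def refine_landmarks(queries, pyramid, coords, offset_heads):
    """Refine (x, y) coordinates once per pyramid level (assumed scheme).

    queries:      one feature vector per landmark
    pyramid:      list of memories, each a list of feature vectors
    coords:       initial [x, y] estimate per landmark
    offset_heads: per level, a dim x 2 weight matrix (hypothetical linear head)
    Returns the coordinates produced after each level.
    """
    history = []
    for memory, head in zip(pyramid, offset_heads):
        # Landmark-to-landmark relations: queries attend to each other.
        queries = add(queries, attend(queries, queries))
        # Landmark-to-context relations: queries attend to this pyramid level.
        queries = add(queries, attend(queries, memory))
        # Predict an offset from each query and add it to the current estimate.
        coords = [
            [c[k] + sum(q[d] * head[d][k] for d in range(len(q))) for k in range(2)]
            for c, q in zip(coords, queries)
        ]
        history.append(coords)
    return history


# Toy usage with random features: 5 landmarks, two pyramid levels.
random.seed(0)
dim, num_landmarks = 4, 5
rand_vec = lambda: [random.uniform(-1, 1) for _ in range(dim)]
queries = [rand_vec() for _ in range(num_landmarks)]
pyramid = [[rand_vec() for _ in range(n)] for n in (4, 16)]  # assumed coarse, then fine
coords = [[0.5, 0.5] for _ in range(num_landmarks)]
heads = [[[random.uniform(-0.01, 0.01) for _ in range(2)] for _ in range(dim)] for _ in pyramid]
for level, points in enumerate(refine_landmarks(queries, pyramid, coords, heads)):
    print(level, points[0])
```

The point of the sketch is the control flow: each level updates the same set of queries and adds a correction to the running coordinate estimate, rather than regressing landmarks in a single step. A real model would use learned multi-head attention and image features from a backbone in place of the random vectors.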
Taxonomy
Topics: Face recognition and analysis · Cleft Lip and Palate Research · Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Linear Layer · Softmax · Multi-Head Attention · Residual Connection · Dense Connections · Absolute Position Encodings · Dropout · Byte Pair Encoding · Adam
