ORFormer: Occlusion-Robust Transformer for Accurate Facial Landmark Detection

Jui-Che Chiang; Hou-Ning Hu; Bo-Syuan Hou; Chia-Yu Tseng; Yu-Lun Liu; Min-Hung Chen; Yen-Yu Lin

arXiv:2412.13174 · cs.CV · January 15, 2025

Open Access

TL;DR

ORFormer is a transformer-based facial landmark detection method that effectively identifies and recovers occluded facial regions, resulting in more accurate landmark localization under challenging conditions.

Contribution

The paper introduces ORFormer, a novel transformer architecture with messenger tokens that detect and recover occluded facial regions, improving landmark detection robustness.

Findings

01 Outperforms state-of-the-art methods on the WFLW and COFW datasets

02 Produces heatmaps resilient to partial occlusions

03 Enhances existing FLD methods with recovered features

Abstract

Although facial landmark detection (FLD) has made significant progress, existing FLD methods still suffer from performance drops on partially non-visible faces, such as faces with occlusions or under extreme lighting conditions or poses. To address this issue, we introduce ORFormer, a novel transformer-based method that can detect non-visible regions and recover their missing features from visible parts. Specifically, ORFormer associates each image patch token with one additional learnable token called the messenger token. The messenger token aggregates features from all but its own patch. This way, the consensus between a patch and other patches can be assessed by referring to the similarity between its regular and messenger embeddings, enabling non-visible region identification. Our method then recovers occluded patches with features aggregated by the messenger tokens. Leveraging the…
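
The messenger-token idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single attention head with no learned projections, and the function names (`messenger_attention`, `dissimilarity`, `recover`), the cosine-based score, and the hard `threshold` are all hypothetical choices made only for illustration. The one property taken from the abstract is that each messenger token aggregates features from every patch except its own.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; entries set to -inf receive zero weight.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def messenger_attention(patches, tokens):
    """patches, tokens: arrays of shape (N, D), one messenger token per patch.

    Assumption: plain dot-product attention without learned Q/K/V projections.
    Returns regular embeddings, messenger embeddings, and messenger weights.
    """
    n, d = patches.shape
    scale = 1.0 / np.sqrt(d)
    # Regular embedding: every patch attends to all patches, itself included.
    regular = softmax(patches @ patches.T * scale) @ patches
    # Messenger embedding: the token for patch i attends to all patches
    # except patch i, so the diagonal is masked out before the softmax.
    scores = tokens @ patches.T * scale
    scores[np.eye(n, dtype=bool)] = -np.inf
    weights = softmax(scores)
    messenger = weights @ patches
    return regular, messenger, weights

def dissimilarity(regular, messenger, eps=1e-8):
    # Hypothetical consensus measure: 1 - cosine similarity per patch.
    # A high value suggests the patch disagrees with the rest of the face.
    num = (regular * messenger).sum(axis=1)
    den = np.linalg.norm(regular, axis=1) * np.linalg.norm(messenger, axis=1)
    return 1.0 - num / (den + eps)

def recover(regular, messenger, score, threshold=0.5):
    # Hypothetical hard switch: replace low-consensus patches with the
    # features aggregated by their messenger tokens.
    flagged = (score > threshold)[:, None]
    return np.where(flagged, messenger, regular)
```

Because the diagonal is masked, changing the content of patch i leaves its own messenger embedding unchanged, which is what makes the regular-versus-messenger comparison usable as an occlusion cue in this sketch. A soft blending weight instead of a hard threshold would be an equally plausible design under the same assumptions.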

Taxonomy

Topics: Face recognition and analysis · Face and Expression Recognition