FGTBT: Frequency-Guided Task-Balancing Transformer for Unified Facial Landmark Detection
Jun Wan, Xinyu Xiong, Ning Chen, Zhihui Lai, Jie Zhou, Wenwen Min

TL;DR
This paper introduces FGTBT, a novel transformer-based framework for facial landmark detection that leverages frequency-domain modeling and multi-dataset training to improve accuracy in challenging scenarios.
Contribution
It proposes a frequency-guided structure-aware model and a fine-grained multi-task balancing loss for better facial structure perception and unified training across datasets.
Findings
Achieves performance comparable to state-of-the-art methods on benchmark datasets.
Effectively handles large pose, illumination, and expression variations.
Enhances facial structure learning through frequency-guided regularization.
Abstract
Recently, deep-learning-based facial landmark detection (FLD) methods have achieved considerable success. However, in challenging scenarios such as large pose variations, illumination changes, and facial expression variations, they still struggle to accurately capture the geometric structure of the face, resulting in performance degradation. Moreover, the limited size and diversity of existing FLD datasets hinder robust model training, leading to reduced detection accuracy. To address these challenges, we propose a Frequency-Guided Task-Balancing Transformer (FGTBT), which enhances facial structure perception through frequency-domain modeling and multi-dataset unified training. Specifically, we propose a novel Fine-Grained Multi-Task Balancing loss (FMB-loss), which moves beyond coarse task-level balancing by assigning weights to individual landmarks based on their occurrence across…
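The abstract is cut off before it states how the FMB-loss turns landmark occurrence into weights, so the following is only an illustrative sketch of per-landmark balancing, not the authors' formulation. It assumes that each dataset annotates a subset of a shared (unified) landmark index set, that "occurrence" means the number of datasets annotating a landmark, and that rarer landmarks receive larger weights via inverse-count weighting normalized to a mean of 1. The names `landmark_weights` and `weighted_l1` and the L1 error are hypothetical choices made for the example.

```python
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float]


def landmark_weights(dataset_landmarks: Dict[str, Set[int]],
                     num_landmarks: int) -> List[float]:
    """Assumed rule: weight each unified landmark by the inverse of the
    number of datasets that annotate it, scaled so annotated landmarks
    have a mean weight of 1. Landmarks seen in no dataset get weight 0."""
    counts = [0] * num_landmarks
    for ids in dataset_landmarks.values():
        for i in ids:
            counts[i] += 1
    raw = [1.0 / c if c > 0 else 0.0 for c in counts]
    annotated = sum(1 for c in counts if c > 0)
    total = sum(raw)
    scale = annotated / total if total > 0 else 0.0
    return [r * scale for r in raw]


def weighted_l1(pred: Sequence[Point], target: Sequence[Point],
                visible_ids: Set[int], weights: Sequence[float]) -> float:
    """Weighted mean L1 error over the landmarks this sample annotates.
    Landmarks missing from the sample's dataset are skipped entirely."""
    num = 0.0
    den = 0.0
    for i in visible_ids:
        (px, py), (tx, ty) = pred[i], target[i]
        num += weights[i] * (abs(px - tx) + abs(py - ty))
        den += weights[i]
    return num / den if den > 0 else 0.0


if __name__ == "__main__":
    # Two hypothetical datasets sharing landmarks 0 and 1.
    datasets = {"A": {0, 1, 2}, "B": {0, 1, 3}}
    w = landmark_weights(datasets, num_landmarks=4)
    print(w)  # landmarks 2 and 3 get larger weights than 0 and 1
```

Under these assumptions, a landmark annotated in only one dataset contributes more per sample than one annotated everywhere, which is one simple way to keep rarely labeled points from being under-trained in unified multi-dataset training.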
Taxonomy
Topics: Face recognition and analysis · Face Recognition and Perception · Emotion and Mood Recognition
