FGTBT: Frequency-Guided Task-Balancing Transformer for Unified Facial Landmark Detection

Jun Wan; Xinyu Xiong; Ning Chen; Zhihui Lai; Jie Zhou; Wenwen Min

arXiv:2601.12863 · cs.CV · January 21, 2026

TL;DR

This paper introduces FGTBT, a novel transformer-based framework for facial landmark detection that leverages frequency-domain modeling and multi-dataset training to improve accuracy in challenging scenarios.

Contribution

It proposes a frequency-guided structure-aware model and a fine-grained multi-task balancing loss for better facial structure perception and unified training across datasets.

Findings

01. Achieves performance comparable to state-of-the-art methods on benchmark datasets.

02. Effectively handles large pose, illumination, and expression variations.

03. Enhances facial structure learning through frequency-guided regularization.

Abstract

Recently, deep learning-based facial landmark detection (FLD) methods have achieved considerable success. However, in challenging scenarios such as large pose variations, illumination changes, and facial expression variations, they still struggle to accurately capture the geometric structure of the face, resulting in performance degradation. Moreover, the limited size and diversity of existing FLD datasets hinder robust model training, leading to reduced detection accuracy. To address these challenges, we propose a Frequency-Guided Task-Balancing Transformer (FGTBT), which enhances facial structure perception through frequency-domain modeling and multi-dataset unified training. Specifically, we propose a novel Fine-Grained Multi-Task Balancing loss (FMB-loss), which moves beyond coarse task-level balancing by assigning weights to individual landmarks based on their occurrence across…
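
The idea of weighting individual landmarks rather than whole tasks can be illustrated with a small sketch. This is not the paper's FMB-loss: the abstract is cut off before it states how occurrence is measured or how weights are derived, so the code below simply assumes that occurrence means "number of datasets that annotate a landmark" and that rarer landmarks receive proportionally larger weights. The function names, dataset names, and landmark names are all hypothetical.

```python
from collections import Counter


def landmark_weights(dataset_landmarks):
    """Per-landmark weights from occurrence counts (illustrative assumption).

    dataset_landmarks maps a dataset name to the landmark ids it annotates.
    Assumed rule, not taken from the paper: weight is inversely proportional
    to the number of datasets containing the landmark, rescaled to mean 1.
    """
    counts = Counter()
    for ids in dataset_landmarks.values():
        counts.update(set(ids))  # count each landmark once per dataset
    raw = {k: 1.0 / c for k, c in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {k: v / mean for k, v in raw.items()}


def weighted_l1(pred, target, ids, weights):
    """Weighted mean L1 error over the landmarks present in one sample.

    pred and target are lists of (x, y) pairs aligned with ids.
    """
    total = 0.0
    weight_sum = 0.0
    for (px, py), (tx, ty), k in zip(pred, target, ids):
        w = weights[k]
        total += w * (abs(px - tx) + abs(py - ty))
        weight_sum += w
    return total / weight_sum


# Hypothetical datasets: "chin" is annotated in only one of the two.
datasets = {
    "A": {"nose", "left_eye", "chin"},
    "B": {"nose", "left_eye"},
}
weights = landmark_weights(datasets)
loss = weighted_l1(
    pred=[(0.0, 0.0), (1.0, 1.0)],
    target=[(0.0, 0.0), (2.0, 1.0)],
    ids=["nose", "chin"],
    weights=weights,
)
```

Under this assumed rule, a landmark annotated in fewer datasets contributes more per sample, which is one plausible way to keep sparsely annotated points from being drowned out during unified multi-dataset training.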

Code & Models

Models

🤗 xxxxxxxxy/FGTBT (model)

Taxonomy

Topics: Face recognition and analysis · Face Recognition and Perception · Emotion and Mood Recognition