Teacher-Student Asynchronous Learning with Multi-Source Consistency for Facial Landmark Detection
Rongye Meng, Sanping Zhou, Xingyu Wan, Mengliu Li, Jinjun Wang

TL;DR
This paper introduces a teacher-student asynchronous learning framework for facial landmark detection that effectively filters noise in multi-source supervision signals, leading to state-of-the-art results on multiple benchmarks.
Contribution
It proposes a novel asynchronous learning framework utilizing multi-source supervision consistency, improving pseudo-label quality and detection accuracy.
Findings
Achieves state-of-the-art performance on 300W, AFLW, and 300VW benchmarks.
Effectively filters noise in multi-source supervision signals.
Demonstrates robustness and improved accuracy in facial landmark detection.
Abstract
Due to the high annotation cost of large-scale facial landmark detection in videos, researchers have proposed a semi-supervised paradigm that uses self-training to mine high-quality pseudo-labels for training. However, self-training methods typically train with a gradually increasing number of samples, and their performance varies considerably with the number of pseudo-labeled samples added. In this paper, we propose a teacher-student asynchronous learning (TSAL) framework based on the multi-source supervision signal consistency criterion, which implicitly mines pseudo-labels through consistency constraints. Specifically, the TSAL framework contains two models with exactly the same structure. The radical student uses multi-source supervision signals from the same task to update parameters, while the calm teacher uses a single-source supervision signal to…
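The split between the two models can be illustrated with a toy sketch. This is not the authors' implementation: the one-parameter linear model, the squared-error losses, the learning rate, the weight `lam`, and the choice of the teacher's predictions on unlabeled inputs as the student's second supervision source are all assumptions made for illustration. The sketch covers only what the visible part of the abstract states (two identically structured models, a student driven by several signals, a teacher driven by one) and does not model anything after the truncation.

```python
# Toy sketch (assumptions throughout): a 1-parameter "detector" y = w * x
# stands in for the landmark network. Not the paper's actual method.

def predict(w, x):
    return w * x

def grad_mse(w, xs, targets):
    # Gradient w.r.t. w of the mean squared error between w*x and the targets.
    return sum(2.0 * (w * x - t) * x for x, t in zip(xs, targets)) / len(xs)

def tsal_like_step(w_student, w_teacher, labeled, unlabeled, lr=0.05, lam=0.5):
    xs, ys = labeled
    # Teacher ("calm"): single-source supervision, ground-truth labels only.
    new_teacher = w_teacher - lr * grad_mse(w_teacher, xs, ys)
    # Student ("radical"): multi-source supervision. Source 1 is the labels;
    # source 2 (ASSUMED here) is consistency with the teacher's predictions
    # on unlabeled inputs, which play the role of implicit pseudo-labels.
    teacher_targets = [predict(w_teacher, x) for x in unlabeled]
    g = grad_mse(w_student, xs, ys) + lam * grad_mse(w_student, unlabeled, teacher_targets)
    new_student = w_student - lr * g
    return new_student, new_teacher

def run_demo(steps=200):
    true_w = 2.0  # hypothetical ground-truth parameter for the toy data
    xs = [0.5, 1.0, 1.5, 2.0]
    labeled = (xs, [true_w * x for x in xs])
    unlabeled = [0.2, 0.8, 1.2]
    w_student, w_teacher = 0.0, 0.0  # same structure, same initialisation
    for _ in range(steps):
        w_student, w_teacher = tsal_like_step(w_student, w_teacher, labeled, unlabeled)
    return w_student, w_teacher

if __name__ == "__main__":
    print(run_demo())
```

In this toy setting both parameters settle near the hypothetical true value; the point is only to show the two update rules side by side, with the student's extra consistency term weighted by `lam`.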
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Speech and Audio Processing
