SegFace: Face Segmentation of Long-Tail Classes
Kartik Narayan, Vibashan VS, Vishal M. Patel

TL;DR
SegFace introduces a transformer-based face segmentation model that effectively handles long-tail classes, significantly improving segmentation accuracy for infrequent facial regions while maintaining high efficiency for real-time applications.
Contribution
This work is the first to apply transformer models with class-specific tokens to face parsing, addressing long-tail class segmentation issues and enabling efficient, high-performance face segmentation.
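To make the idea of class-specific tokens concrete, here is a minimal sketch in plain Python. It is not the SegFace architecture, whose details are not given on this page. It only illustrates a common pattern, under stated assumptions: each class owns one token vector, the token cross-attends to per-pixel features, and each pixel is labeled by the token it matches best. The names `cross_attend` and `predict_labels`, the dimensions, and the random vectors standing in for trained tokens and encoder features are all hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attend(token, pixels):
    """Update one class token with an attention-weighted average of pixel features.

    Assumption: single head, no learned projections, plain residual connection.
    """
    scale = math.sqrt(len(token))
    weights = softmax([dot(token, p) / scale for p in pixels])
    attended = [
        sum(w * p[d] for w, p in zip(weights, pixels)) for d in range(len(token))
    ]
    return [t + a for t, a in zip(token, attended)]


def predict_labels(tokens, pixels):
    """Label each pixel with the class whose updated token scores highest."""
    updated = [cross_attend(t, pixels) for t in tokens]
    return [
        max(range(len(updated)), key=lambda c: dot(updated[c], p)) for p in pixels
    ]


# Hypothetical toy sizes; in a trained model the tokens would be learned parameters.
random.seed(0)
NUM_CLASSES, DIM, NUM_PIXELS = 3, 4, 6
tokens = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_CLASSES)]
pixels = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_PIXELS)]

labels = predict_labels(tokens, pixels)
print(labels)  # one class index per pixel
```

Because every class has its own token, a rare class such as earrings keeps a dedicated representation rather than sharing capacity with frequent classes, which is the intuition the contribution statement points to.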
Findings
Achieves a mean F1 score of 88.96 on CelebAMask-HQ
Outperforms previous state-of-the-art models
Runs at 95.96 FPS on edge devices
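The mean F1 figure above averages a per-class F1 score, so every class counts equally regardless of how many pixels it covers. The sketch below shows why that matters for long-tail classes. It assumes flat lists of integer labels and skips classes absent from both prediction and ground truth; the paper's exact evaluation protocol (for example, whether background is included) is not stated on this page, and the function names are illustrative.

```python
def per_class_f1(pred, target, num_classes):
    """Return one F1 score per class, or None for a class absent from both lists."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, target) if p != c and t == c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else None)
    return scores


def mean_f1(pred, target, num_classes):
    """Unweighted mean of the per-class F1 scores that are defined."""
    scores = [s for s in per_class_f1(pred, target, num_classes) if s is not None]
    return sum(scores) / len(scores)


# Toy image: class 0 is a head class (skin), class 1 is a long-tail class (earring).
target = [0] * 18 + [1] * 2
pred = [0] * 20  # the model ignores the earring entirely

pixel_acc = sum(p == t for p, t in zip(pred, target)) / len(target)
f1_scores = per_class_f1(pred, target, num_classes=2)
mean = mean_f1(pred, target, num_classes=2)

print(f"pixel accuracy: {pixel_acc:.3f}")
print(f"per-class F1:   {f1_scores}")
print(f"mean F1:        {mean:.3f}")
```

In this toy case pixel accuracy stays at 0.9 even though the earring class is missed completely, while the mean F1 drops below 0.5, which is why a class-averaged metric is the more informative one for long-tail segmentation.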
Abstract
Face parsing refers to the semantic segmentation of human faces into key facial regions such as eyes, nose, hair, etc. It serves as a prerequisite for various advanced applications, including face editing, face swapping, and facial makeup, which often require segmentation masks for classes like eyeglasses, hats, earrings, and necklaces. These infrequently occurring classes are called long-tail classes, which are overshadowed by more frequently occurring classes known as head classes. Existing methods, primarily CNN-based, tend to be dominated by head classes during training, resulting in suboptimal representation for long-tail classes. Previous works have largely overlooked the problem of poor segmentation performance of long-tail classes. To address this issue, we propose SegFace, a simple and efficient approach that uses a lightweight transformer-based model which utilizes learnable…
Taxonomy
Topics: Face and Expression Recognition · Machine Learning and Data Classification
Methods: Focus
