A Generative Framework for Self-Supervised Facial Representation Learning
Ruian He, Zhen Xing, Weimin Tan, Bo Yan

TL;DR
This paper introduces LatentFace, a generative self-supervised framework that uses a 3D-aware latent diffusion model to improve facial representation learning, achieving state-of-the-art results in facial expression recognition (FER) and face verification.
Contribution
The paper proposes a novel 3D-aware generative approach with a disentangled latent space for facial identity and expression, advancing self-supervised facial representation learning.
Findings
Achieves 3.75% higher FER accuracy on RAF-DB than prior state-of-the-art methods.
Achieves 3.35% higher FER accuracy on AffectNet than prior state-of-the-art methods.
Outperforms previous self-supervised methods in face verification.
Abstract
Self-supervised representation learning has gained increasing attention for strong generalization ability without relying on paired datasets. However, it has not been explored sufficiently for facial representation. Self-supervised facial representation learning remains unsolved due to the coupling of facial identities, expressions, and external factors like pose and light. Prior methods primarily focus on contrastive learning and pixel-level consistency, leading to limited interpretability and suboptimal performance. In this paper, we propose LatentFace, a novel generative framework for self-supervised facial representations. We suggest that the disentangling problem can also be formulated as generative objectives in space and time, and propose the solution using a 3D-aware latent diffusion model. First, we introduce a 3D-aware autoencoder to encode face images into 3D latent…
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Facial Nerve Paralysis Treatment and Research
Methods: Contrastive Learning · Focus · Diffusion
