CLERF: Contrastive LEaRning for Full Range Head Pose Estimation
Ting-Ruen Wei, Haowei Liu, Huei-Chung Hu, Xuyang Wu, Yi Fang, Hsin-Tai Wu

TL;DR
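This paper presents CLERF, a contrastive learning framework that uses 3D-aware GANs to sample training triplets for full-range head pose estimation, a step that sparse head pose data previously made infeasible. It performs on par with state-of-the-art models on standard test datasets and outperforms them on rotated or flipped images.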
Contribution
Introduces a contrastive learning approach that uses 3D-aware GANs to sample triplets, enabling accurate full-range head pose estimation, including upside-down poses, and outperforming existing models on rotated and flipped images.
Findings
Performs on par with state-of-the-art on standard datasets.
Outperforms existing models on rotated/flipped images.
Reported as the first method to accurately predict full-range head poses, including upside-down heads.
Abstract
We introduce a novel framework for representation learning in head pose estimation (HPE). Previously such a scheme was difficult due to head pose data sparsity, making triplet sampling infeasible. Recent progress in 3D generative adversarial networks (3D-aware GAN) has opened the door for easily sampling triplets (anchor, positive, negative). We perform contrastive learning on extensively augmented data including geometric transformations and demonstrate that contrastive learning allows networks to learn genuine features that contribute to accurate HPE. On the other hand, we observe that existing HPE works struggle to predict head poses as accurately when test image rotation matrices are slightly out of the training dataset distribution. Experiments show that our methodology performs on par with state-of-the-art models on standard test datasets and outperforms them when images are…
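The triplet idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule that the positive is the candidate whose pose is closer to the anchor's, the Euclidean embedding distance, the margin value, and all function names (rot_z, geodesic_deg, triplet_loss) are assumptions made here for illustration.

```python
import numpy as np

def rot_z(deg):
    """Rotation matrix for an in-plane rotation of `deg` degrees about the z-axis."""
    t = np.deg2rad(deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def geodesic_deg(Ra, Rb):
    """Angle in degrees of the relative rotation between two rotation matrices."""
    cos = (np.trace(Ra.T @ Rb) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on embedding vectors (margin is an assumed value)."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return float(max(0.0, d_pos - d_neg + margin))

# Hypothetical pose labels: the anchor is upright, candidate A is rotated
# slightly, candidate B is upside-down.
pose_anchor, pose_a, pose_b = rot_z(0), rot_z(5), rot_z(180)

# Assumed rule: the candidate with the nearer pose acts as the positive.
a_is_positive = geodesic_deg(pose_anchor, pose_a) < geodesic_deg(pose_anchor, pose_b)

# Toy 2-D embeddings standing in for network outputs.
emb_anchor = np.array([0.0, 0.0])
emb_a = np.array([0.0, 0.1])
emb_b = np.array([1.0, 0.0])

if a_is_positive:
    loss = triplet_loss(emb_anchor, emb_a, emb_b)
else:
    loss = triplet_loss(emb_anchor, emb_b, emb_a)
print(loss)
```

The loss is zero once the positive embedding sits closer to the anchor than the negative by at least the margin, so training pushes images with similar poses together in feature space and images with different poses apart.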
Taxonomy
Topics: Human Pose and Action Recognition · Speech and Audio Processing · Hand Gesture Recognition Systems
Methods: Contrastive Learning
