How Self-Supervised Learning Can be Used for Fine-Grained Head Pose Estimation?
Mahdi Pourmirzaei, Farzaneh Esmaili, Ebrahim Mousavi, Sasan Karamizadeh, Seyedehsamaneh Shojaeilangari

TL;DR
This paper explores the use of self-supervised learning (SSL) for fine-grained head pose estimation, demonstrating that combining SSL pre-training and auxiliary SSL losses significantly improves accuracy.
Contribution
The paper introduces a Hybrid Multi-Task Learning (HMTL) architecture, built on a ResNet50 backbone, that applies both SSL strategies to head pose estimation and outperforms the baseline.
Findings
SSL methods improve transfer learning for head pose estimation (HPE).
Combining SSL pre-training with auxiliary SSL losses yields the best results.
Average error is reduced by up to 23.1% on AFLW2000 and 14.2% on BIWI compared to the baseline.
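The percentages above read most naturally as relative reductions in mean error against the baseline. That reading is an assumption, since this summary does not define the metric. A minimal sketch of the calculation follows; the helper name `relative_error_reduction` is hypothetical and the numbers are placeholders, not results from the paper.

```python
def relative_error_reduction(baseline_error: float, model_error: float) -> float:
    """Return the percentage by which model_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - model_error) / baseline_error


# Placeholder values for illustration only; NOT taken from the paper.
reduction = relative_error_reduction(baseline_error=5.0, model_error=4.0)
print(f"{reduction:.1f}% reduction")
```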
Abstract
The cost of head pose labeling is the main obstacle to improving fine-grained Head Pose Estimation (HPE). Although Self-Supervised Learning (SSL) can compensate for the lack of large amounts of labeled data, its efficacy for fine-grained HPE is not yet fully explored. This study assesses the use of SSL in fine-grained HPE under two scenarios: (1) using SSL to pre-train the weights, and (2) adding auxiliary SSL losses alongside the HPE loss. We design a Hybrid Multi-Task Learning (HMTL) architecture based on the ResNet50 backbone in which both strategies are applied. Our experimental results reveal that the combination of both scenarios is the best for HPE. Together, they reduce the average error rate by up to 23.1% on the AFLW2000 benchmark and 14.2% on the BIWI benchmark compared to the baseline. Moreover, it is found that some SSL methods are more suitable for transfer learning, while…
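The second scenario, training with auxiliary SSL losses next to the HPE objective, can be sketched as a weighted sum of losses. This is only an illustration under stated assumptions, not the authors' implementation: the use of mean absolute error for the pose term, the function names, the `jigsaw` key (chosen because this page lists Jigsaw under Methods), and the weight value are all hypothetical.

```python
from typing import Dict, Sequence


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error over paired values, e.g. yaw, pitch, roll in degrees."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def hybrid_loss(
    hpe_loss: float,
    aux_losses: Dict[str, float],
    aux_weights: Dict[str, float],
) -> float:
    """Supervised HPE loss plus a weighted sum of auxiliary SSL losses.

    Auxiliary losses without an entry in aux_weights contribute nothing.
    """
    total = hpe_loss
    for name, value in aux_losses.items():
        total += aux_weights.get(name, 0.0) * value
    return total


# Illustrative numbers only; none of these values come from the paper.
pose_loss = mean_absolute_error([10.0, -5.0, 2.0], [12.0, -4.0, 2.0])
total = hybrid_loss(pose_loss, {"jigsaw": 0.8}, {"jigsaw": 0.5})
print(total)
```

In a real training loop the scalar losses would be tensors produced by a shared backbone with one head per task, and the first scenario (SSL pre-training) would determine how that backbone is initialized before this combined loss is optimized.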
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
Methods: Jigsaw
