LwPosr: Lightweight Efficient Fine-Grained Head Pose Estimation
Naina Dhingra

TL;DR
This paper introduces LwPosr, a lightweight and efficient neural network combining depthwise separable convolutions and transformer encoders for fine-grained head pose estimation, suitable for mobile devices.
Contribution
LwPosr is the first to combine depthwise separable convolutions (DSCs) and transformer encoders for head pose estimation, achieving state-of-the-art performance among lightweight models.
Findings
Outperforms previous lightweight models in accuracy.
Uses fewer parameters than existing methods.
Validated on three open-source datasets: 300W-LP, AFLW2000, and BIWI.
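To give a rough sense of why depthwise separable convolutions reduce the parameter count, the sketch below compares the weights in one standard convolution with those in a depthwise-plus-pointwise pair. The channel sizes and kernel size are hypothetical values chosen for illustration, not LwPosr's actual configuration, and biases are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 128, 3

std = standard_conv_params(c_in, c_out, k)
dsc = depthwise_separable_params(c_in, c_out, k)
ratio = dsc / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, depthwise separable: {dsc}, ratio: {ratio:.3f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is the usual source of the savings in DSC-based networks.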
Abstract
This paper presents a lightweight network for the head pose estimation (HPE) task. While previous approaches rely on convolutional neural networks, the proposed network, LwPosr, uses a mixture of depthwise separable convolutional (DSC) and transformer encoder layers, structured in two streams and three stages, to provide fine-grained regression for predicting head poses. Quantitative and qualitative results are provided to show that the proposed network learns head poses efficiently while using fewer parameters. Extensive ablations are conducted on three open-source datasets, namely 300W-LP, AFLW2000, and BIWI. To our knowledge, (1) LwPosr is the lightest network proposed for estimating head poses compared to both keypoints-based and keypoints-free approaches; (2) it sets a benchmark for both outperforming the previous lightweight…
Taxonomy
Topics: Face recognition and analysis · Human Pose and Action Recognition · Orthodontics and Dentofacial Orthopedics
