WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose

Yijun Zhou; James Gregson

arXiv:2005.10353 · cs.CV · September 24, 2020 · 60 cites

TL;DR

WHENet is a real-time, compact neural network capable of accurately estimating head pose across the full yaw range from a single RGB image, suitable for mobile and real-world applications.

Contribution

The paper introduces WHENet, the first fine-grained head pose estimation network effective for the entire yaw range, with new training strategies and ground truth labeling from a panoptic dataset.

Findings

01. Outperforms existing methods on full-range head pose estimation

02. Achieves real-time inference suitable for mobile devices

03. Matches or exceeds state-of-the-art accuracy for frontal head pose

Abstract

We present an end-to-end head-pose estimation network designed to predict Euler angles through the full range of head yaws from a single RGB image. Existing methods perform well for frontal views but few target head pose from all viewpoints. This has applications in autonomous driving and retail. Our network builds on multi-loss approaches with changes to loss functions and training strategies adapted to wide range estimation. Additionally, we extract ground truth labelings of anterior views from a current panoptic dataset for the first time. The resulting Wide Headpose Estimation Network (WHENet) is the first fine-grained modern method applicable to the full range of head yaws (hence wide) yet also meets or beats state-of-the-art methods for frontal head pose estimation. Our network is compact and efficient for mobile devices and applications.
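The abstract does not spell out how the loss functions were changed, but one reason a full-range yaw estimate needs special handling is wrap-around: yaws of 179° and -179° are only 2° apart, yet a naive difference reports 358°. The following is a minimal sketch of a wrapped angular error under that assumption. The function name `wrapped_abs_error` is hypothetical and this is not taken from the paper's implementation.

```python
def wrapped_abs_error(pred_deg: float, true_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees.

    Illustrative helper (hypothetical name), not the authors' code.
    The result always lies in [0, 180].
    """
    # Reduce the raw difference to [0, 360).
    diff = abs(pred_deg - true_deg) % 360.0
    # Going the other way around the circle may be shorter.
    return min(diff, 360.0 - diff)


# A naive difference would report 358 degrees here.
print(wrapped_abs_error(179.0, -179.0))
# Away from the wrap-around point it matches the plain absolute error.
print(wrapped_abs_error(10.0, 30.0))
```

For frontal poses, where yaw stays well inside ±90°, this measure coincides with the ordinary absolute error, so it only changes behavior near the back-of-head views that a full-range method has to cover.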

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · Robot Manipulation and Learning