Face-from-Depth for Head Pose Estimation on Depth Images
Guido Borghi, Matteo Fabbri, Roberto Vezzani, Simone Calderara, Rita Cucchiara

TL;DR
This paper introduces a depth-image-based system for head and shoulder pose estimation using a CNN and face hallucination, achieving real-time performance and outperforming recent methods on multiple datasets.
Contribution
The novel framework combines depth-based pose estimation with a face hallucination module, improving accuracy and robustness in challenging conditions.
Findings
Outperforms recent state-of-the-art methods on public datasets.
Operates in real-time at over 30 frames per second.
Effectively handles challenging automotive-like scenarios.
Abstract
Depth cameras enable reliable solutions for people monitoring and behavior understanding, especially when unstable or poor illumination makes common RGB sensors unusable. We therefore propose a complete framework for estimating head and shoulder pose from depth images only. A head detection and localization module is also included to form a complete end-to-end system. The core element of the framework is a Convolutional Neural Network, called POSEidon+, that receives three types of images as input and outputs the 3D angles of the pose. Moreover, a Face-from-Depth component based on a Deterministic Conditional GAN model hallucinates a face from the corresponding depth image. We empirically demonstrate that this improves system performance. We test the proposed framework on two public datasets, namely Biwi…
