Multi-modal Face Pose Estimation with Multi-task Manifold Deep Learning
Chaoqun Hong, Jun Yu

TL;DR
This paper introduces M^2DL, a multi-modal, multi-task deep learning framework for face pose estimation. It uses manifold regularization and multi-modal data to handle complex backgrounds and varied face orientations.
Contribution
The paper proposes a deep learning framework that combines multi-modal data and multi-task learning with manifold regularization to improve face pose estimation.
Findings
Outperforms existing methods on DPOSE, HPID, and BKHPD datasets
Effectively handles complex backgrounds and diverse face orientations
Demonstrates superior accuracy and robustness in benchmark tests
Abstract
Human face pose estimation aims to estimate gaze direction or head posture from 2D images. It provides important cues for tasks such as interpreting communicative gestures and saliency detection, and it has attracted considerable attention recently. However, it is challenging because of complex backgrounds, varied orientations, and limited visibility of the face. A descriptive representation of face images and a mapping from that representation to poses are therefore critical. In this paper, we make use of multi-modal data and propose a face pose estimation method built on a novel deep learning framework named Multi-task Manifold Deep Learning (M^2DL). It is based on feature extraction with improved deep neural networks and on a multi-modal mapping relationship learned with multi-task learning. In the proposed deep learning based framework, Manifold Regularized Convolutional Layers (MRCL) improve traditional convolutional…
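The abstract is truncated before it says how MRCL is defined, so the following is only a generic illustration of manifold regularization, not the paper's formulation. It assumes the common graph-Laplacian form, in which features of neighbouring samples are penalized for differing: the penalty is tr(F^T L F), where L = D - W is the Laplacian of a k-nearest-neighbour graph. The function names `knn_graph_laplacian` and `manifold_penalty`, the binary edge weights, and the choice of k are all hypothetical.

```python
import numpy as np

def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetrized kNN graph (illustrative)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbour
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]  # indices of the k closest samples
        W[i, nbrs] = 1.0              # binary edge weights (an assumption)
    W = np.maximum(W, W.T)            # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def manifold_penalty(F, L):
    """tr(F^T L F), equal to 0.5 * sum_ij W_ij * ||F_i - F_j||^2."""
    return float(np.trace(F.T @ L @ F))

# Toy usage: 6 samples with 3-d inputs and 4-d learned features.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
F = rng.normal(size=(6, 4))
L = knn_graph_laplacian(X, k=2)
penalty = manifold_penalty(F, L)
```

In a training loop, a term like this would typically be added to the task loss with a weighting coefficient, so that features vary smoothly along the data manifold. How M^2DL integrates such a term into its convolutional layers is not recoverable from the text above.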
Taxonomy
Topics: Face recognition and analysis · Video Surveillance and Tracking Methods · Face and Expression Recognition
