Expression-preserving face frontalization improves visually assisted speech processing
Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda

TL;DR
This paper introduces a face frontalization method that preserves facial expressions by modeling non-rigid deformations, which improves visually assisted speech processing tasks such as lip reading and speech enhancement.
Contribution
It proposes a deformation-preserving frontalization technique that combines the generalized Student t-distribution with a linear dynamic system, so that the fit tolerates non-Gaussian errors while tracking both rigid head motion and time-varying facial deformations.
Findings
Improves word recognition in lip reading tasks.
Enhances speech intelligibility in noisy environments.
Outperforms state-of-the-art face frontalization methods.
Abstract
Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution of this paper is a frontalization methodology that preserves non-rigid facial deformations in order to boost the performance of visually assisted speech communication. The method alternates between the estimation of (i) the rigid transformation (scale, rotation, and translation) and (ii) the non-rigid deformation between an arbitrarily-viewed face and a face model. The method has two important merits: it can deal with non-Gaussian errors in the data and it incorporates a dynamical face deformation model. For that purpose, we use the generalized Student t-distribution in combination with a linear dynamic system in order to account for both rigid head motions and time-varying facial deformations caused by speech production. We propose to use the zero-mean normalized…
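To make the rigid step of the abstract more concrete, the following is a minimal sketch of a robust estimate of scale, rotation, and translation between two sets of 3D landmarks. It is not the authors' implementation: it assumes a standard (not generalized) Student t-distribution with a fixed number of degrees of freedom `nu` and an isotropic scale, it uses a weighted closed-form similarity fit inside an iteratively reweighted loop, and it omits the non-rigid deformation and the linear dynamic system entirely. The function names `weighted_similarity` and `robust_similarity_fit` are hypothetical.

```python
import numpy as np


def weighted_similarity(X, Y, w):
    """Weighted least-squares s, R, t such that Y ~ s * X @ R.T + t."""
    w = w / w.sum()
    mx, my = w @ X, w @ Y                      # weighted centroids
    Xc, Yc = X - mx, Y - my
    S = (Yc * w[:, None]).T @ Xc               # weighted cross-covariance
    U, D, Vt = np.linalg.svd(S)
    fix = np.ones(X.shape[1])
    fix[-1] = np.sign(np.linalg.det(U) * np.linalg.det(Vt))  # avoid reflections
    R = (U * fix) @ Vt
    var_x = w @ (Xc ** 2).sum(axis=1)
    s = (D * fix).sum() / var_x
    t = my - s * R @ mx
    return s, R, t


def robust_similarity_fit(X, Y, nu=3.0, n_iter=30):
    """Reweighted fit with Student-t weights (nu fixed: an assumption of this sketch)."""
    n, d = X.shape
    w = np.ones(n)
    for _ in range(n_iter):
        s, R, t = weighted_similarity(X, Y, w)
        r2 = ((Y - (s * X @ R.T + t)) ** 2).sum(axis=1)   # squared residuals
        sigma2 = max((w @ r2) / (d * n), 1e-12)           # isotropic scale update
        w = (nu + d) / (nu + r2 / sigma2)                 # large residual -> small weight
    return s, R, t, w
```

Landmarks with large residuals receive small weights, so a few badly localized points have limited influence on the estimated pose. That down-weighting is the practical effect of replacing a Gaussian error model with a heavy-tailed one.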
