Expression-preserving face frontalization improves visually assisted speech processing

Zhiqi Kang; Mostafa Sadeghi; Radu Horaud; Xavier Alameda-Pineda

arXiv:2204.02810·cs.CV·November 21, 2023

TL;DR

This paper introduces a face frontalization method that preserves facial expressions by explicitly modeling non-rigid facial deformations, which improves visually assisted speech processing tasks such as lip reading and speech enhancement.

Contribution

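The paper proposes a deformation-preserving frontalization technique that combines the generalized Student t-distribution, which handles non-Gaussian errors in the data, with a linear dynamic system that models time-varying facial deformations. The result improves speech-related visual tasks over existing frontalization methods.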

Findings

01. Improves word recognition in lip reading tasks.

02. Enhances speech intelligibility in noisy environments.

03. Outperforms state-of-the-art face frontalization methods.

Abstract

Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution of this paper is a frontalization methodology that preserves non-rigid facial deformations in order to boost the performance of visually assisted speech communication. The method alternates between the estimation of (i) the rigid transformation (scale, rotation, and translation) and (ii) the non-rigid deformation between an arbitrarily-viewed face and a face model. The method has two important merits: it can deal with non-Gaussian errors in the data and it incorporates a dynamical face deformation model. For that purpose, we use the generalized Student t-distribution in combination with a linear dynamic system in order to account for both rigid head motions and time-varying facial deformations caused by speech production. We propose to use the zero-mean normalized…
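To make the rigid step of the abstract more concrete, the following is a minimal sketch of robust estimation of scale, rotation, and translation between two sets of 3D landmarks. It is not the authors' implementation. It assumes a standard Student t-distribution with a fixed number of degrees of freedom `nu` and an isotropic variance (the paper uses a generalized Student t-distribution), and it omits the non-rigid deformation and the linear dynamic system entirely. The function names `weighted_similarity` and `robust_similarity` are hypothetical.

```python
import numpy as np


def weighted_similarity(X, Y, w):
    """Weighted least-squares fit of s, R, t such that Y[i] ~ s * R @ X[i] + t.

    X, Y: (n, 3) arrays of corresponding landmarks; w: (n,) positive weights.
    This is the closed-form Procrustes/Umeyama solution with per-point weights.
    """
    w = w / w.sum()
    mx, my = w @ X, w @ Y                  # weighted centroids
    Xc, Yc = X - mx, Y - my
    S = (Yc * w[:, None]).T @ Xc           # weighted cross-covariance (3 x 3)
    U, D, Vt = np.linalg.svd(S)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    C = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ C @ Vt
    var_x = (w * (Xc ** 2).sum(axis=1)).sum()
    s = (D * np.diag(C)).sum() / var_x
    t = my - s * R @ mx
    return s, R, t


def robust_similarity(X, Y, nu=3.0, n_iter=30):
    """EM-style iteratively reweighted fit under a Student-t error model.

    Assumption: fixed degrees of freedom `nu` and isotropic variance sigma2.
    Points with large residuals receive small weights, so gross landmark
    errors have little influence on the estimated transformation.
    """
    n, d = X.shape
    w = np.ones(n)
    for _ in range(n_iter):
        # M-step: weighted closed-form alignment, then variance update.
        s, R, t = weighted_similarity(X, Y, w)
        r2 = ((Y - (s * X @ R.T + t)) ** 2).sum(axis=1)
        sigma2 = max((w * r2).sum() / (n * d), 1e-12)
        # E-step: Student-t posterior weights, (nu + d) / (nu + Mahalanobis^2).
        w = (nu + d) / (nu + r2 / sigma2)
    return s, R, t, w
```

Calling `robust_similarity(X, Y)` on two `(n, 3)` landmark arrays returns the estimated scale, rotation matrix, translation, and the final per-landmark weights. Inspecting the weights is a simple way to see which landmarks the model treated as outliers. In the full method this rigid step would alternate with a non-rigid deformation estimate, which this sketch does not cover.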
