Geometry-guided Dense Perspective Network for Speech-Driven Facial Animation

Jingying Liu; Binyuan Hui; Kun Li; Yunke Liu; Yu-Kun Lai; Yuxiang Zhang; Yebin Liu; Jingyu Yang

arXiv:2008.10004·cs.GR·August 25, 2020

TL;DR

This paper introduces GDPnet, a novel deep learning architecture that leverages geometry guidance and attention mechanisms to produce realistic, speaker-independent 3D facial animations driven by speech, with improved accuracy and generalization.

Contribution

The paper presents a new geometry-guided dense perspective network with attention and non-linear face reconstruction for enhanced speech-driven 3D facial animation.

Findings

1. GDPnet outperforms state-of-the-art models on public and real datasets.

2. The geometry-guided approach improves deformation accuracy and generalization.

3. Attention mechanisms enhance feature response calibration.

Abstract

Realistic speech-driven 3D facial animation is a challenging problem due to the complex relationship between speech and face. In this paper, we propose a deep architecture, called Geometry-guided Dense Perspective Network (GDPnet), to achieve speaker-independent realistic 3D facial animation. The encoder is designed with dense connections to strengthen feature propagation and encourage the re-use of audio features, and the decoder is integrated with an attention mechanism to adaptively recalibrate point-wise feature responses by explicitly modeling interdependencies between different neuron units. We also introduce a non-linear face reconstruction representation as a guidance of latent space to obtain more accurate deformation, which helps solve the geometry-related deformation and is good for generalization across subjects. Huber and HSIC (Hilbert-Schmidt Independence Criterion)…
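
The abstract names Huber and HSIC but is cut off before it says how they are used, so the sketch below makes no claim about the paper's actual loss. It only illustrates the two standard textbook quantities: the Huber penalty on a scalar residual and the biased empirical HSIC estimate with a Gaussian kernel. The function names, the `delta` threshold, and the `sigma` bandwidth are illustrative assumptions, not values from the paper.

```python
import math


def huber(residual, delta=1.0):
    """Huber penalty: quadratic for small residuals, linear for large ones.

    `delta` is an assumed threshold, not a value taken from the paper.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def rbf_gram(samples, sigma=1.0):
    """Gaussian (RBF) Gram matrix for a list of equal-length vectors."""
    def k(u, v):
        d2 = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
        return math.exp(-d2 / (2.0 * sigma ** 2))
    return [[k(u, v) for v in samples] for u in samples]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_mean = [sum(r) / n for r in gram]
    total_mean = sum(row_mean) / n
    return [
        [gram[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2.

    Values near zero suggest the paired samples are close to independent
    under the chosen kernel; larger values suggest dependence.
    """
    n = len(xs)
    kc = center(rbf_gram(xs, sigma))
    l = rbf_gram(ys, sigma)
    # trace(H K H L) equals the element-wise sum because L is symmetric.
    return sum(kc[i][j] * l[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2
```

As a quick check of the definitions, `huber(0.5)` falls in the quadratic branch and `huber(3.0)` in the linear branch, and `hsic` of any sample set paired with a constant sequence is zero up to rounding, since a constant carries no information about its partner.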
