GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting
Bo Chen, Shoukang Hu, Qi Chen, Chenpeng Du, Ran Yi, Yanmin Qian, Xie Chen

TL;DR
GSTalker is a novel 3D audio-driven talking face generation model that achieves fast training and real-time rendering by using Gaussian Splatting and deformation fields to synchronize facial movements with audio.
Contribution
The paper introduces GSTalker, which employs Gaussian Splatting and deformation fields for efficient, high-fidelity, audio-synchronized 3D talking face generation with significantly reduced training and rendering times.
Findings
Fast training within 40 minutes
Real-time rendering at 125 FPS
High-fidelity, audio-synchronized face generation
Abstract
We present GSTalker, a 3D audio-driven talking face generation model with Gaussian Splatting for both fast training (40 minutes) and real-time rendering (125 FPS) with a 3-5 minute video for training material, in comparison with previous 2D and 3D NeRF-based modeling frameworks which require hours of training and seconds of rendering per frame. Specifically, GSTalker learns an audio-driven Gaussian deformation field to translate and transform 3D Gaussians to synchronize with audio information, in which a multi-resolution hashing grid-based tri-plane and a temporal smooth module are incorporated to learn accurate deformation for fine-grained facial details. In addition, a pose-conditioned deformation field is designed to model the stabilized torso. To enable efficient optimization of the condition Gaussian deformation field, we initialize 3D Gaussians by learning a coarse static…
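The audio-driven deformation field described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single dense tri-plane in place of the multi-resolution hashing grid, a two-layer MLP with hypothetical weights `w1` and `w2`, and a deformation limited to position offsets (the paper also transforms the Gaussians, which is omitted here). The names `triplane_features` and `deform_gaussians` are invented for illustration.

```python
import numpy as np

def triplane_features(xyz, planes, res):
    """Nearest-cell lookup on the XY, XZ and YZ planes, concatenated per point.

    xyz:    (N, 3) Gaussian centers, assumed normalized to [0, 1)
    planes: (3, res, res, C) dense feature planes (stand-in for a hash grid)
    """
    idx = np.clip((xyz * res).astype(int), 0, res - 1)
    f_xy = planes[0, idx[:, 0], idx[:, 1]]
    f_xz = planes[1, idx[:, 0], idx[:, 2]]
    f_yz = planes[2, idx[:, 1], idx[:, 2]]
    return np.concatenate([f_xy, f_xz, f_yz], axis=1)  # (N, 3C)

def deform_gaussians(xyz, audio_feat, planes, w1, w2, res):
    """Predict a per-Gaussian position offset conditioned on one audio feature.

    audio_feat: (A,) feature vector for the current frame (hypothetical encoder output)
    w1: (3C + A, H) and w2: (H, 3) are the weights of a small ReLU MLP
    """
    feats = triplane_features(xyz, planes, res)
    cond = np.broadcast_to(audio_feat, (xyz.shape[0], audio_feat.shape[0]))
    hidden = np.maximum(np.concatenate([feats, cond], axis=1) @ w1, 0.0)
    return xyz + hidden @ w2  # deformed centers, shape (N, 3)

# Usage with random placeholder data (no trained weights are implied)
rng = np.random.default_rng(0)
centers = rng.random((1000, 3))
planes = rng.standard_normal((3, 64, 64, 8))
audio = rng.standard_normal(32)
w1 = 0.01 * rng.standard_normal((3 * 8 + 32, 64))
w2 = 0.01 * rng.standard_normal((64, 3))
deformed = deform_gaussians(centers, audio, planes, w1, w2, res=64)
```

In this sketch the static Gaussian centers stay fixed and only the offset depends on the audio feature, so the same set of Gaussians can be re-posed for every frame.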
Taxonomy
Topics: Face recognition and analysis · Speech and Audio Processing · Video Surveillance and Tracking Methods
