KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution
Antoni Bigata, Rodrigo Mira, Stella Bounareli, Michał Stypułkowski, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

TL;DR
KeySync is a novel two-stage framework that enhances lip synchronization in videos by ensuring temporal consistency, reducing expression leakage, and effectively handling occlusions, thereby advancing the quality and robustness of talking head generation.
Contribution
KeySync introduces a new masking strategy and architectural improvements that address expression leakage and facial occlusions, achieving state-of-the-art lip synchronization results.
Findings
State-of-the-art lip reconstruction and cross-synchronization performance.
Significant reduction in expression leakage as measured by LipLeak.
Effective handling of facial occlusions through the proposed masking approach.
Abstract
Lip synchronization, known as the task of aligning lip movements in an existing video with new input audio, is typically framed as a simpler variant of audio-driven facial animation. However, as well as suffering from the usual issues in talking head generation (e.g., temporal consistency), lip synchronization presents significant new challenges such as expression leakage from the input video and facial occlusions, which can severely impact real-world applications like automated dubbing, but are often neglected in existing works. To address these shortcomings, we present KeySync, a two-stage framework that succeeds in solving the issue of temporal consistency, while also incorporating solutions for leakage and occlusions using a carefully designed masking strategy. We show that KeySync achieves state-of-the-art results in lip reconstruction and cross-synchronization, improving visual…
