OpFlowTalker: Realistic and Natural Talking Face Generation via Optical Flow Guidance
Shuheng Ge, Haoyu Xing, Li Zhang, Xiangqian Wu

TL;DR
OpFlowTalker uses optical flow guidance to generate realistic, temporally coherent talking face videos with better lip-readability and semantic consistency, addressing the weak frame-to-frame smoothness and visual quality of earlier methods.
Contribution
This paper introduces OpFlowTalker, a novel method using optical flow prediction from audio to enhance temporal coherence and semantic accuracy in talking face video synthesis.
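To make the idea of optical flow guidance concrete, here is a minimal sketch of how a dense flow field can carry one frame into the next. It is not the authors' implementation: the names `warp_backward` and `temporal_error` are hypothetical, the flow is assumed to hold integer per-pixel displacements defined on the target grid, and in OpFlowTalker the flow would be predicted from audio rather than supplied by hand.

```python
from typing import List, Tuple

Frame = List[List[float]]            # grayscale image, indexed frame[y][x]
Flow = List[List[Tuple[int, int]]]   # assumed: integer (dx, dy) per target pixel


def warp_backward(prev: Frame, flow: Flow) -> Frame:
    """Estimate the next frame by sampling prev at (x - dx, y - dy)."""
    h, w = len(prev), len(prev[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Clamp source coordinates so samples stay inside the image.
            sx = min(max(x - dx, 0), w - 1)
            sy = min(max(y - dy, 0), h - 1)
            out[y][x] = prev[sy][sx]
    return out


def temporal_error(pred: Frame, target: Frame) -> float:
    """Mean absolute difference between two frames of equal size."""
    h, w = len(pred), len(pred[0])
    total = sum(abs(pred[y][x] - target[y][x]) for y in range(h) for x in range(w))
    return total / (h * w)


# Toy example: a single bright pixel moves one step to the right.
prev_frame = [[0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]]
next_frame = [[0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]]
flow_right = [[(1, 0)] * 3 for _ in range(3)]

warped = warp_backward(prev_frame, flow_right)
print(temporal_error(warped, next_frame))      # error after flow-guided warping
print(temporal_error(prev_frame, next_frame))  # error if motion is ignored
```

In this toy case the warped frame matches the true next frame exactly, while simply reusing the previous frame does not. A flow-guided generator can exploit the same signal to keep consecutive frames consistent.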
Findings
Improved visual smoothness and lip-readability in synthesized videos.
Enhanced temporal coherence through sequence fusion and optical flow guidance.
Validated effectiveness with extensive empirical experiments.
Abstract
Creating realistic, natural, and lip-readable talking face videos remains a formidable challenge. Previous research primarily concentrated on generating and aligning single-frame images while overlooking the smoothness of frame-to-frame transitions and temporal dependencies. This often compromised visual quality and effects in practical settings, particularly when handling complex facial data and audio content, which frequently led to semantically incongruent visual illusions. Specifically, synthesized videos commonly featured disorganized lip movements, making them difficult to understand and recognize. To overcome these limitations, this paper introduces the application of optical flow to guide facial image generation, enhancing inter-frame continuity and semantic consistency. We propose "OpFlowTalker", a novel approach that utilizes predicted optical flow changes from audio inputs…
Taxonomy
Topics: Advanced Vision and Imaging · Video Surveillance and Tracking Methods · Tactile and Sensory Interactions
