ISExplore: Informative Segment Selection for Efficient Personalized 3D Talking Face Generation
Rui-Qing Sun, Ang Li, Zhijing Wu, Tian Lan, Qianyu Lu, Xingshan Yao, Chen Xu, Xian-Ling Mao

TL;DR
This paper shows that a carefully selected reference segment of only a few seconds can yield personalized 3D talking face models comparable to those trained on full-length videos, significantly reducing data processing time.
Contribution
The authors introduce ISExplore, a segment selection method that identifies the most informative reference segment, improving efficiency without sacrificing quality in 3D talking face generation.
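As a rough illustration of what selecting a single reference segment can look like, the sketch below picks the fixed-length contiguous window with the highest total score from a list of per-frame scores. It is not the authors' algorithm: the summary does not say how ISExplore measures informativeness, so `frame_scores` and `select_best_segment` are hypothetical names, and the scores are assumed to come from some external measure.

```python
from typing import Sequence, Tuple


def select_best_segment(frame_scores: Sequence[float], window: int) -> Tuple[int, int]:
    """Return (start, end) frame indices, end exclusive, of the contiguous
    window of `window` frames with the highest total score.

    `frame_scores` is a hypothetical per-frame informativeness score; how
    ISExplore actually scores segments is not stated in this summary.
    """
    n = len(frame_scores)
    if window <= 0 or window > n:
        raise ValueError("window must be between 1 and len(frame_scores)")

    # Score of the first window, then slide one frame at a time.
    current = sum(frame_scores[:window])
    best_sum, best_start = current, 0
    for start in range(1, n - window + 1):
        # Add the frame entering the window, drop the frame leaving it.
        current += frame_scores[start + window - 1] - frame_scores[start - 1]
        if current > best_sum:  # strict comparison keeps the earliest window on ties
            best_sum, best_start = current, start
    return best_start, best_start + window


# Toy usage with made-up integer scores and a 3-frame window.
print(select_best_segment([1, 2, 9, 8, 7, 1], window=3))  # (2, 5)
```

With a real video, `window` would be the segment length in seconds multiplied by the frame rate, and only the frames in the returned range would be passed on to preprocessing and training.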
Findings
ISExplore reduces data processing and training time by over 5x.
Selected short segments achieve comparable quality to full videos.
The method makes personalized 3D talking face models more practical to deploy.
Abstract
Talking Face Generation (TFG) methods based on Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have recently achieved impressive progress in personalized talking head synthesis. However, existing methods typically require several minutes of reference video for meticulous preprocessing and fitting, resulting in hours of preparation time and limiting their practical applicability. In this paper, we revisit a fundamental yet underexplored question: do high-quality personalized TFG models truly require minutes-long reference videos? Our exploratory study reveals that a carefully selected reference segment of only a few seconds can often achieve performance comparable to that of using the full reference video. This finding suggests that the informativeness of reference data is more critical than its duration. Motivated by this observation, we propose ISExplore (Informative…
