AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance
Zhao Wang, Hao Wen, Lingting Zhu, Chenming Shang, Yujiu Yang, Qi Dou

TL;DR
AnyCharV introduces a flexible, two-stage framework for controllable character video generation that effectively combines arbitrary source characters with target scenes using pose guidance, outperforming previous methods.
Contribution
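The paper presents a novel two-stage training framework for flexible character video synthesis with improved detail preservation and control. It addresses the limited flexibility of prior approaches, which make it difficult to synthesize a source character into a desired target scene.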
Findings
Outperforms previous state-of-the-art methods in generation quality.
Enables synthesis with arbitrary source characters and scenes.
Uses a self-boosting mechanism for better detail retention.
Abstract
Character video generation is a significant real-world application focused on producing high-quality videos featuring specific characters. Recent advancements have introduced various control signals to animate static characters, successfully enhancing control over the generation process. However, these methods often lack flexibility, limiting their applicability and making it challenging for users to synthesize a source character into a desired target scene. To address this issue, we propose a novel framework, AnyCharV, that flexibly generates character videos using arbitrary source characters and target scenes, guided by pose information. Our approach involves a two-stage training process. In the first stage, we develop a base model capable of integrating the source character with the target scene using pose guidance. The second stage further bootstraps controllable generation through…
Taxonomy
Topics: Video Analysis and Summarization · Human Motion and Animation · Handwritten Text Recognition Techniques
Methods: Balanced Selection
