WildActor: Unconstrained Identity-Preserving Video Generation
Qin Guo, Tianyu Yang, Xuanhua He, Fei Shen, Yong Zhang, Zhuoliang Kang, Xiaoming Wei, and Dan Xu

TL;DR
WildActor is a framework for generating human videos that keep full-body identity consistent across diverse viewpoints and motions, addressing the limitations of face-centric and pose-locked methods.
Contribution
The paper presents the Actor-18M dataset and the WildActor framework, which together enable unconstrained, identity-preserving human video generation through novel attention and sampling strategies.
Findings
WildActor outperforms existing methods in identity preservation across viewpoints.
The Actor-18M dataset comprises 1.6M videos with 18M corresponding human images for training and evaluation.
WildActor maintains consistent full-body identities in complex motion and viewpoint changes.
Abstract
Production-ready human video generation requires digital actors to maintain strictly consistent full-body identities across dynamic shots, viewpoints and motions, a setting that remains challenging for existing methods. Prior methods often suffer from face-centric behavior that neglects body-level consistency, or produce copy-paste artifacts where subjects appear rigid due to pose locking. We present Actor-18M, a large-scale human video dataset designed to capture identity consistency under unconstrained viewpoints and environments. Actor-18M comprises 1.6M videos with 18M corresponding human images, covering both arbitrary views and canonical three-view representations. Leveraging Actor-18M, we propose WildActor, a framework for any-view conditioned human video generation. We introduce an Asymmetric Identity-Preserving Attention mechanism coupled with a Viewpoint-Adaptive Monte Carlo…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Human Motion and Animation
