Follow-Your-MultiPose: Tuning-Free Multi-Character Text-to-Video Generation via Pose Guidance
Beiyuan Zhang, Yue Ma, Chunlei Fu, Xinyang Song, Zhenan Sun, Ziqiang Li

TL;DR
This paper introduces a tuning-free multi-character text-to-video generation framework that uses pose guidance and text prompts to produce controllable videos with multiple characters, outperforming previous methods.
Contribution
A novel multi-character video generation method that is tuning-free, utilizing pose and text guidance with a multi-branch control module for fine-grained control.
Findings
Achieves precise multi-character control in generated videos.
Demonstrates superior quantitative performance over previous methods.
Demonstrates generality across various personalized T2I models.
Abstract
Text-editable and pose-controllable character video generation is a challenging but popular topic with practical applications. However, existing approaches mainly focus on single-object video generation with pose guidance, ignoring the realistic situation in which multiple characters appear concurrently in a scene. To tackle this, we propose a novel tuning-free multi-character video generation framework based on separated text and pose guidance. Specifically, we first extract character masks from the pose sequence to identify the spatial position of each generated character, and then obtain an individual prompt for each character with LLMs for precise text guidance. Moreover, spatial-aligned cross attention and a multi-branch control module are proposed to generate fine-grained, controllable multi-character videos. The visualized results of generating video…
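The abstract's idea of pairing per-character masks with per-character prompts can be illustrated with a small NumPy sketch. This is not the authors' implementation: the helper names (`pose_to_mask`, `masked_cross_attention`), the bounding-box mask, the single shared background prompt, and the rule that a later character overwrites an earlier one where masks overlap are all assumptions made for illustration. It only shows the general pattern in which each spatial location attends to the prompt of the character whose mask covers it.

```python
import numpy as np

def pose_to_mask(keypoints, height, width, pad=0):
    """Hypothetical helper: bounding-box mask around one character's
    2D keypoints, given as (x, y) pixel coordinates."""
    xs, ys = keypoints[:, 0], keypoints[:, 1]
    x0 = max(int(xs.min()) - pad, 0)
    x1 = min(int(xs.max()) + pad, width - 1)
    y0 = max(int(ys.min()) - pad, 0)
    y1 = min(int(ys.max()) + pad, height - 1)
    mask = np.zeros((height, width), dtype=bool)
    mask[y0:y1 + 1, x0:x1 + 1] = True
    return mask

def attention(q, k, v):
    """Plain scaled dot-product attention with a numerically stable softmax."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def masked_cross_attention(q, global_kv, char_kvs, masks):
    """Assumed composition rule: every location first attends to a global
    (background) prompt; locations inside a character's mask are then
    replaced by attention over that character's own prompt tokens.
    q has shape (H*W, d); each mask has shape (H, W)."""
    out = attention(q, *global_kv)
    for (k, v), mask in zip(char_kvs, masks):
        idx = mask.reshape(-1)
        out[idx] = attention(q[idx], k, v)
    return out
```

With two characters, the output inside each mask depends only on that character's prompt tokens, while the uncovered locations fall back to the global prompt.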
Taxonomy
Topics: Human Motion and Animation · Video Analysis and Summarization · Handwritten Text Recognition Techniques
Methods: Softmax · Attention Is All You Need · Focus
