MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow
Zhe Li, Yisheng He, Lei Zhong, Weichao Shen, Qi Zuo, Lingteng Qiu, Zilong Dong, Laurence Tianruo Yang, Weihao Yuan

TL;DR
MulSMo introduces a bidirectional control flow for stylized motion generation, effectively integrating style and content across multiple modalities, and outperforms previous methods in quality and flexibility.
Contribution
This work proposes a novel bidirectional control flow mechanism for stylized motion generation, extending it to multiple modalities with contrastive learning for enhanced control.
Findings
Outperforms previous methods on various datasets
Enables multimodal style control in motion generation
Preserves style dynamics better in integrated motions
Abstract
Generating motion sequences that conform to a target style while adhering to the given content prompts requires accommodating both the content and the style. In existing methods, information usually flows only from style to content, which may cause conflicts between the two and harm their integration. In contrast, this work builds a bidirectional control flow between the style and the content, so that the style is also adjusted towards the content. This alleviates the style-content collision and better preserves the dynamics of the style in the integrated motion. Moreover, we extend stylized motion generation from one modality, i.e., the style motion, to multiple modalities including texts and images through contrastive learning, leading to flexible style control over the motion generation. Extensive experiments demonstrate that our method significantly outperforms…
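The abstract mentions contrastive learning as the means of bringing text and image style prompts into the same space as style motions, but gives no details of the loss. As a rough illustration only, the sketch below shows a generic symmetric InfoNCE loss over paired embeddings. It is not taken from the paper: the function names, the temperature value, and the assumption that row i of each batch forms a matching pair are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def symmetric_info_nce(motion_embs, other_embs, temperature=0.07):
    """Generic symmetric InfoNCE loss (an assumption, not the paper's loss).

    motion_embs[i] and other_embs[i] are treated as a matching pair, e.g. a
    style-motion embedding and the embedding of a text or image describing
    the same style. All other pairings in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in motion_embs]
    b = [l2_normalize(v) for v in other_embs]
    n = len(a)

    # Cosine-similarity logits, scaled by the temperature.
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_on_diagonal(rows):
        # Mean of -log softmax(row)[i], computed with a stable log-sum-exp.
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)
            log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # Average the motion-to-other and other-to-motion directions.
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (
        cross_entropy_on_diagonal(logits) + cross_entropy_on_diagonal(transposed)
    )


if __name__ == "__main__":
    motions = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[0.9, 0.1], [0.1, 0.9]]
    swapped = [[0.1, 0.9], [0.9, 0.1]]
    print(symmetric_info_nce(motions, matched))
    print(symmetric_info_nce(motions, swapped))
```

With such a loss, correctly paired embeddings yield a lower value than mismatched ones, which is what would let an encoder for one modality stand in for another at inference time.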
Taxonomy
Topics: Human Motion and Animation · Video Analysis and Summarization · Speech and dialogue systems
