MulSMo: Multimodal Stylized Motion Generation by Bidirectional Control Flow

Zhe Li; Yisheng He; Lei Zhong; Weichao Shen; Qi Zuo; Lingteng Qiu; Zilong Dong; Laurence Tianruo Yang; Weihao Yuan

arXiv:2412.09901 · cs.CV · March 19, 2025

Open Access

TL;DR

MulSMo introduces a bidirectional control flow for stylized motion generation, effectively integrating style and content across multiple modalities, and outperforms previous methods in quality and flexibility.

Contribution

This work proposes a novel bidirectional control flow mechanism for stylized motion generation, extending it to multiple modalities with contrastive learning for enhanced control.

Findings

01. Outperforms previous methods on various datasets

02. Enables multimodal style control in motion generation

03. Preserves style dynamics better in integrated motions

Abstract

Generating motion sequences that conform to a target style while adhering to given content prompts requires accommodating both content and style. In existing methods, information usually flows only from style to content, which can cause conflicts between the two and harm their integration. In this work, we instead build a bidirectional control flow between style and content that also adjusts the style towards the content, which alleviates the style-content collision and better preserves the dynamics of the style in the integration. Moreover, we extend stylized motion generation from one modality, i.e., the style motion, to multiple modalities, including text and images, through contrastive learning, enabling flexible style control over motion generation. Extensive experiments demonstrate that our method significantly outperforms…
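The abstract states that style control is extended to texts and images through contrastive learning but does not give the objective. As a rough illustration only, the sketch below assumes a symmetric InfoNCE-style loss that pulls paired style-motion and text (or image) embeddings together and pushes mismatched pairs apart. The function names, the temperature of 0.07, and the random toy embeddings are all hypothetical and are not taken from the paper.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_info_nce(motion_emb, other_emb, temperature=0.07):
    """Assumed objective: InfoNCE averaged over both matching directions.

    motion_emb[i] and other_emb[i] are treated as a positive pair; every
    other combination in the batch acts as a negative.
    """
    m = [normalize(v) for v in motion_emb]
    o = [normalize(v) for v in other_emb]
    # Cosine similarity between every motion and every text/image embedding.
    logits = [
        [sum(a * b for a, b in zip(mi, oj)) / temperature for oj in o]
        for mi in m
    ]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))


# Toy data: 4 style motions with 16-dimensional embeddings (hypothetical sizes).
random.seed(0)
style_motion = [[random.gauss(0, 1) for _ in range(16)] for _ in range(4)]
# "Paired" text embeddings sit close to their style motion.
paired_text = [[x + 0.01 * random.gauss(0, 1) for x in v] for v in style_motion]
# Reversing the batch order mismatches every pair.
mismatched_text = paired_text[::-1]

loss_paired = symmetric_info_nce(style_motion, paired_text)
loss_mismatched = symmetric_info_nce(style_motion, mismatched_text)
print(loss_paired, loss_mismatched)
```

With an objective of this kind, correctly paired embeddings yield a lower loss than mismatched ones, which is what would let a text or image embedding stand in for a style-motion embedding at generation time. Whether MulSMo uses this exact formulation is not stated in the abstract.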

Taxonomy

Topics: Human Motion and Animation · Video Analysis and Summarization · Speech and dialogue systems