CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation
Yukang Cao, Xinying Guo, Mingyuan Zhang, Haozhe Xie, Chenyang Gu, Ziwei Liu

TL;DR
CrowdMoGen is a novel zero-shot framework that uses large language models and SMPL priors to generate realistic, event-aligned collective crowd motions from text prompts, addressing scalability and controllability challenges.
Contribution
The paper introduces the first zero-shot collective motion generation framework, combining LLMs for scene planning with a transformer-based generator for realistic crowd motions.
Findings
Outperforms previous methods in realism and coherence
Effectively organizes individuals into groups using LLMs
Generates contextually appropriate, event-driven crowd motions
Abstract
While recent advances in text-to-motion generation have shown promising results, they typically assume all individuals are grouped as a single unit. Scaling these methods to handle larger crowds and ensuring that individuals respond appropriately to specific events remain significant challenges. This is primarily due to the complexities of two tasks: scene planning, which involves organizing groups, planning their activities, and coordinating interactions; and controllable motion generation. In this paper, we present CrowdMoGen, the first zero-shot framework for collective motion generation, which effectively groups individuals and generates event-aligned motion sequences from text prompts. 1) Limited by the datasets available for training an effective scene planning module in a supervised manner, we instead propose a crowd scene planner that leverages pre-trained large language models…
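The scene-planning step described in the abstract (organizing individuals into groups and planning each group's activity) can be pictured with a small sketch. The Python below is illustrative only: the JSON schema, the `GroupPlan` class, and the `parse_scene_plan` function are hypothetical names chosen for this example, not part of CrowdMoGen, and the LLM reply is a hard-coded string rather than a real model call.

```python
import json
from dataclasses import dataclass


@dataclass
class GroupPlan:
    """Hypothetical container for one group in a planned crowd scene."""
    name: str
    activity: str
    member_ids: list


def parse_scene_plan(llm_response: str, num_people: int) -> list:
    """Turn a JSON scene plan (assumed format) into GroupPlan objects.

    Individuals are assigned to groups in order, according to each
    group's requested size; the sizes must sum to num_people.
    """
    raw_groups = json.loads(llm_response)["groups"]
    total = sum(g["size"] for g in raw_groups)
    if total != num_people:
        raise ValueError(f"plan covers {total} people, expected {num_people}")

    plans, next_id = [], 0
    for g in raw_groups:
        ids = list(range(next_id, next_id + g["size"]))
        plans.append(GroupPlan(g["name"], g["activity"], ids))
        next_id += g["size"]
    return plans


# Stand-in for an LLM reply to a prompt such as "a fire breaks out in a plaza".
example_response = json.dumps({
    "groups": [
        {"name": "bystanders", "activity": "run away from the fire", "size": 3},
        {"name": "responders", "activity": "walk toward the fire", "size": 2},
    ]
})

plans = parse_scene_plan(example_response, num_people=5)
for plan in plans:
    print(plan.name, plan.member_ids, plan.activity)
```

In a pipeline of this shape, each group's activity text and member list would presumably be handed to a motion generator; that stage is omitted here because the truncated abstract does not describe its interface.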
Taxonomy
Topics: Human Motion and Animation · Interactive and Immersive Displays · Social Robot Interaction and HRI
