CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

Yiming Huang; Weilin Wan; Yue Yang; Chris Callison-Burch; Mark Yatskar; Lingjie Liu

arXiv:2403.13900·cs.CV·September 20, 2024·1 cites

Open Access

TL;DR

CoMo is a controllable text-to-motion model that represents human motion as interpretable pose codes, so motions can be generated from text and then edited through fine-grained, language-guided changes to those codes.

Contribution

The paper introduces pose codes as an interpretable motion representation and uses large language models to edit them directly, improving controllability in text-to-motion models.

Findings

01 Achieves competitive motion generation performance.

02 Substantially outperforms previous models in motion editing.

03 Demonstrates effective language-guided motion editing.

Abstract

Text-to-motion models excel at efficient human motion generation, but existing approaches lack fine-grained controllability over the generation process. Consequently, modifying subtle postures within a motion or inserting new actions at specific moments remains a challenge, limiting the applicability of these methods in diverse scenarios. In light of these challenges, we introduce CoMo, a Controllable Motion generation model, adept at accurately generating and editing motions by leveraging the knowledge priors of large language models (LLMs). Specifically, CoMo decomposes motions into discrete and semantically meaningful pose codes, with each code encapsulating the semantics of a body part, representing elementary information such as "left knee slightly bent". Given textual inputs, CoMo autoregressively generates sequences of pose codes, which are then decoded into 3D motions.…
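To make the pose-code idea concrete, here is a minimal sketch of how a motion might be held as per-frame discrete codes and edited by changing one body part over a span of frames. It is an illustration only, not the paper's implementation: the codebook contents, the `describe` and `edit_codes` helpers, and the frame layout are all hypothetical, and the LLM that would choose the edit and the decoder that would turn codes into 3D motion are omitted.

```python
from typing import Dict, List

# Hypothetical codebook: body-part category -> ordered discrete states.
# The real model's categories and states are not given in the abstract.
CODEBOOK: Dict[str, List[str]] = {
    "left knee": ["straight", "slightly bent", "fully bent"],
    "right knee": ["straight", "slightly bent", "fully bent"],
}

# One frame maps each body part to the index of its current state.
Frame = Dict[str, int]


def describe(frame: Frame) -> List[str]:
    """Render a frame's codes as readable phrases."""
    return [f"{part} {CODEBOOK[part][idx]}" for part, idx in frame.items()]


def edit_codes(
    motion: List[Frame], part: str, state: str, start: int, end: int
) -> List[Frame]:
    """Return a copy of `motion` with `part` set to `state` on frames [start, end)."""
    idx = CODEBOOK[part].index(state)
    return [
        {**frame, part: idx} if start <= t < end else dict(frame)
        for t, frame in enumerate(motion)
    ]


# A four-frame motion with both knees straight.
motion = [{"left knee": 0, "right knee": 0} for _ in range(4)]

# An edit such as "bend the left knee slightly in the middle of the motion",
# assumed here to have already been resolved to a part, a state, and a frame range.
edited = edit_codes(motion, "left knee", "slightly bent", 1, 3)
print(describe(edited[1]))  # ['left knee slightly bent', 'right knee straight']
```

Because each code names one body part's state, an edit touches only the targeted part and frames; everything else in the sequence is left as it was.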

Taxonomy

Topics: Human Motion and Animation · Hand Gesture Recognition Systems · Human Pose and Action Recognition