Re$^2$MoGen: Open-Vocabulary Motion Generation via LLM Reasoning and Physics-Aware Refinement
Jiakun Zheng, Ting Xiao, Shiqin Cao, Xinran Li, Zhe Wang, Chenjia Bai

TL;DR
Re$^2$MoGen is a framework that combines LLM reasoning, pose optimization, and physics-aware refinement to generate semantically meaningful and physically plausible motions from text descriptions, including descriptions that fall outside the training data distribution.
Contribution
The paper introduces a three-stage process that integrates LLM reasoning, pose optimization, and physics-based refinement for open-vocabulary text-to-motion generation, and reports that it outperforms existing methods.
Findings
Achieves state-of-the-art performance in open-vocabulary motion generation.
Generates motions that are both semantically consistent and physically plausible.
Effectively handles descriptions significantly different from training data.
Abstract
Text-to-motion (T2M) generation aims to control the behavior of a target character via textual descriptions. Leveraging text-motion paired datasets, existing T2M models have achieved impressive performance in generating high-quality motions within the distribution of their training data. However, their performance deteriorates notably when the motion descriptions differ significantly from the training texts. To address this issue, we propose Re$^2$MoGen, a Reasoning and Refinement open-vocabulary Motion Generation framework that leverages enhanced Large Language Model (LLM) reasoning to generate an initial motion plan and then refines its physical plausibility via reinforcement learning (RL) post-training. Specifically, Re$^2$MoGen consists of three stages: we first employ Monte Carlo tree search to enhance the LLM's reasoning ability in generating reasonable keyframes of the motion…
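To make the first stage more concrete, here is a minimal sketch of Monte Carlo tree search (UCT variant) applied to choosing a sequence of keyframes. It is not the authors' implementation. Everything named here is a hypothetical stand-in: `CANDIDATE_POSES` replaces the LLM's keyframe proposals, `TARGET` and `score` replace whatever evaluator the paper uses to judge a plan, and `HORIZON` is an assumed fixed plan length.

```python
import math
import random

# Hypothetical stand-ins (assumptions, not from the paper):
# a fixed pose vocabulary replaces LLM-proposed keyframes,
# and a toy target sequence replaces a learned or LLM-based evaluator.
CANDIDATE_POSES = ["crouch", "jump", "spin", "land"]
TARGET = ("crouch", "jump", "spin", "land")
HORIZON = len(TARGET)


def propose(state):
    """Toy proposal step: every pose is a candidate until the plan is full."""
    return [] if len(state) >= HORIZON else CANDIDATE_POSES


def score(state):
    """Toy reward in [0, 1]: fraction of keyframes that match TARGET."""
    return sum(a == b for a, b in zip(state, TARGET)) / HORIZON


class Node:
    def __init__(self, state, parent=None):
        self.state = state      # tuple of keyframes chosen so far
        self.parent = parent
        self.children = {}      # action -> Node
        self.visits = 0
        self.value = 0.0        # sum of rewards backed up through this node


def uct(child, parent_visits, c=1.4):
    """Upper-confidence bound: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(state, rng):
    """Complete the plan with random keyframes and score the result."""
    while len(state) < HORIZON:
        state = state + (rng.choice(CANDIDATE_POSES),)
    return score(state)


def mcts(iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is non-terminal and fully expanded.
        while propose(node.state) and len(node.children) == len(propose(node.state)):
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: uct(ch, parent.visits))
        # Expansion: add one untried keyframe, if any remain.
        untried = [a for a in propose(node.state) if a not in node.children]
        if untried:
            action = rng.choice(untried)
            child = Node(node.state + (action,), parent=node)
            node.children[action] = child
            node = child
        # Simulation, then backpropagation up to the root.
        reward = rollout(node.state, rng)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Read out the plan by following the most-visited child at each depth.
    plan, node = [], root
    while node.children:
        action, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        plan.append(action)
    return plan
```

Calling `mcts()` returns a list of `HORIZON` keyframe labels drawn from `CANDIDATE_POSES`. In a real system the proposal and scoring functions would be far more expensive, so the iteration budget and the exploration constant `c` would need tuning; the values above are arbitrary.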
