MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo, Zeyu Hu, De Wen Soh, Na Zhao

TL;DR
MotionLab introduces a unified framework for human motion generation and editing, leveraging a novel Motion-Condition-Motion paradigm to enable versatile, efficient, and controllable motion tasks with improved generalization.
Contribution
The paper proposes the Motion-Condition-Motion paradigm and MotionLab framework, integrating rectified flows, a transformer, and curriculum learning for multi-task human motion generation and editing.
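As a rough illustration of the rectified-flow component mentioned above, the sketch below shows the generic rectified-flow recipe: interpolate linearly between a noise sample and a data sample, regress a velocity field onto their difference, and sample by Euler integration. This is the textbook formulation, not MotionLab's actual implementation. The function names, the array shapes, and the `predict_velocity` callable (a stand-in for the transformer) are all assumptions made for illustration.

```python
import numpy as np

# Generic rectified-flow sketch. All names here are hypothetical and are
# not taken from the MotionLab code base.

def interpolate(x0, x1, t):
    """Point on the straight path from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def velocity_target(x0, x1):
    """Regression target: the constant velocity along the straight path."""
    return x1 - x0

def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(x0, x1, t)
    residual = predict_velocity(x_t, t) - velocity_target(x0, x1)
    return float(np.mean(residual ** 2))

def euler_sample(predict_velocity, x0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * predict_velocity(x, t)
    return x

# Toy check with an oracle velocity for a single (noise, motion) pair.
# The "motion" is just a random (frames, features) array.
rng = np.random.default_rng(0)
noise = rng.standard_normal((4, 3))
motion = rng.standard_normal((4, 3))
oracle = lambda x, t: motion - noise
recovered = euler_sample(oracle, noise, num_steps=10)
```

With the oracle velocity, the Euler steps land back on `motion`, because the path is a straight line. In a trained model, how straight the learned paths are determines how few steps sampling needs.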
Findings
Effective multi-task learning and knowledge sharing across motion tasks
Strong generalization capabilities on multiple benchmarks
High inference efficiency and versatility in motion generation and editing
Abstract
Human motion generation and editing are key components of computer vision. However, current approaches in this field tend to offer isolated solutions tailored to specific tasks, which can be inefficient and impractical for real-world applications. While some efforts have aimed to unify motion-related tasks, these methods simply use different modalities as conditions to guide motion generation. Consequently, they lack editing capabilities and fine-grained control, and they fail to facilitate knowledge sharing across tasks. To address these limitations and provide a versatile, unified framework capable of handling both human motion generation and editing, we introduce a novel paradigm, Motion-Condition-Motion, which enables the unified formulation of diverse tasks with three concepts: source motion, condition, and target motion. Based on this paradigm, we propose a unified framework,…
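The three concepts named in the abstract (source motion, condition, target motion) can be pictured as one record type shared by every task. The sketch below is only a guess at how that could look. The `MotionTask` class, its field names, the rule that a missing source motion means generation, and the two example tasks are assumptions for illustration and are not taken from the paper's code.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class MotionTask:
    """Hypothetical container for one Motion-Condition-Motion example."""
    name: str
    source_motion: Optional[Any]  # assumed to be None when generating from scratch
    condition: Any                # e.g. a text prompt or another control signal
    target_motion: Any            # the motion the model should produce

    @property
    def kind(self) -> str:
        # Assumption: a task without a source motion is treated as generation,
        # and a task with one is treated as editing.
        return "generation" if self.source_motion is None else "editing"

# Placeholder strings stand in for real motion arrays.
text_to_motion = MotionTask(
    name="text-based generation",
    source_motion=None,
    condition="a person walks forward",
    target_motion="<generated motion>",
)
text_edit = MotionTask(
    name="text-based editing",
    source_motion="<original motion>",
    condition="raise the left arm higher",
    target_motion="<edited motion>",
)
```

Under this reading, both kinds of task have the same shape and differ only in whether a source motion is present, which is one way a single model could serve generation and editing alike.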
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition · Advanced Vision and Imaging
Methods: Attention Is All You Need · Label Smoothing · Layer Normalization · Linear Layer · Byte Pair Encoding · Dense Connections · Residual Connection · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam
