EnergyMoGen: Compositional Human Motion Generation with Energy-Based Diffusion Model in Latent Space
Jianrong Zhang, Hehe Fan, Yi Yang

TL;DR
EnergyMoGen is an energy-based latent diffusion framework for compositional human motion generation. It combines multiple semantic concepts into a single coherent motion sequence and is reported to outperform existing text-to-motion models.
Contribution
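The paper proposes an energy-based diffusion approach that pairs a latent-aware energy model with a semantic-aware, cross-attention-based energy model, and combines the two through Synergistic Energy Fusion to improve semantic composition in human motion generation.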
Findings
Outperforms state-of-the-art models in text-to-motion tasks
Enables complex, multi-concept motion synthesis
Enhances dataset extension and motion quality
Abstract
Diffusion models, particularly latent diffusion models, have demonstrated remarkable success in text-driven human motion generation. However, it remains challenging for latent diffusion models to effectively compose multiple semantic concepts into a single, coherent motion sequence. To address this issue, we propose EnergyMoGen, which includes two spectrums of Energy-Based Models: (1) We interpret the diffusion model as a latent-aware energy-based model that generates motions by composing a set of diffusion models in latent space; (2) We introduce a semantic-aware energy model based on cross-attention, which enables semantic composition and adaptive gradient descent for text embeddings. To overcome the challenges of semantic inconsistency and motion distortion across these two spectrums, we introduce Synergistic Energy Fusion. This design allows the motion latent diffusion model to…
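Item (1) of the abstract, composing a set of diffusion models in latent space, can be illustrated with the generic conjunction rule from the compositional diffusion literature: the combined noise estimate is the unconditional estimate plus a weighted sum of per-concept offsets. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The names `compose_conjunction` and `toy_denoiser`, the denoiser signature, and the weights are all hypothetical, and the toy denoiser stands in for a trained latent motion denoiser.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]
# Hypothetical denoiser signature: (latent, timestep, condition or None) -> predicted noise.
Denoiser = Callable[[Vector, int, Optional[str]], Vector]


def compose_conjunction(
    denoiser: Denoiser,
    z_t: Vector,
    t: int,
    concepts: Sequence[str],
    weights: Sequence[float],
) -> Vector:
    """Combine per-concept noise predictions into one estimate.

    eps = eps(z_t, t) + sum_i w_i * (eps(z_t, t | c_i) - eps(z_t, t))
    """
    if len(concepts) != len(weights):
        raise ValueError("concepts and weights must have the same length")

    eps_uncond = denoiser(z_t, t, None)
    eps = list(eps_uncond)
    for concept, weight in zip(concepts, weights):
        eps_cond = denoiser(z_t, t, concept)
        for i in range(len(eps)):
            # Add the weighted offset of this concept from the unconditional estimate.
            eps[i] += weight * (eps_cond[i] - eps_uncond[i])
    return eps


def toy_denoiser(z_t: Vector, t: int, cond: Optional[str]) -> Vector:
    """Stand-in for a trained model (assumption): shifts the latent by a
    condition-dependent offset so the composition arithmetic is easy to follow."""
    offset = 0.0 if cond is None else float(len(cond))
    return [z + offset for z in z_t]


composed = compose_conjunction(
    toy_denoiser, [1.0, 2.0], t=0, concepts=["walk", "wave"], weights=[0.5, 0.5]
)
```

With a single concept and weight 1.0 the rule reduces to the ordinary conditional prediction, and with no concepts it returns the unconditional prediction. The semantic-aware energy model and Synergistic Energy Fusion described in the abstract are not covered by this sketch.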
Taxonomy
Topics: Human Motion and Animation · Human Pose and Action Recognition
Methods: Sparse Evolutionary Training · Diffusion · Latent Diffusion Model
