LEAD: Latent Realignment for Human Motion Diffusion

Nefeli Andreou; Xi Wang; Victoria Fernández Abrevaya; Marie-Paule Cani; Yiorgos Chrysanthou; Vicky Kalogeiton

arXiv:2410.14508 · cs.CV · October 21, 2024

Open Access

TL;DR

LEAD introduces a novel latent realignment technique for human motion generation from text, enhancing semantic understanding, realism, and diversity in synthesized motions, and enabling effective motion inversion from few examples.

Contribution

LEAD combines latent diffusion with a realignment mechanism to create a semantically structured latent space, improving text-motion alignment and enabling textual motion inversion.

Findings

01. Comparable realism and diversity to state-of-the-art methods
02. Qualitative analysis shows sharper, more human-like motions
03. Improved motion inversion capturing out-of-distribution concepts

Abstract

Our goal is to generate realistic human motion from natural language. Modern methods often face a trade-off between model expressiveness and text-to-motion alignment. Some align text and motion latent spaces but sacrifice expressiveness; others rely on diffusion models producing impressive motions, but lacking semantic meaning in their latent space. This may compromise realism, diversity, and applicability. Here, we address this by combining latent diffusion with a realignment mechanism, producing a novel, semantically structured space that encodes the semantics of language. Leveraging this capability, we introduce the task of textual motion inversion to capture novel motion concepts from a few examples. For motion synthesis, we evaluate LEAD on HumanML3D and KIT-ML and show comparable performance to the state-of-the-art in terms of realism, diversity, and text-motion consistency. Our…

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Gait Recognition and Analysis

Methods: ALIGN · Diffusion