TextIM: Part-aware Interactive Motion Synthesis from Text

Siyuan Fan; Bo Du; Xiantao Cai; Bo Peng; Longling Sun

arXiv:2408.03302 · cs.CV · August 7, 2024

Open Access

TL;DR

TextIM is a new framework that synthesizes human interactive motions from text with precise part-level semantic alignment, improving realism and accuracy in complex interaction scenarios.

Contribution

TextIM introduces a decoupled conditional diffusion model and a part-aware spatial coherence module for detailed, semantically accurate motion synthesis from textual descriptions.

Findings

01 Significantly improves motion realism and semantic accuracy.

02 Effectively models complex interactions with deformable objects.

03 Outperforms existing methods in diverse interaction scenarios.

Abstract

In this work, we propose TextIM, a novel framework for synthesizing TEXT-driven human Interactive Motions, with a focus on the precise alignment of part-level semantics. Existing methods often overlook the critical roles of interactive body parts and fail to adequately capture and align part-level semantics, resulting in inaccuracies and even erroneous movement outcomes. To address these issues, TextIM utilizes a decoupled conditional diffusion framework to enhance the detailed alignment between interactive movements and corresponding semantic intents from textual descriptions. Our approach leverages large language models, functioning as a human brain, to identify interacting human body parts and to comprehend interaction semantics to generate complicated and subtle interactive motion. Guided by the refined movements of the interacting parts, TextIM further extends these movements into…
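The part-level decoupling the abstract describes can be illustrated with a small sketch. This is not the authors' implementation: the joint grouping `PART_TO_JOINTS`, the helper names `joint_mask` and `split_motion`, and the 8-joint skeleton are all hypothetical, and the interacting parts are hard-coded where the paper would obtain them from a large language model. It only shows how a pose sequence might be split into interacting joints and remaining joints so that each group could be handled separately.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical joint grouping for illustration only; not taken from the paper.
PART_TO_JOINTS: Dict[str, List[int]] = {
    "left_hand": [0, 1],
    "right_hand": [2, 3],
    "torso": [4, 5],
    "legs": [6, 7],
}

Frame = List[Tuple[float, float, float]]


def joint_mask(parts: Sequence[str], num_joints: int) -> List[bool]:
    """Return True for every joint that belongs to a named interacting part."""
    mask = [False] * num_joints
    for part in parts:
        for j in PART_TO_JOINTS[part]:
            mask[j] = True
    return mask


def split_motion(motion: List[Frame], mask: List[bool]):
    """Split each frame into (interacting joints, remaining joints)."""
    interacting = [[p for p, m in zip(frame, mask) if m] for frame in motion]
    remaining = [[p for p, m in zip(frame, mask) if not m] for frame in motion]
    return interacting, remaining


# Two frames of an 8-joint skeleton, all joints at the origin.
motion = [[(0.0, 0.0, 0.0)] * 8 for _ in range(2)]
# Assumption: the interacting part is given directly instead of by a language model.
mask = joint_mask(["right_hand"], num_joints=8)
interacting, remaining = split_motion(motion, mask)
```

In this sketch the interacting group would be the part refined under the text condition first, with the remaining joints generated afterwards under its guidance; how TextIM actually performs either step is not specified in the text above.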

Taxonomy

Topics: Human Motion and Animation · Human Pose and Action Recognition · 3D Shape Modeling and Analysis

Methods: Diffusion · Focus · ALIGN