KETA: Kinematic-Phrases-Enhanced Text-to-Motion Generation via Fine-grained Alignment
Yu Jiang, Yixing Chen, Xingyang Li

TL;DR
KETA introduces a novel text-to-motion generation method that uses kinematic phrases as an intermediate representation, improving alignment and motion accuracy through fine-grained supervision and iterative refinement.
Contribution
This work presents KETA, a new approach that decomposes text into kinematic phrases and aligns them with motion segments, enhancing the quality and consistency of generated motions.
Findings
KETA achieves up to 1.19x higher R-precision.
KETA lowers FID by up to 2.34x.
It outperforms existing T2M models in accuracy and quality.
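For readers unfamiliar with the first metric, the sketch below shows one common way R-precision is computed in text-to-motion evaluation: each text is compared against a pool of motions in a shared embedding space, and it counts as a hit if its paired motion ranks among the top k nearest. This is an illustrative sketch, not KETA's evaluation code. The function name `r_precision`, the use of Euclidean distance, and the assumption that row i of each array forms a matched pair are choices made here for illustration; the embeddings are assumed to come from pretrained text and motion encoders that are not shown.

```python
import numpy as np

def r_precision(text_emb, motion_emb, top_k=3):
    """Fraction of texts whose paired motion is among the top_k nearest motions.

    Assumption (illustrative): row i of text_emb is paired with row i of
    motion_emb, and both live in the same embedding space.
    """
    text_emb = np.asarray(text_emb, dtype=float)
    motion_emb = np.asarray(motion_emb, dtype=float)

    # Pairwise Euclidean distances, shape (n_texts, n_motions).
    diff = text_emb[:, None, :] - motion_emb[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # For each text, motion indices sorted from nearest to farthest.
    ranked = np.argsort(dist, axis=1)

    # A text is a hit if its own index appears in the first top_k columns.
    truth = np.arange(text_emb.shape[0])[:, None]
    hits = (ranked[:, :top_k] == truth).any(axis=1)
    return float(hits.mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    texts = rng.normal(size=(32, 16))
    # Toy "generated motions": the paired text embedding plus small noise.
    motions = texts + 0.1 * rng.normal(size=(32, 16))
    print(r_precision(texts, motions, top_k=1))
```

Under this reading, a figure such as "1.19x" is a ratio between two such scores, so it only makes sense when both models are scored with the same pool size and the same value of k.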
Abstract
Motion synthesis plays a vital role in various fields of artificial intelligence. Among the various conditions of motion generation, text can describe motion details elaborately and is easy to acquire, making text-to-motion (T2M) generation important. State-of-the-art T2M techniques mainly leverage diffusion models to generate motions with text prompts as guidance, tackling the many-to-many nature of T2M tasks. However, existing T2M approaches struggle with the gap between the natural language domain and the physical domain, which makes it difficult to generate motions fully consistent with the texts. To address this, we leverage kinematic phrases (KP), an intermediate representation that bridges these two modalities. Our proposed method, KETA, decomposes the given text into several decomposed texts via a language model. It trains an aligner to align decomposed texts with the KP…
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Handwritten Text Recognition Techniques
Methods: Diffusion · ALIGN · Balanced Selection · Kollen-Pollack Learning
