Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training

Junfan Lin; Jianlong Chang; Lingbo Liu; Guanbin Li; Liang Lin; Qi Tian; Chang Wen Chen

arXiv:2210.15929 · cs.CV · March 27, 2023

TL;DR

This paper introduces a zero-shot, offline open-vocabulary text-to-motion generation method that uses prompt learning, a novel text-pose alignment model, and a wordless training mechanism to synthesize motions from text without paired data.

Contribution

It proposes a new framework combining prompt learning, a text-pose alignment model, and wordless training for open-vocabulary motion synthesis without paired training data.

Findings

01. Significant improvement over baseline methods.
02. Effective zero-shot text-to-motion generation.
03. Novel text-pose alignment and wordless training mechanisms.

Abstract

Text-to-motion generation is an emerging and challenging problem, which aims to synthesize motion with the same semantics as the input text. However, due to the lack of diverse labeled training data, most approaches either limit to specific types of text annotations or require online optimizations to cater to the texts during inference at the cost of efficiency and stability. In this paper, we investigate offline open-vocabulary text-to-motion generation in a zero-shot learning manner that neither requires paired training data nor extra online optimization to adapt for unseen texts. Inspired by the prompt learning in NLP, we pretrain a motion generator that learns to reconstruct the full motion from the masked motion. During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to…
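The pretraining idea described in the abstract, reconstructing a full motion from a masked one, can be illustrated with a minimal sketch. This is not the authors' implementation: the pose representation (a list of floats per frame), the `keep_ratio` parameter, the `MASK` placeholder, and the choice to score only the hidden frames are all assumptions made for illustration. At inference, per the abstract, the masked motion would instead be derived from the input text and passed to the generator as a prompt; that step needs a learned text-pose model and is not shown.

```python
import random

MASK = None  # hypothetical placeholder for a frame the generator must fill in


def mask_motion(motion, keep_ratio=0.25, rng=None):
    """Return a copy of `motion` in which most frames are replaced by MASK.

    `motion` is a list of poses; each pose is a list of floats.
    `keep_ratio` (an assumed parameter) is the fraction of frames left visible.
    """
    rng = rng or random.Random(0)
    n_keep = max(1, round(len(motion) * keep_ratio))
    keep = set(rng.sample(range(len(motion)), n_keep))
    return [pose if i in keep else MASK for i, pose in enumerate(motion)]


def reconstruction_loss(predicted, target, masked):
    """Mean squared error over the frames that were hidden in `masked`."""
    hidden = [i for i, pose in enumerate(masked) if pose is MASK]
    if not hidden:
        return 0.0
    total, count = 0.0, 0
    for i in hidden:
        for p, t in zip(predicted[i], target[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy motion: 8 frames, each a 3-dimensional pose vector.
motion = [[float(t), 0.5 * t, 1.0] for t in range(8)]
masked = mask_motion(motion, keep_ratio=0.25)

# A perfect reconstruction scores zero on the hidden frames.
perfect_loss = reconstruction_loss(motion, motion, masked)
```

In this sketch a (masked, full) pair is the training example: a generator would take `masked` as input and be penalized by `reconstruction_loss` against `motion`. No text appears anywhere in that pair.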

Code & Models

Repositories

junfanlin/oohmg (PyTorch, official)

Taxonomy

Topics: Human Motion and Animation · Multimodal Machine Learning Applications · Human Pose and Action Recognition

Methods: Contrastive Language-Image Pre-training