Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation

Ling-An Zeng; Guohong Huang; Gaojie Wu; Wei-Shi Zheng

arXiv:2412.11193·cs.CV·December 17, 2024

TL;DR

Light-T2M is a lightweight, fast text-to-motion generation model that cuts parameter count and inference time while improving motion quality, by explicitly modeling local motion information and injecting textual information adaptively.

Contribution

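The paper proposes a lightweight T2M model built from four components: a Local Information Modeling Module, Mamba (applied to the T2M task to reduce parameters and GPU memory demands), a Pseudo-bidirectional Scan that replicates the effect of a bidirectional scan without adding parameters, and an Adaptive Textual Information Injector.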

Findings

01. Parameter count reduced to 10% of that of the state-of-the-art method

02. Inference time decreased by 16%

03. FID scores improved on benchmark datasets

Abstract

Despite the significant role text-to-motion (T2M) generation plays across various applications, current methods involve a large number of parameters and suffer from slow inference speeds, leading to high usage costs. To address this, we aim to design a lightweight model to reduce usage costs. First, unlike existing works that focus solely on global information modeling, we recognize the importance of local information modeling in the T2M task by reconsidering the intrinsic properties of human motion, leading us to propose a lightweight Local Information Modeling Module. Second, we introduce Mamba to the T2M task, reducing the number of parameters and GPU memory demands, and we have designed a novel Pseudo-bidirectional Scan to replicate the effects of a bidirectional scan without increasing parameter count. Moreover, we propose a novel Adaptive Textual Information Injector that more…
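The abstract does not spell out how the Pseudo-bidirectional Scan is built, so the following is only a toy sketch of the general idea it names: getting both past and future context out of a causal scan without adding parameters. It assumes a simple scalar recurrence as a stand-in for a Mamba block and reuses the same two weights for the forward and reversed passes. The function names `causal_scan` and `shared_weight_bidirectional_scan`, and the parameters `decay` and `gain`, are hypothetical and are not taken from the paper or its repository.

```python
from typing import List


def causal_scan(x: List[float], decay: float, gain: float) -> List[float]:
    """Toy causal recurrence h_t = decay * h_{t-1} + gain * x_t.

    This is an assumed stand-in for a causal sequence block such as Mamba;
    each output only depends on the current and earlier inputs.
    """
    h = 0.0
    out = []
    for value in x:
        h = decay * h + gain * value
        out.append(h)
    return out


def shared_weight_bidirectional_scan(
    x: List[float], decay: float, gain: float
) -> List[float]:
    """Run the same scan forward and on the reversed sequence, then sum.

    Both passes reuse (decay, gain), so the parameter count equals that of
    a single causal scan. This illustrates the goal stated in the abstract,
    not the authors' actual mechanism.
    """
    forward = causal_scan(x, decay, gain)
    # Reverse the input, scan, then flip the result back into original order.
    backward = causal_scan(x[::-1], decay, gain)[::-1]
    return [f + b for f, b in zip(forward, backward)]


if __name__ == "__main__":
    impulse = [0.0, 0.0, 1.0, 0.0, 0.0]
    # The causal scan only spreads the impulse to later frames.
    print(causal_scan(impulse, 0.5, 1.0))
    # The shared-weight version spreads it to earlier and later frames.
    print(shared_weight_bidirectional_scan(impulse, 0.5, 1.0))
```

In this sketch a single impulse in the middle of the sequence influences only later positions under the causal scan, while the shared-weight version lets it influence positions on both sides, with no extra weights introduced.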

Code & Models

Repositories

qinghuannn/light-t2m (PyTorch, official)

Videos

Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation · underline

Taxonomy

Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Human Motion and Animation

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces · Focus