T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences
Taeryung Lee, Fabien Baradel, Thomas Lucas, Kyoung Mu Lee, and Gregory Rogez

TL;DR
T2LM is a novel framework for long-term 3D human motion generation from multiple sentences, using a non-sequential training approach with a VQVAE and Transformer to produce smooth, realistic motion sequences.
Contribution
It introduces a non-sequential, continuous generation method combining VQVAE and Transformer models, enabling long-term motion synthesis without sequential training data.
Findings
Outperforms previous long-term motion generation models.
Produces smoother and more realistic motion sequences.
Competitive with state-of-the-art single-action models.
Abstract
In this paper, we address the challenging problem of long-term 3D human motion generation. Specifically, we aim to generate a long sequence of smoothly connected actions from a stream of multiple sentences (i.e., paragraph). Previous long-term motion generating approaches were mostly based on recurrent methods, using previously generated motion chunks as input for the next step. However, this approach has two drawbacks: 1) it relies on sequential datasets, which are expensive; 2) these methods yield unrealistic gaps between motions generated at each step. To address these issues, we introduce simple yet effective T2LM, a continuous long-term generation framework that can be trained without sequential data. T2LM comprises two components: a 1D-convolutional VQVAE, trained to compress motion to sequences of latent vectors, and a Transformer-based Text Encoder that predicts a latent…
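The abstract describes the VQVAE only as compressing motion into sequences of latent vectors. As a rough illustration of the generic vector-quantization lookup that any VQ-VAE relies on, the sketch below maps each latent vector to its nearest codebook entry. It is not the authors' implementation: the function names `quantize` and `dequantize`, the 2-D toy codebook, and the sample latents are all assumptions made for this example, and the 1D-convolutional encoder/decoder and training losses are omitted.

```python
from typing import List, Sequence


def quantize(latents: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each latent vector, the index of the nearest codebook entry.

    Distance is squared Euclidean, the usual choice in VQ-VAE lookups.
    """
    indices = []
    for z in latents:
        dists = [sum((a - b) ** 2 for a, b in zip(z, code)) for code in codebook]
        indices.append(dists.index(min(dists)))
    return indices


def dequantize(indices: Sequence[int],
               codebook: Sequence[Sequence[float]]) -> List[List[float]]:
    """Replace each index with its codebook vector (what a decoder would consume)."""
    return [list(codebook[i]) for i in indices]


# Hypothetical toy data: three 2-D codebook entries and three encoder outputs.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices = quantize(latents, codebook)       # discrete token sequence
quantized = dequantize(indices, codebook)   # quantized latent sequence
print(indices)
```

In a setup like the one the abstract outlines, a discrete index sequence of this kind is what a text-conditioned model could be trained to predict, though the details of that step are cut off in the abstract above.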
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Video Analysis and Summarization
Methods: VQ-VAE
