Single-Shot Motion Completion with Transformer
Yinglin Duan (1), Tianyang Shi (1), Zhengxia Zou (2), Yenan Lin (3), Zhehui Qian (3), Bohan Zhang (3), Yi Yuan (1) ((1) NetEase Fuxi AI Lab, (2) University of Michigan, (3) NetEase)

TL;DR
This paper introduces a transformer-based unified approach for various motion completion tasks, achieving state-of-the-art accuracy and real-time performance, with applications demonstrated in music-dance scenarios.
Contribution
The work presents a novel transformer framework that handles multiple motion completion scenarios within a single model, improving accuracy and efficiency over prior case-specific methods.
Findings
Achieves state-of-the-art accuracy across multiple motion completion benchmarks.
Operates in real-time with non-autoregressive prediction.
Effectively applied to music-dance motion synthesis.
Abstract
Motion completion is a challenging and long-discussed problem, which is of great significance in film and game applications. For different motion completion scenarios (in-betweening, in-filling, and blending), most previous methods deal with the completion problems with case-by-case designs. In this work, we propose a simple but effective method to solve multiple motion completion problems under a unified framework and achieve a new state-of-the-art accuracy under multiple evaluation settings. Inspired by the recent great success of attention-based models, we consider the completion as a sequence-to-sequence prediction problem. Our method consists of two modules: a standard transformer encoder with self-attention that learns long-range dependencies of input motions, and a trainable mixture embedding module that models temporal information and discriminates key-frames. Our method can…
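The idea of treating several completion scenarios as one sequence problem can be illustrated with a small sketch. It is not the authors' implementation: the function names, the mask layouts chosen for each scenario, and the additive combination of a position vector and a key-frame type vector are all assumptions made for illustration. The sketch only shows how the three scenarios could differ in which frames are marked as known, and how a mixture-style embedding might tell a downstream encoder both where a frame sits in time and whether it is a key-frame.

```python
def keyframe_mask(num_frames, known_frames):
    """Return a 0/1 list: 1 where the frame is given, 0 where it must be completed."""
    known = set(known_frames)
    return [1 if t in known else 0 for t in range(num_frames)]


def mixture_embed(motion, mask, pos_table, type_table):
    """Build encoder input tokens (hypothetical formulation, not from the paper).

    Missing frames are zeroed out. Each frame then receives a per-position
    vector (temporal information) and a per-type vector (key-frame or not).
    In a real model both tables would be trainable parameters.
    """
    tokens = []
    for t, (frame, m) in enumerate(zip(motion, mask)):
        token = [m * x + p + k
                 for x, p, k in zip(frame, pos_table[t], type_table[m])]
        tokens.append(token)
    return tokens


# Assumed mask layouts for an 8-frame clip; the scenarios differ only in
# which frames are known, so one model can consume all of them.
NUM_FRAMES = 8
SCENARIO_MASKS = {
    "in-betweening": keyframe_mask(NUM_FRAMES, [0, 1, 7]),         # past context + target
    "in-filling": keyframe_mask(NUM_FRAMES, [0, 3, 5, 7]),         # scattered key-frames
    "blending": keyframe_mask(NUM_FRAMES, [0, 1, 2, 5, 6, 7]),     # two clips, gap between
}
```

Under these assumptions, the resulting tokens would be passed to a transformer encoder that predicts every missing frame in a single forward pass, which matches the non-autoregressive prediction mentioned in the findings.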
Taxonomy
Topics: Video Analysis and Summarization · Human Motion and Animation · Human Pose and Action Recognition
