Single-Shot Motion Completion with Transformer

Yinglin Duan (1); Tianyang Shi (1); Zhengxia Zou (2); Yenan Lin (3); Zhehui Qian (3); Bohan Zhang (3); Yi Yuan (1) ((1) NetEase Fuxi AI Lab; (2) University of Michigan; (3) NetEase)

arXiv:2103.00776 · cs.CV · March 2, 2021 · 29 cites

TL;DR

This paper introduces a transformer-based unified approach for various motion completion tasks, achieving state-of-the-art accuracy and real-time performance, with applications demonstrated in music-dance scenarios.

Contribution

The work presents a novel transformer framework that handles multiple motion completion scenarios within a single model, improving accuracy and efficiency over prior case-specific methods.

Findings

01 Achieves state-of-the-art accuracy across multiple motion completion benchmarks.

02 Operates in real time with non-autoregressive prediction.

03 Is effectively applied to music-dance motion synthesis.

Abstract

Motion completion is a challenging and long-discussed problem of great significance in film and game applications. For the different motion completion scenarios (in-betweening, in-filling, and blending), most previous methods handle completion with case-by-case designs. In this work, we propose a simple but effective method that solves multiple motion completion problems under a unified framework and achieves new state-of-the-art accuracy under multiple evaluation settings. Inspired by the recent success of attention-based models, we treat completion as a sequence-to-sequence prediction problem. Our method consists of two modules: a standard transformer encoder with self-attention that learns long-range dependencies of input motions, and a trainable mixture embedding module that models temporal information and discriminates key-frames. Our method can…
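To make the two ideas in the abstract concrete, here is a minimal sketch of how one input format could cover several completion scenarios, and how a mixture embedding might tag each frame with its position and key-frame status. It is an illustration under stated assumptions, not the authors' implementation: the names `make_keyframe_mask`, `MixtureEmbedding`, and `build_tokens`, the two frame types, the zeroing of missing frames, the random (untrained) embedding tables, and the key-frame indices chosen for each scenario are all hypothetical.

```python
import random


def make_keyframe_mask(num_frames, keyframe_indices):
    """Return a 0/1 list: 1 marks a given key-frame, 0 a frame to complete."""
    keys = set(keyframe_indices)
    return [1 if t in keys else 0 for t in range(num_frames)]


class MixtureEmbedding:
    """Hypothetical sketch: a per-position vector plus a per-frame-type vector.

    In a real model both tables would be trainable parameters; here they are
    fixed random values so the example runs with the standard library only.
    """

    def __init__(self, max_frames, dim, seed=0):
        rng = random.Random(seed)
        self.position = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                         for _ in range(max_frames)]
        # Assumed frame types: index 0 = missing frame, index 1 = key-frame.
        self.frame_type = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                           for _ in range(2)]

    def __call__(self, mask):
        return [
            [p + k for p, k in zip(self.position[t], self.frame_type[flag])]
            for t, flag in enumerate(mask)
        ]


def build_tokens(features, mask, embedding):
    """Hide missing frames (zeros, an assumption) and add the embedding."""
    extra = embedding(mask)
    tokens = []
    for feat, flag, emb in zip(features, mask, extra):
        visible = feat if flag else [0.0] * len(feat)
        tokens.append([v + e for v, e in zip(visible, emb)])
    return tokens


num_frames, dim = 8, 4
# Illustrative key-frame layouts only; not the paper's evaluation settings.
scenarios = {
    "in-betweening": [0, 1, 7],        # past context plus one target frame
    "in-filling": [0, 3, 7],           # sparse key-frames
    "blending": [0, 1, 2, 5, 6, 7],    # two clips with a gap between them
}
embedding = MixtureEmbedding(num_frames, dim)
features = [[float(t)] * dim for t in range(num_frames)]  # stand-in pose features

for name, keys in scenarios.items():
    mask = make_keyframe_mask(num_frames, keys)
    tokens = build_tokens(features, mask, embedding)
    print(name, mask, len(tokens))
```

In this sketch only the mask differs between scenarios. The resulting tokens would then be passed to a transformer encoder that outputs every frame in one forward pass, which is what the non-autoregressive, single-model claim in the listing refers to.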

Code & Models

Repositories

FuxiCV/SSMCT (Official)

Taxonomy

Topics: Video Analysis and Summarization · Human Motion and Animation · Human Pose and Action Recognition