Human MotionFormer: Transferring Human Motions with Vision Transformers

Hongyu Liu, Xintong Han, Chengbin Jin, Lihui Qian, Huawei Wei, Zhe Lin, Faqiang Wang, Haoye Dong, Yibing Song, Jia Xu, and Qifeng Chen

arXiv:2302.11306·cs.CV·February 28, 2023·5 cites

TL;DR

Human MotionFormer introduces a hierarchical Vision Transformer framework that effectively captures both large and subtle human motion details for high-quality motion transfer, setting new state-of-the-art results.

Contribution

The paper proposes a novel hierarchical ViT architecture with global and local perception modules and a mutual learning loss for improved human motion transfer.

Findings

01. Achieves state-of-the-art performance in motion transfer quality.

02. Effectively captures both large and subtle motions.

03. Demonstrates superior qualitative and quantitative results.

Abstract

Human motion transfer aims to transfer motions from a target dynamic person to a source static one for motion synthesis. An accurate matching between the source person and the target motion in both large and subtle motion changes is vital for improving the transferred motion quality. In this paper, we propose Human MotionFormer, a hierarchical ViT framework that leverages global and local perceptions to capture large and subtle motion matching, respectively. It consists of two ViT encoders to extract input features (i.e., a target motion image and a source human image) and a ViT decoder with several cascaded blocks for feature matching and motion transfer. In each block, we set the target motion feature as Query and the source person as Key and Value, calculating the cross-attention maps to conduct a global feature matching. Further, we introduce a convolutional layer to improve the…
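The global matching step described in the abstract (target motion feature as Query, source person feature as Key and Value) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-head scaled dot-product attention with linear projections, and every name in it (cross_attention, w_q, w_k, w_v, the token counts and feature size) is hypothetical and chosen only for illustration. It uses plain Python lists so it runs without any dependencies.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(target_motion, source_person, w_q, w_k, w_v):
    """Assumed single-head cross-attention.

    Query comes from the target motion tokens; Key and Value come from
    the source person tokens, as stated in the abstract.
    """
    q = matmul(target_motion, w_q)   # (n_target, dim)
    k = matmul(source_person, w_k)   # (n_source, dim)
    v = matmul(source_person, w_v)   # (n_source, dim)
    scale = math.sqrt(len(q[0]))     # standard 1/sqrt(dim) scaling (assumption)
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    attn = [softmax(row) for row in scores]  # each target token attends over all source tokens
    return matmul(attn, v), attn


def random_matrix(rows, cols, rng):
    """Random Gaussian matrix standing in for features or learned weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_target, n_source, dim = 6, 8, 4  # hypothetical token counts and feature size

target_motion = random_matrix(n_target, dim, rng)
source_person = random_matrix(n_source, dim, rng)
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))

out, attn = cross_attention(target_motion, source_person, w_q, w_k, w_v)
```

Each row of attn is a distribution over source tokens for one target motion token, which is what makes the matching global: every target position can draw appearance information from any source position. The abstract's convolutional refinement is not sketched here, because the text describing it is truncated.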

Code & Models

Repositories

kumapowerliu/human-motionformer (official)

Videos

Human MotionFormer: Transferring Human Motions with Vision Transformers · SlidesLive

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Advanced Vision and Imaging