TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction

Sibo Tian, Minghui Zheng, and Xiao Liang

arXiv:2307.16106·cs.RO·August 1, 2023

TL;DR

TransFusion introduces a Transformer-based diffusion model for 3D human motion prediction that balances accuracy and diversity, leveraging frequency domain modeling and lightweight design for improved performance.

Contribution

The paper presents a novel diffusion model using Transformers and frequency domain techniques for more accurate and diverse 3D human motion prediction, with a simplified input conditioning approach.
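To make "frequency domain techniques" concrete, here is a minimal sketch of one common way to move a motion sequence into the frequency domain: a discrete cosine transform (DCT) along the time axis, keeping only the lowest-frequency coefficients. The choice of DCT, the function names, and the array shapes are assumptions for illustration; this page does not specify which transform or how many coefficients the authors use.

```python
import numpy as np

def dct_matrix(n_frames: int) -> np.ndarray:
    """Build an orthonormal DCT-II basis of shape (n_frames, n_frames)."""
    k = np.arange(n_frames)[:, None]  # frequency index (rows)
    t = np.arange(n_frames)[None, :]  # time index (columns)
    basis = np.sqrt(2.0 / n_frames) * np.cos(np.pi * (t + 0.5) * k / n_frames)
    basis[0] /= np.sqrt(2.0)  # rescale the constant row so all rows have unit norm
    return basis

def to_frequency(motion: np.ndarray, n_coeffs: int) -> np.ndarray:
    """Project a (n_frames, n_features) motion onto the first n_coeffs DCT bases."""
    basis = dct_matrix(motion.shape[0])
    return basis[:n_coeffs] @ motion

def to_time(coeffs: np.ndarray, n_frames: int) -> np.ndarray:
    """Reconstruct a (n_frames, n_features) motion from truncated DCT coefficients."""
    basis = dct_matrix(n_frames)
    return basis[: coeffs.shape[0]].T @ coeffs

# Hypothetical toy sequence: 20 frames, 6 features (e.g. 2 joints x 3 coordinates).
rng = np.random.default_rng(0)
motion = np.cumsum(rng.normal(scale=0.05, size=(20, 6)), axis=0)

coeffs = to_frequency(motion, n_coeffs=5)  # compact, low-frequency representation
smooth = to_time(coeffs, n_frames=20)      # smoothed reconstruction of the motion
```

Dropping high-frequency coefficients shrinks the representation and suppresses frame-to-frame jitter, which is one reason frequency-domain inputs are popular for motion models.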

Findings

01. Outperforms existing models on benchmark datasets

02. Generates more realistic and diverse human motion sequences

03. Maintains high prediction accuracy with a lightweight architecture

Abstract

Predicting human motion plays a crucial role in ensuring a safe and effective human-robot close collaboration in intelligent remanufacturing systems of the future. Existing works can be categorized into two groups: those focusing on accuracy, predicting a single future motion, and those generating diverse predictions based on observations. The former group fails to address the uncertainty and multi-modal nature of human motion, while the latter group often produces motion sequences that deviate too far from the ground truth or become unrealistic within historical contexts. To tackle these issues, we propose TransFusion, an innovative and practical diffusion-based model for 3D human motion prediction which can generate samples that are more likely to happen while maintaining a certain level of diversity. Our model leverages Transformer as the backbone with long skip connections between…
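As background for the "diffusion-based" part of the abstract, the sketch below shows the generic forward (noising) step used by standard denoising diffusion models: a clean sample is blended with Gaussian noise according to a schedule, and a network is later trained to undo that noise. The linear schedule, its endpoint values, the step count, and the function names are assumptions taken from the generic diffusion setup, not details confirmed by this page.

```python
import numpy as np

def make_alpha_bar(n_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> np.ndarray:
    """Cumulative signal-retention factors for a linear noise schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, n_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0: np.ndarray, t: int, alpha_bar: np.ndarray, rng: np.random.Generator):
    """Sample x_t from q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return x_t, noise

# Hypothetical example: noise a toy (20 frames, 6 features) motion at step 500 of 1000.
rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar(n_steps=1000)
x0 = rng.normal(size=(20, 6))
x_t, noise = add_noise(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

Because sampling starts from fresh noise each time, running the learned reverse process repeatedly on the same observed history yields multiple candidate futures, which is how a diffusion model can express the diversity the abstract refers to.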

Code & Models

Repositories

sibotian96/TransFusion (PyTorch, official)

Taxonomy

Topics: Optical Imaging and Spectroscopy Techniques · Human Pose and Action Recognition

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Position-Wise Feed-Forward Layer · Linear Layer · Dense Connections · Label Smoothing · Dropout · Adam · Absolute Position Encodings