Robust Human Motion Forecasting using Transformer-based Model

Esteve Valls Mascaro; Shuo Ma; Hyemin Ahn; Dongheui Lee

arXiv:2302.08274 · cs.CV · April 9, 2024

TL;DR

This paper introduces a lightweight, robust Transformer-based model for real-time 3D human motion forecasting that outperforms existing models in accuracy and efficiency, especially under occlusion and noisy conditions.

Contribution

The proposed 2-Channel Transformer (2CH-TR) is a novel model that effectively exploits spatio-temporal information for short and long-term human motion prediction, with improved robustness and speed.
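To make the idea of exploiting spatio-temporal information concrete, the sketch below applies self-attention along two axes of a pose sequence: across frames for each joint (temporal) and across joints within each frame (spatial), then adds both results to the input. This is an illustrative assumption about what two attention channels could look like, not the authors' 2CH-TR architecture. The function names, the single-head attention, the residual sum, the joint count, the feature size, and the 10-frame input (400 ms at an assumed 25 fps) are all hypothetical choices made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    # Single-head scaled dot-product self-attention.
    # Attention runs over the second-to-last axis of x; leading axes are batched.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def two_channel_block(x, temporal_w, spatial_w):
    # x has shape (frames, joints, features).
    # Temporal channel: for each joint, attend over frames.
    temporal = self_attention(np.swapaxes(x, 0, 1), *temporal_w)
    temporal = np.swapaxes(temporal, 0, 1)
    # Spatial channel: for each frame, attend over joints.
    spatial = self_attention(x, *spatial_w)
    # Hypothetical fusion: residual sum of both channels.
    return x + temporal + spatial

rng = np.random.default_rng(0)
frames, joints, d = 10, 17, 8  # assumed sizes, for illustration only

def make_weights():
    # Random query, key, and value projections standing in for learned weights.
    return tuple(rng.normal(scale=0.1, size=(d, d)) for _ in range(3))

x = rng.normal(size=(frames, joints, d))
out = two_channel_block(x, make_weights(), make_weights())
print(out.shape)  # (10, 17, 8)
```

The output keeps the input shape, so blocks of this kind can be stacked. A real model would add multi-head attention, layer normalization, and feed-forward layers, all of which appear in the taxonomy for this paper.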

Findings

01. Outperforms ST-Transformer in accuracy

02. Reduces short-term mean squared error by 8.89%

03. Operates efficiently in noisy, occluded environments

Abstract

Comprehending human motion is a fundamental challenge for developing Human-Robot Collaborative applications. Computer vision researchers have addressed this field by focusing only on reducing prediction error, without taking into account the requirements for deploying these models on robots. In this paper, we propose a new Transformer-based model that simultaneously handles real-time 3D human motion forecasting in the short and long term. Our 2-Channel Transformer (2CH-TR) efficiently exploits the spatio-temporal information of a short observed sequence (400 ms) and achieves accuracy competitive with the current state of the art. 2CH-TR stands out for the efficiency of its Transformer, being lighter and faster than its competitors. In addition, our model is tested in conditions where the human motion is severely occluded, demonstrating its…

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Linear Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Adam · Position-Wise Feed-Forward Layer · Softmax