Massively Multi-Person 3D Human Motion Forecasting with Scene Context

Felix B Mueller; Julian Tanke; Juergen Gall

arXiv:2409.12189·cs.CV·September 19, 2024

TL;DR

This paper introduces SAST, a scene-aware social transformer model that effectively forecasts long-term multi-person 3D human motion by incorporating scene context and interactions, outperforming previous methods on realism and diversity.

Contribution

The paper presents SAST, a scene-aware social transformer that captures interactions among varying numbers of people and objects for long-term 3D human motion prediction.

Findings

01 Outperforms previous models in realism and diversity metrics

02 Effectively models interactions among multiple people and objects

03 Achieves superior results on the Humans in Kitchens dataset

Abstract

Forecasting long-term 3D human motion is challenging: the stochasticity of human behavior makes it hard to generate realistic human motion from the input sequence alone. Information on the scene environment and the motion of nearby people can greatly aid the generation process. We propose a scene-aware social transformer model (SAST) to forecast long-term (10 s) human motion. Unlike previous models, our approach can model interactions between both widely varying numbers of people and objects in a scene. We combine a temporal convolutional encoder-decoder architecture with a Transformer-based bottleneck that allows us to efficiently combine motion and scene information. We model the conditional motion distribution using denoising diffusion models. We benchmark our approach on the Humans in Kitchens dataset, which contains 1 to 16 persons and 29 to 50 objects that are visible…
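
The abstract says the conditional motion distribution is modeled with denoising diffusion. As a rough illustration of the forward (noising) half of that idea, the sketch below uses the standard DDPM closed form q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I). It is not taken from the SAST code: the linear schedule, the 1000 steps, the beta range, the toy 12-value "motion" vector, and all function names are assumptions chosen for illustration, and the scene and social conditioning described in the abstract is omitted entirely.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed defaults, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) and return it with the noise that was used."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * value + sigma * eps for value, eps in zip(x0, noise)]
    return xt, noise


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Toy stand-in for a flattened motion clip: 4 frames x 3 coordinates.
x0 = [0.1 * i for i in range(12)]
xt, noise = add_noise(x0, 500, alpha_bars, rng)
```

In a typical diffusion setup, a denoising network would be trained to recover `noise` from `xt`, the step index, and any conditioning signal; in the paper's setting that conditioning would include past motion and scene context.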

Code & Models

Repositories

felixbmuller/sast (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Gait Recognition and Analysis

Methods: Diffusion