Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting

Xiaogang Peng; Siyuan Mao; Zizhao Wu

arXiv:2303.05095 · cs.CV · March 14, 2023 · 1 citation

TL;DR

This paper introduces TBIFormer, a transformer-based model that captures body-part interactions for multi-person pose forecasting. The authors report that it outperforms existing methods on multiple datasets.

Contribution

The paper proposes a novel Trajectory-Aware Body Interaction Transformer with a body-part sequence representation and a new spatial encoding, advancing multi-person pose forecasting.

Findings

1. Outperforms state-of-the-art methods on multiple datasets
2. Effective modeling of body-part interactions improves forecasting accuracy
3. Handles both short- and long-term pose prediction

Abstract

Multi-person pose forecasting remains a challenging problem, especially in modeling fine-grained human body interaction in complex crowd scenarios. Existing methods typically represent the whole pose sequence as a temporal series, yet overlook interactive influences among people based on skeletal body parts. In this paper, we propose a novel Trajectory-Aware Body Interaction Transformer (TBIFormer) for multi-person pose forecasting via effectively modeling body part interactions. Specifically, we construct a Temporal Body Partition Module that transforms all the pose sequences into a Multi-Person Body-Part sequence to retain spatial and temporal information based on body semantics. Then, we devise a Social Body Interaction Self-Attention (SBI-MSA) module, utilizing the transformed sequence to learn body part dynamics for inter- and intra-individual interactions. Furthermore, different…
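The body-part sequence idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's design: the 15-joint skeleton, the five equal-sized parts in `BODY_PARTS`, the one-token-per-frame layout, and the function name `to_body_part_sequence` are all hypothetical. The sketch only shows how poses of several people might be flattened into one token sequence so that attention over it could cover both intra- and inter-individual pairs.

```python
import numpy as np

# Hypothetical grouping of a 15-joint skeleton into 5 body parts.
# The paper's actual partition is not given in the abstract above.
BODY_PARTS = {
    "torso": [0, 1, 2],
    "left_arm": [3, 4, 5],
    "right_arm": [6, 7, 8],
    "left_leg": [9, 10, 11],
    "right_leg": [12, 13, 14],
}


def to_body_part_sequence(poses, parts=BODY_PARTS):
    """Flatten poses of shape (N, T, J, 3) into body-part tokens.

    Returns tokens of shape (N * P * T, D), where P is the number of parts
    and D = joints_per_part * 3, plus the person and part index per token.
    Assumes every part has the same number of joints.
    """
    n_people, n_frames = poses.shape[:2]
    tokens, person_ids, part_ids = [], [], []
    for n in range(n_people):
        for p, joints in enumerate(parts.values()):
            # One token per frame: the part's joint coordinates, flattened.
            part_tokens = poses[n][:, joints, :].reshape(n_frames, -1)
            tokens.append(part_tokens)
            person_ids.append(np.full(n_frames, n))
            part_ids.append(np.full(n_frames, p))
    return (
        np.concatenate(tokens, axis=0),
        np.concatenate(person_ids),
        np.concatenate(part_ids),
    )


# Toy input: 2 people, 4 frames, 15 joints, 3D coordinates.
poses = np.arange(2 * 4 * 15 * 3, dtype=float).reshape(2, 4, 15, 3)
tokens, person_ids, part_ids = to_body_part_sequence(poses)

# Token pairs from the same person (intra-individual) versus different
# people (inter-individual); a single attention map over `tokens` spans both.
same_person = person_ids[:, None] == person_ids[None, :]
```

Keeping `person_ids` and `part_ids` alongside the tokens is a design choice of this sketch: it lets later stages tell which pairs are intra-individual and which are inter-individual without reshaping the sequence.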

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Virtual Reality Applications and Impacts

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Softmax · Dense Connections · Layer Normalization · Residual Connection · Byte Pair Encoding · Absolute Position Encodings