Trajectory-Aware Body Interaction Transformer for Multi-Person Pose Forecasting
Xiaogang Peng, Siyuan Mao, Zizhao Wu

TL;DR
This paper introduces TBIFormer, a transformer-based model that effectively captures body part interactions for multi-person pose forecasting, outperforming existing methods on multiple datasets.
Contribution
The paper proposes a novel Trajectory-Aware Body Interaction Transformer with a body-part sequence representation and a new spatial encoding, advancing multi-person pose forecasting.
Findings
Outperforms state-of-the-art methods on multiple datasets
Effective modeling of body part interactions improves forecasting accuracy
Handles both short- and long-term pose prediction
Abstract
Multi-person pose forecasting remains a challenging problem, especially in modeling fine-grained human body interaction in complex crowd scenarios. Existing methods typically represent the whole pose sequence as a temporal series, yet overlook interactive influences among people based on skeletal body parts. In this paper, we propose a novel Trajectory-Aware Body Interaction Transformer (TBIFormer) for multi-person pose forecasting via effectively modeling body part interactions. Specifically, we construct a Temporal Body Partition Module that transforms all the pose sequences into a Multi-Person Body-Part sequence to retain spatial and temporal information based on body semantics. Then, we devise a Social Body Interaction Self-Attention (SBI-MSA) module, utilizing the transformed sequence to learn body part dynamics for inter- and intra-individual interactions. Furthermore, different…
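The body-part sequence idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 15-joint skeleton, the five-part grouping in `BODY_PARTS`, the token layout, and the helper names `to_body_part_sequence` and `self_attention` are all assumptions made for illustration. The attention step is plain scaled dot-product self-attention with identity projections, standing in for SBI-MSA only to show that one flattened sequence lets every body part attend to parts of the same person and of other people.

```python
import math

# Hypothetical grouping of a 15-joint skeleton into 5 body parts.
# This is an assumption for illustration, not the joint layout used in the paper.
BODY_PARTS = {
    "torso": [0, 1, 2],
    "left_arm": [3, 4, 5],
    "right_arm": [6, 7, 8],
    "left_leg": [9, 10, 11],
    "right_leg": [12, 13, 14],
}


def to_body_part_sequence(poses):
    """Flatten poses[person][frame][joint] = (x, y, z) into one token list.

    Each token is (person, part_name, frame, features), where features holds
    the concatenated coordinates of the joints belonging to that part.
    """
    tokens = []
    for person, frames in enumerate(poses):
        for part_name, joint_ids in BODY_PARTS.items():
            for frame, joints in enumerate(frames):
                features = [c for j in joint_ids for c in joints[j]]
                tokens.append((person, part_name, frame, features))
    return tokens


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(features[0])
    out = []
    for q in features:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in features]
        weights = softmax(scores)
        # Weighted sum over all tokens, so parts of different people can mix.
        out.append([sum(w * v[i] for w, v in zip(weights, features)) for i in range(d)])
    return out


# Toy input: 2 people, 3 frames, 15 joints with made-up coordinates.
poses = [
    [[(0.1 * p, 0.01 * t, 0.001 * j) for j in range(15)] for t in range(3)]
    for p in range(2)
]
tokens = to_body_part_sequence(poses)
attended = self_attention([tok[3] for tok in tokens])
print(len(tokens), len(tokens[0][3]), len(attended))  # 30 9 30
```

With 2 people, 5 parts, and 3 frames the sequence has 30 tokens of 9 coordinates each. A real model would add learned projections, multiple heads, and a position encoding on top of this layout.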
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Virtual Reality Applications and Impacts
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dropout · Softmax · Dense Connections · Layer Normalization · Residual Connection · Byte Pair Encoding · Absolute Position Encodings
