Spatiotemporal-Untrammelled Mixture of Experts for Multi-Person Motion Prediction

Zheng Yin; Chengjian Li; Xiangbo Shu; Meiqi Cao; Rui Yan; Jinhui Tang

arXiv:2512.21707 · cs.CV · December 29, 2025

TL;DR

This paper introduces ST-MoE, a mixture-of-experts model for multi-person motion prediction that captures complex spatio-temporal dependencies more flexibly and at lower computational cost. It reports higher accuracy than existing methods on four datasets, with 41.38% fewer parameters and 3.6x faster training.

Contribution

The paper proposes the Spatiotemporal-Untrammelled Mixture of Experts (ST-MoE), integrating diverse experts and bidirectional Mamba mechanisms to enhance modeling flexibility and efficiency in human motion prediction.
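To make the mixture-of-experts idea concrete, the following is a minimal sketch of soft gating over four experts, written in plain Python. It is not the paper's architecture: the linear experts, the softmax gate, and the names `make_linear_expert`, `moe_forward`, and `gate_weights` are all illustrative assumptions, and the Mamba components are omitted entirely.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_linear_expert(dim, rng):
    """Hypothetical expert: a random dim x dim linear map (assumption)."""
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def moe_forward(x, experts, gate_weights):
    """Mix expert outputs with input-dependent softmax gates."""
    # One gating score per expert, computed from the input itself.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    gates = softmax(scores)
    outputs = [expert(x) for expert in experts]
    # Weighted sum of the expert outputs, one feature at a time.
    mixed = [
        sum(g * out[i] for g, out in zip(gates, outputs))
        for i in range(len(x))
    ]
    return mixed, gates


rng = random.Random(0)
DIM, NUM_EXPERTS = 3, 4  # four experts, mirroring the count in the abstract
experts = [make_linear_expert(DIM, rng) for _ in range(NUM_EXPERTS)]
gate_weights = [
    [rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)
]
mixed, gates = moe_forward([0.5, -0.2, 0.1], experts, gate_weights)
```

Because the gates depend on the input, different motion features can lean on different experts. In this sketch every expert runs on every input; sparse variants that evaluate only the top-scoring experts are a common alternative, and the source does not say which the paper uses.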

Findings

01 Exceeds state-of-the-art accuracy on four datasets.

02 Reduces model parameters by 41.38%.

03 Trains 3.6x faster.

Abstract

Comprehensively and flexibly capturing the complex spatio-temporal dependencies of human motion is critical for multi-person motion prediction. Existing methods grapple with two primary limitations: i) inflexible spatio-temporal representation, due to reliance on positional encodings for capturing spatio-temporal information; and ii) high computational cost, stemming from the quadratic time complexity of conventional attention mechanisms. To overcome these limitations, we propose the Spatiotemporal-Untrammelled Mixture of Experts (ST-MoE), which flexibly explores complex spatio-temporal dependencies in human motion and significantly reduces computational cost. To adaptively mine complex spatio-temporal patterns from human motion, our model incorporates four distinct types of spatio-temporal experts, each specializing in capturing different spatial or temporal dependencies. To reduce the…
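The cost argument in the abstract can be illustrated with a small sketch. Self-attention over T frames scores every pair of frames, so the number of score computations grows as T^2, whereas a recurrent scan run once forward and once backward touches each frame only twice. The exponential-moving-average recurrence below is a hypothetical stand-in chosen for brevity, not Mamba's selective state-space update, and all function names are assumptions.

```python
def attention_pair_count(seq_len):
    """Number of pairwise scores a full self-attention layer computes."""
    return seq_len * seq_len


def scan_step_count(seq_len):
    """Recurrence steps for one forward plus one backward pass."""
    return 2 * seq_len


def bidirectional_scan(values, decay=0.5):
    """Toy bidirectional linear-time scan (assumed EMA recurrence)."""

    def scan(seq):
        state, out = 0.0, []
        for v in seq:
            # Each step only uses the previous state, so cost is linear.
            state = decay * state + (1.0 - decay) * v
            out.append(state)
        return out

    forward = scan(values)
    # Reverse, scan, and reverse back so each position sees its future.
    backward = scan(values[::-1])[::-1]
    return [f + b for f, b in zip(forward, backward)]


for frames in (25, 50, 100):
    print(frames, attention_pair_count(frames), scan_step_count(frames))
```

Doubling the sequence length quadruples the attention count but only doubles the scan count, which is the gap the abstract refers to when it cites quadratic time complexity.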

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Gait Recognition and Analysis