TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting

Rohan Choudhury; Kris Kitani; Laszlo A. Jeni

arXiv:2309.07910 · cs.CV · September 15, 2023 · 1 citation

TL;DR

TEMPO is an efficient multi-view model that improves 3D human pose estimation, tracking, and forecasting by leveraging spatiotemporal features, achieving higher accuracy and speed without scene-specific tuning.

Contribution

We introduce TEMPO, a novel model that reduces computation while enhancing multi-view pose estimation, tracking, and forecasting through a unified spatiotemporal representation.

Findings

01. Achieves 10% better MPJPE than TesseTrack.

02. Provides a 33x increase in FPS over TesseTrack.

03. Generalizes across datasets without scene-specific fine-tuning.
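
The MPJPE figure in the findings refers to mean per-joint position error, the average Euclidean distance between predicted and ground-truth 3D joints. The following is a minimal sketch of that metric and of how a relative improvement such as "10% better" would be computed. The function names and the toy coordinates are hypothetical and chosen for illustration; they are not taken from the paper or its code.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance over joints.

    pred and gt are equal-length sequences of (x, y, z) joint positions.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def relative_improvement(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Toy data (hypothetical, not from the paper): three joints, units in mm.
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 5.0), (0.0, 200.0, 0.0)]

error = mpjpe(pred, gt)  # per-joint distances are 5, 5 and 0 mm
print(f"MPJPE: {error:.3f} mm")

# A baseline error of 20.0 reduced to 18.0 is a 10% relative improvement.
print(f"Relative improvement: {relative_improvement(20.0, 18.0):.0%}")
```

Because MPJPE is an error, "better" here means lower: a 10% improvement corresponds to an error that is 0.9 times the baseline value.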

Abstract

Existing volumetric methods for 3D human pose estimation are accurate, but computationally expensive and optimized for single time-step prediction. We present TEMPO, an efficient multi-view pose estimation model that learns a robust spatiotemporal representation, improving pose accuracy while also tracking and forecasting human pose. We significantly reduce computation compared to the state-of-the-art by recurrently computing per-person 2D pose features, fusing both spatial and temporal information into a single representation. In doing so, our model is able to use spatiotemporal context to predict more accurate human poses without sacrificing efficiency. We further use this representation to track human poses over time as well as predict future poses. Finally, we demonstrate that our model is able to generalize across datasets without scene-specific fine-tuning. TEMPO…

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Hand Gesture Recognition Systems