KASportsFormer: Kinematic Anatomy Enhanced Transformer for 3D Human Pose Estimation on Short Sports Scene Video

Zhuoer Yin; Calvin Yeung; Tomohiro Suzuki; Ryota Tanaka; Keisuke Fujii

arXiv:2507.20763 · cs.CV · July 29, 2025

TL;DR

KASportsFormer is a transformer-based framework that leverages kinematic anatomy features to improve 3D human pose estimation in short sports videos, effectively handling motion blur, occlusions, and instantaneous actions.

Contribution

The paper introduces a novel kinematic anatomy-informed transformer model specifically designed for sports scenarios, enhancing pose estimation accuracy in challenging short video clips.
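
To make "kinematic anatomy features" concrete, the sketch below derives bone vectors, bone lengths, and the angles between consecutive bones from 2D keypoints. This is an illustration only: the skeleton layout `BONES` and the helper names `bone_features` and `joint_angles` are hypothetical, and this page does not specify how the paper's module actually builds or integrates its features.

```python
import numpy as np

# Hypothetical 5-joint kinematic chain; each pair is (parent, child).
# A real skeleton (e.g. 17 joints) would have a tree-shaped list instead.
BONES = [(0, 1), (1, 2), (2, 3), (3, 4)]


def bone_features(keypoints, bones=BONES):
    """Return bone vectors (B, 2) and bone lengths (B,) for (J, 2) keypoints."""
    kp = np.asarray(keypoints, dtype=float)
    parents = np.array([p for p, _ in bones])
    children = np.array([c for _, c in bones])
    vecs = kp[children] - kp[parents]        # child minus parent
    lengths = np.linalg.norm(vecs, axis=-1)  # Euclidean bone length
    return vecs, lengths


def joint_angles(vecs, eps=1e-8):
    """Angle in radians between each pair of consecutive bones in the chain."""
    a, b = vecs[:-1], vecs[1:]
    denom = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + eps
    cos = (a * b).sum(axis=-1) / denom
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return np.arccos(np.clip(cos, -1.0, 1.0))


# Toy staircase pose: every bone has unit length and turns by 90 degrees.
pose = [[0, 0], [1, 0], [1, 1], [2, 1], [2, 2]]
vecs, lengths = bone_features(pose)
angles = joint_angles(vecs)
```

Features like these depend on relative joint geometry rather than absolute image position, which is one plausible reason anatomy-based cues could help when individual keypoints are blurred or occluded.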

Findings

01

Achieves state-of-the-art MPJPE (mean per-joint position error) of 58.0 mm on SportsPose and 34.3 mm on WorldPose.

02

Effectively captures instantaneous sports motions with improved understanding of kinematic features.

03

Demonstrates robustness against motion blur, occlusions, and domain shifts in sports videos.
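
For readers unfamiliar with the MPJPE metric cited in the findings, here is a minimal sketch of how it is commonly computed: the Euclidean distance between each predicted and ground-truth joint, averaged over all joints and frames. The function name `mpjpe` and the array layout `(frames, joints, 3)` are assumptions for illustration. Many evaluation protocols also align the root joint before measuring; that step is omitted here, and this page does not state which protocol the paper uses.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (frames, joints, 3) in the same unit (e.g. mm).
    Returns the mean Euclidean distance over all frames and joints.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    per_joint = np.linalg.norm(pred - gt, axis=-1)  # shape (frames, joints)
    return float(per_joint.mean())


# Toy check: every joint is displaced by (3, 4, 0), i.e. 5 units of error.
gt = np.zeros((2, 3, 3))
pred = gt + np.array([3.0, 4.0, 0.0])
error = mpjpe(pred, gt)  # 5.0
```

Because the result is in the same unit as the inputs, a reported value such as 58.0 mm means the predicted joints are on average 58 mm away from their ground-truth positions.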

Abstract

Recent transformer-based approaches have demonstrated impressive performance on real-world 3D human pose estimation problems. Although these approaches achieve strong results on benchmark datasets, they tend to fall short in sports scenarios, where human movements are more complicated than daily-life actions and estimation is hindered by motion blur, occlusions, and domain shifts. Moreover, because critical motions in a sports game often finish within moments (e.g., shooting), the ability to focus on momentary actions is becoming a crucial factor in sports analysis, yet current methods appear to struggle with such instantaneous scenarios. To overcome these limitations, we introduce KASportsFormer, a novel transformer-based 3D pose estimation framework for sports that incorporates a kinematic anatomy-informed feature representation and integration module. In which the…

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Gait Recognition and Analysis