KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation

Jihua Peng; Yanghong Zhou; P.Y. Mok

arXiv:2404.00658 · cs.CV · April 3, 2024 · 3 cites

TL;DR

KTPFormer adds kinematic and trajectory prior modules to a transformer architecture so that 3D human pose estimation can model spatial and temporal dependencies, achieving superior results on benchmark datasets.

Contribution

The paper proposes two novel prior attention modules, Kinematics Prior Attention (KPA) and Trajectory Prior Attention (TPA), that incorporate anatomical and motion trajectory knowledge into transformers for improved 3D human pose estimation.

Findings

01 Outperforms state-of-the-art methods on benchmarks
02 Modules are lightweight and easily integrable into existing models
03 Achieves significant accuracy improvements with minimal computational overhead

Abstract

This paper presents a novel Kinematics and Trajectory Prior Knowledge-Enhanced Transformer (KTPFormer), which overcomes a weakness of existing transformer-based methods for 3D human pose estimation: the Q, K, V vectors in their self-attention mechanisms are all derived by simple linear mapping. We propose two prior attention modules, namely Kinematics Prior Attention (KPA) and Trajectory Prior Attention (TPA), which take advantage of the known anatomical structure of the human body and motion trajectory information to facilitate effective learning of global dependencies and features in the multi-head self-attention. KPA models kinematic relationships in the human body by constructing a topology of kinematics, while TPA builds a trajectory topology to learn the information of joint motion trajectory across frames. Yielding Q, K, V vectors with prior knowledge, the two…
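The idea of deriving Q, K, V from tokens that already carry structural priors can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the 5-joint skeleton, the symmetric normalization, the helper names (`normalized_adjacency`, `matmul`, `prior_qkv`) and the identity projections are all assumptions chosen for illustration, and the paper's actual topology construction may differ.

```python
import math

# Hypothetical 5-joint skeleton (an assumption, not a dataset's real joint set):
# 0 = pelvis, 1 = spine, 2 = head, 3 = left hip, 4 = right hip.
BONES = [(0, 1), (1, 2), (0, 3), (0, 4)]


def normalized_adjacency(num_nodes, edges):
    """Symmetric adjacency with self-loops, scaled as D^-1/2 (A + I) D^-1/2."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_nodes)]
           for i in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def matmul(a, b):
    """Plain-Python matrix product, sufficient for this toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def prior_qkv(tokens, topology, w_q, w_k, w_v):
    """Mix tokens along the prior topology, then apply the usual linear maps."""
    mixed = matmul(topology, tokens)
    return matmul(mixed, w_q), matmul(mixed, w_k), matmul(mixed, w_v)


# Kinematic topology: joints linked by bones within a single frame.
kinematic_topology = normalized_adjacency(5, BONES)

# Trajectory topology: one joint linked to itself in neighbouring frames.
NUM_FRAMES = 4
trajectory_topology = normalized_adjacency(
    NUM_FRAMES, [(t, t + 1) for t in range(NUM_FRAMES - 1)])

# Toy 2-D joint tokens and identity projections, so Q shows the mixing alone.
joint_tokens = [[float(j), 1.0] for j in range(5)]
identity = [[1.0, 0.0], [0.0, 1.0]]
q, k, v = prior_qkv(joint_tokens, kinematic_topology, identity, identity, identity)
```

In this sketch the spatial case mixes the joints of one frame with `kinematic_topology`, and the temporal case would apply `trajectory_topology` to one joint's tokens across frames in the same way. Standard multi-head self-attention would then run on the resulting Q, K, V unchanged.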

Code & Models

Repositories

JihuaPeng/KTPFormer (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Gait Recognition and Analysis · Hand Gesture Recognition Systems

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Absolute Position Encodings · Softmax · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam