Recognition of Freely Selected Keypoints on Human Limbs

Katja Ludwig; Daniel Kienzle; Rainer Lienhart

arXiv:2204.06326·cs.CV·April 14, 2022

Open Access

TL;DR

This paper introduces a Vision Transformer-based method that detects arbitrary keypoints on human limbs, rather than only the fixed keypoints annotated in standard pose estimation datasets. This allows more flexible and detailed human pose analysis.

Contribution

It proposes two novel approaches to encode arbitrary limb keypoints within a Transformer-based architecture, allowing detection without retraining on new keypoints.

Findings

01 Achieves similar accuracy to TokenPose on fixed keypoints

02 Capable of detecting arbitrary limb keypoints

03 Does not require retraining for new keypoints

Abstract

Nearly all Human Pose Estimation (HPE) datasets consist of a fixed set of keypoints. Standard HPE models trained on such datasets can only detect these keypoints. If more points are desired, they have to be manually annotated and the model needs to be retrained. Our approach leverages the Vision Transformer architecture to extend the capability of the model to detect arbitrary keypoints on the limbs of persons. We propose two different approaches to encode the desired keypoints. (1) Each keypoint is defined by its position along the line between the two enclosing keypoints from the fixed set and its relative distance between this line and the edge of the limb. (2) Keypoints are defined as coordinates on a norm pose. Both approaches are based on the TokenPose architecture, while the keypoint tokens that correspond to the fixed keypoints are replaced with our novel module. Experiments…
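The geometry behind encoding (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the signed range of the thickness value, and the `half_width` parameter (the distance in pixels from the line to the limb edge, assumed here to be known in advance) are all hypothetical choices made for illustration.

```python
import math


def decode_limb_keypoint(a, b, t, s, half_width):
    """Map an illustrative (t, s) encoding to image coordinates.

    a, b        -- (x, y) of the two enclosing fixed keypoints
    t           -- position along the line a -> b (0 at a, 1 at b)
    s           -- signed relative distance from the line to the limb edge
                   (0 on the line, +1 / -1 on the edges); an assumed convention
    half_width  -- line-to-edge distance in pixels; assumed to be given
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("enclosing keypoints must be distinct")
    # Unit normal of the line a -> b.
    nx, ny = -dy / length, dx / length
    x = a[0] + t * dx + s * half_width * nx
    y = a[1] + t * dy + s * half_width * ny
    return (x, y)


def encode_limb_keypoint(a, b, p, half_width):
    """Inverse of decode_limb_keypoint: recover (t, s) for a point p."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        raise ValueError("enclosing keypoints must be distinct")
    px, py = p[0] - a[0], p[1] - a[1]
    # Projection of p onto the line, as a fraction of its length.
    t = (px * dx + py * dy) / length_sq
    # Signed perpendicular distance, positive on the side of the normal.
    signed_dist = (dx * py - dy * px) / math.sqrt(length_sq)
    return (t, signed_dist / half_width)
```

With this convention, a keypoint halfway between the two enclosing keypoints and on the limb edge is described by `t = 0.5, s = 1.0`, independent of the limb's length or orientation in the image.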

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Gait Recognition and Analysis

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Dense Connections · Multi-Head Attention · Layer Normalization · Residual Connection · Softmax