Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression
Zigang Geng, Ke Sun, Bin Xiao, Zhaoxiang Zhang, Jingdong Wang

TL;DR
This paper introduces DEKR, a bottom-up human pose estimation method that regresses each keypoint in its own branch using adaptive convolutions, and reports higher accuracy than keypoint detection and grouping approaches.
Contribution
The paper proposes a simple, effective disentangled keypoint regression framework: each keypoint is regressed from its own representation, learned with adaptive convolutions that focus on that keypoint's region, which improves the spatial accuracy of bottom-up pose estimation.
Findings
Outperforms keypoint detection and grouping methods on COCO and CrowdPose datasets.
Achieves state-of-the-art results in bottom-up human pose estimation.
Demonstrates the effectiveness of disentangled representations for keypoint regression.
Abstract
In this paper, we are interested in the bottom-up paradigm of estimating human poses from an image. We study the dense keypoint regression framework, which was previously inferior to the keypoint detection and grouping framework. Our motivation is that regressing keypoint positions accurately needs to learn representations that focus on the keypoint regions. We present a simple yet effective approach, named disentangled keypoint regression (DEKR). We adopt adaptive convolutions through pixel-wise spatial transformer to activate the pixels in the keypoint regions and accordingly learn representations from them. We use a multi-branch structure for separate regression: each branch learns a representation with dedicated adaptive convolutions and regresses one keypoint. The resulting disentangled representations are able to attend to the keypoint regions, respectively, and thus the keypoint…
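The multi-branch idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the channel partitioning, and the shapes are assumptions made for illustration, and a plain 1x1 linear map stands in for the adaptive convolutions, which are not reproduced here. Each branch reads only its own slice of the feature channels and regresses a 2D offset map for one keypoint; a pose is then read out at a candidate center pixel by adding the offsets to that pixel's coordinates.

```python
import numpy as np

def disentangled_regression(features, weights):
    """Regress one offset map per keypoint from separate channel groups.

    features: array of shape (K*C, H, W), split into K groups of C channels.
    weights:  list of K arrays of shape (2, C); a 1x1 linear map per branch
              (a stand-in for the adaptive convolutions, assumed here).
    Returns offsets of shape (K, 2, H, W), ordered (dx, dy).
    """
    K = len(weights)
    C = features.shape[0] // K
    offsets = []
    for k, w in enumerate(weights):
        branch_feat = features[k * C:(k + 1) * C]  # channels of branch k only
        offsets.append(np.einsum("oc,chw->ohw", w, branch_feat))
    return np.stack(offsets)

def decode_pose(offsets, center):
    """Read out K keypoints for a person centered at pixel (x, y)."""
    cx, cy = center
    return np.array([cx, cy], dtype=float) + offsets[:, :, cy, cx]

# Hypothetical sizes: 17 keypoints (as in COCO), 4 channels per branch.
rng = np.random.default_rng(0)
K, C, H, W = 17, 4, 8, 8
features = rng.normal(size=(K * C, H, W))
weights = [rng.normal(size=(2, C)) for _ in range(K)]

offsets = disentangled_regression(features, weights)
pose = decode_pose(offsets, center=(3, 5))  # one (x, y) row per keypoint
```

Because each branch sees a disjoint channel group, changing the features of one branch leaves the other keypoints' offsets untouched, which is the sense of "disentangled" this sketch tries to convey.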
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging
Methods: Spatial Transformer
