A Graph-Based Approach for Category-Agnostic Pose Estimation
Or Hirschorn, Shai Avidan

TL;DR
This paper introduces a graph-based, category-agnostic pose estimation method that leverages keypoint relationships to improve accuracy across diverse object categories with minimal training data.
Contribution
It presents a novel graph-based approach for category-agnostic pose estimation, outperforming existing methods on the MP-100 benchmark with minimal support images.
Findings
Achieves a 0.98% performance boost in the 1-shot setting
Sets new state-of-the-art results for CAPE
Validates approach on a large, diverse dataset
Abstract
Traditional 2D pose estimation models are limited by their category-specific design, making them suitable only for predefined object categories. This restriction becomes particularly challenging when dealing with novel objects due to the lack of relevant training data. To address this limitation, category-agnostic pose estimation (CAPE) was introduced. CAPE aims to enable keypoint localization for arbitrary object categories using a few-shot single model, requiring minimal support images with annotated keypoints. We present a significant departure from conventional CAPE techniques, which treat keypoints as isolated entities, by treating the input pose data as a graph. We leverage the inherent geometrical relations between keypoints through a graph-based network to break symmetry, preserve structure, and better handle occlusions. We validate our approach on the MP-100 benchmark, a…
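To make the graph view of keypoints concrete, the sketch below treats keypoints as nodes and skeleton links as edges, then applies a single graph-convolution step so that each keypoint's feature is mixed with those of its neighbors. It is a minimal illustration, not the authors' architecture: the function names, the 4-keypoint chain, the feature size, and the symmetric normalization are all assumptions chosen for the example.

```python
import numpy as np


def normalized_adjacency(num_keypoints, edges):
    """Build D^-1/2 (A + I) D^-1/2 from an undirected skeleton edge list."""
    adj = np.eye(num_keypoints)  # self-loops keep each keypoint's own feature
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(features, adj_norm, weight):
    """One graph-convolution step: aggregate neighbors, project, apply ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


# Hypothetical skeleton: a chain of 4 keypoints, 0-1-2-3 (assumption).
edges = [(0, 1), (1, 2), (2, 3)]
adj_norm = normalized_adjacency(4, edges)

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))  # one 8-dim feature per keypoint (assumption)
weight = rng.normal(size=(8, 8))    # stand-in for a learned projection

out = graph_conv(features, adj_norm, weight)
print(out.shape)
```

Because the adjacency matrix is zero for keypoints that are not linked, information only flows along skeleton edges in a single step. This is one simple way a network can use structural relations between keypoints instead of treating them as isolated entities.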
Taxonomy
Topics: Robot Manipulation and Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
Methods: Multi-Head Attention · Laplacian EigenMap · Laplacian Positional Encodings · Dense Connections · Dropout · Byte Pair Encoding · Softmax · Layer Normalization · Linear Layer · Position-Wise Feed-Forward Layer
