Keypoint-Aligned Embeddings for Image Retrieval and Re-identification
Olga Moskvyak, Frederic Maire, Feras Dayoub, Mahsa Baktashmotlagh

TL;DR
This paper introduces KAE-Net, an embedding method that aligns image features with keypoints to improve pose invariance in image retrieval and re-identification, achieving state-of-the-art results on CUB-200-2011, Cars196, and VeRi-776.
Contribution
The paper presents KAE-Net, a compact, generic model that learns part-level features guided by keypoints, enhancing pose invariance in retrieval tasks.
Findings
Achieves state-of-the-art performance on the CUB-200-2011, Cars196, and VeRi-776 datasets.
Effectively learns part-level features via multi-task learning with keypoint guidance.
Improves robustness to pose variations in re-identification tasks.
Abstract
Learning embeddings that are invariant to the pose of the object is crucial in visual image retrieval and re-identification. Existing approaches for person, vehicle, or animal re-identification tasks suffer from high intra-class variance due to deformable shapes and different camera viewpoints. To overcome this limitation, we propose to align the image embedding with a predefined order of the keypoints. The proposed keypoint-aligned embeddings model (KAE-Net) learns part-level features via multi-task learning, which is guided by keypoint locations. More specifically, KAE-Net extracts channels from a feature map activated by a specific keypoint through learning the auxiliary task of heatmap reconstruction for this keypoint. KAE-Net is compact, generic, and conceptually simple. It achieves state-of-the-art performance on the benchmark datasets of CUB-200-2011, Cars196, and VeRi-776…
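The channel-to-keypoint alignment described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation; it only assumes that a (C, H, W) feature map is split into equal channel groups, one per keypoint in a fixed order, and that each group is scored against a target heatmap. The Gaussian target, the channel-mean "decoder", the mean-squared-error loss, and all function names are assumptions introduced for illustration.

```python
import numpy as np


def gaussian_heatmap(height, width, center_xy, sigma=1.5):
    """Assumed target: a 2D Gaussian peaked at the keypoint (x, y)."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def split_channels_by_keypoint(feature_map, num_keypoints):
    """Split a (C, H, W) feature map into one equal channel group per keypoint.

    Group k always belongs to keypoint k, so the keypoint order fixes the
    layout of the final embedding.
    """
    channels = feature_map.shape[0]
    if channels % num_keypoints != 0:
        raise ValueError("channel count must be divisible by num_keypoints")
    return np.split(feature_map, num_keypoints, axis=0)


def heatmap_reconstruction_loss(feature_map, keypoints_xy, sigma=1.5):
    """Auxiliary loss: MSE between each group's heatmap and its target."""
    _, height, width = feature_map.shape
    groups = split_channels_by_keypoint(feature_map, len(keypoints_xy))
    losses = []
    for group, center in zip(groups, keypoints_xy):
        # Hypothetical stand-in for a learned decoder: average the channels.
        predicted = group.mean(axis=0)
        target = gaussian_heatmap(height, width, center, sigma)
        losses.append(np.mean((predicted - target) ** 2))
    return float(np.mean(losses))


def keypoint_aligned_embedding(feature_map, num_keypoints):
    """Pool each channel group spatially and concatenate in keypoint order."""
    groups = split_channels_by_keypoint(feature_map, num_keypoints)
    return np.concatenate([g.mean(axis=(1, 2)) for g in groups])


# Usage with random features: 8 channels, 2 keypoints, 16x16 spatial grid.
rng = np.random.default_rng(0)
features = rng.random((8, 16, 16))
keypoints = [(4, 10), (12, 3)]  # (x, y) positions on the feature grid
aux_loss = heatmap_reconstruction_loss(features, keypoints)
embedding = keypoint_aligned_embedding(features, len(keypoints))
```

In a multi-task setup, an auxiliary term like `aux_loss` would typically be added to the main retrieval loss with a weighting factor; the specific losses and weights used by KAE-Net are not stated in this summary.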
Taxonomy
Methods: Heatmap
