LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity
Tejan Karmali, Abhinav Atrishi, Sai Sree Harsha, Susmit Agrawal, Varun Jampani, R. Venkatesh Babu

TL;DR
LEAD is a self-supervised method that discovers landmarks in images by aligning feature similarity distributions, enhancing dense equivariant representations to improve landmark detection with limited annotations.
Contribution
The paper introduces a two-stage training approach: a network is first trained with the instance-level BYOL objective, and is then refined to learn dense equivariant feature representations for landmark detection.
Findings
Improves landmark detection accuracy with fewer annotations.
Enhances generalization across scale variations.
Leverages dense equivariant features for better unsupervised landmark discovery.
Abstract
In this work, we introduce LEAD, an approach to discover landmarks from an unannotated collection of category-specific images. Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image, which are further used to learn landmarks in a semi-supervised manner. While there have been advances in self-supervised learning of image features for instance-level tasks like classification, these methods do not ensure dense equivariant representations. The property of equivariance is of interest for dense prediction tasks like landmark estimation. In this work, we introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion. We follow a two-stage training approach: first, we train a network using the BYOL objective which operates at an instance level. The correspondences…
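The two ingredients named above can be illustrated with a minimal sketch. The first function is the standard BYOL regression loss between an online prediction and a target projection. The remaining functions are only one plausible reading of "aligning distributions of feature similarity": the abstract is truncated before it describes the second stage, so the softmax-over-cosine-similarities form, the cross-entropy alignment, the temperature value, and all function names here are assumptions for illustration, not the authors' implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def byol_regression_loss(online_prediction, target_projection):
    """BYOL loss for one pair of views: 2 - 2 * cosine similarity."""
    p = l2_normalize(online_prediction)
    z = l2_normalize(target_projection)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(p, z))


def similarity_distribution(query, keys, temperature=0.1):
    """Softmax over cosine similarities between one feature and a set of features.

    Assumption: `query` is a single pixel-level feature and `keys` are the
    pixel-level features it is compared against. The temperature is hypothetical.
    """
    q = l2_normalize(query)
    logits = [
        sum(a * b for a, b in zip(q, l2_normalize(k))) / temperature for k in keys
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_dist, predicted_dist, eps=1e-12):
    """Cross-entropy between two discrete distributions (assumed alignment loss)."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_dist, predicted_dist))


if __name__ == "__main__":
    # Toy 2-D features standing in for three pixel locations.
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    target = similarity_distribution([1.0, 0.1], keys)   # e.g. stage-one features
    student = similarity_distribution([0.9, 0.2], keys)  # e.g. dense features
    print("BYOL loss:", byol_regression_loss([1.0, 0.0], [2.0, 0.0]))
    print("alignment loss:", cross_entropy(target, student))
```

In this sketch the loss is smallest when the two similarity distributions agree, which is the sense in which one set of features is "aligned" to the other.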
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
Methods: Bootstrap Your Own Latent
