TL;DR
This paper introduces a divide-and-conquer approach to deep metric learning that hierarchically splits both the data and the embedding space, improving generalization and reporting state-of-the-art results on image retrieval and clustering tasks.
Contribution
It proposes a hierarchical splitting method for embedding spaces and data subsets, enhancing the expressiveness and generalization of deep metric learning models.
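To make the splitting idea concrete, here is a minimal sketch of one level of such a split. It is not the authors' implementation. It assumes that the data is partitioned with plain k-means on the current embeddings and that the embedding dimensions are divided into equal contiguous slices, one per cluster; neither choice is stated in this summary. The names `kmeans` and `split_data_and_embedding` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice).
    Returns one cluster index per row of x."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points.
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Squared distance from every point to every center: shape (n, k).
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return assign

def split_data_and_embedding(embeddings, k):
    """Pair each data cluster with its own slice of embedding dimensions.
    Returns a list of (sample_indices, dimension_indices) tuples."""
    _, dim = embeddings.shape
    assign = kmeans(embeddings, k)
    dim_slices = np.array_split(np.arange(dim), k)
    return [(np.flatnonzero(assign == j), dim_slices[j]) for j in range(k)]

# Toy usage: 100 random 8-dimensional embeddings split four ways.
emb = np.random.default_rng(1).normal(size=(100, 8))
for samples, dims in split_data_and_embedding(emb, k=4):
    print(len(samples), "samples ->", "dims", dims.tolist())
```

In a full pipeline, each (subset, slice) pair would be trained with an existing DML loss on only those samples and those dimensions. Applying the same split again inside each subset would be one way to obtain a hierarchy, though that recursion is likewise an assumption of this sketch.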
Findings
Significant improvements on multiple datasets.
Enhanced generalization to unseen categories.
Versatile wrapper for existing DML methods.
Abstract
Deep metric learning (DML) is a cornerstone of many computer vision applications. It aims at learning a mapping from the input domain to an embedding space, where semantically similar objects are located nearby and dissimilar objects far from one another. The target similarity on the training data is defined by the user in the form of ground-truth class labels. However, while the embedding space learns to mimic the user-provided similarity on the training data, it should also generalize to novel categories not seen during training. Besides user-provided ground-truth training labels, many additional visual factors (such as viewpoint changes or shape peculiarities) exist and imply different notions of similarity between objects, affecting generalization to images unseen during training. However, existing approaches usually directly learn a single embedding space on all available training…
