Adaptive additive classification-based loss for deep metric learning
Istvan Fehervari, Ives Macedo

TL;DR
This paper introduces an adaptive additive loss for classification-based deep metric learning that assigns each sample a separate margin per negative proxy. The margins are derived from class distances in an additional modality, which improves retrieval performance and speeds up convergence.
Contribution
It extends classification-based deep metric learning with a novel adaptive margin mechanism that incorporates cross-modal distance information for enhanced accuracy and efficiency.
Findings
Sets a new state of the art on the Amazon fashion retrieval dataset and the public DeepFashion dataset.
Faster convergence and lower complexity compared to previous methods.
Effective use of textual modalities with fastText and BERT embeddings.
Abstract
Recent works have shown that deep metric learning algorithms can benefit from weak supervision from another input modality. This additional modality can be incorporated directly into the popular triplet-based loss function as distances. Also recently, classification loss and proxy-based metric learning have been observed to lead to faster convergence as well as better retrieval results, all the while without requiring complex and costly sampling strategies. In this paper we propose an extension to the existing adaptive margin for classification-based deep metric learning. Our extension introduces a separate margin for each negative proxy per sample. These margins are computed during training from precomputed distances of the classes in the other modality. Our results set a new state-of-the-art on both the Amazon fashion retrieval dataset and the public DeepFashion dataset.…
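The per-negative-proxy margin described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes a normalized-softmax (cosine) classification loss in which each negative logit is raised by a margin proportional to the precomputed distance between the sample's class and that negative class in the other modality. The function name `adaptive_additive_loss` and the parameters `scale` and `margin_weight` are hypothetical.

```python
import numpy as np

def adaptive_additive_loss(embeddings, labels, proxies, class_dist,
                           scale=16.0, margin_weight=0.1):
    """Proxy-based softmax loss with one additive margin per negative proxy.

    ASSUMPTION: the margin for sample i and negative class j is
    margin_weight * class_dist[labels[i], j]; the paper's exact form may differ.
    """
    # Compare samples and proxies by cosine similarity.
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    cos = x @ p.T                                   # shape (batch, classes)

    # Look up one margin per (sample, class) from the precomputed
    # cross-modal class distances; the positive class gets no margin.
    rows = np.arange(len(labels))
    margins = margin_weight * class_dist[labels]    # new array, input untouched
    margins[rows, labels] = 0.0

    # Raising the negative logits makes the task harder, more so for
    # classes that are far apart in the other modality.
    logits = scale * (cos + margins)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()

# Toy usage with random data and a hand-written class distance matrix.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))          # 4 samples, 8-dim embeddings
prox = rng.normal(size=(3, 8))         # 3 class proxies
y = np.array([0, 1, 2, 0])
dist = np.array([[0.0, 1.0, 2.0],
                 [1.0, 0.0, 1.0],
                 [2.0, 1.0, 0.0]])     # stand-in for text-embedding distances

loss_plain = adaptive_additive_loss(emb, y, prox, np.zeros((3, 3)))
loss_margin = adaptive_additive_loss(emb, y, prox, dist)
```

With an all-zero distance matrix the sketch reduces to a plain normalized-softmax loss, which makes the effect of the margins easy to isolate. In practice the distance matrix would be precomputed once from class-level embeddings in the second modality (for example text embeddings) rather than written by hand.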
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques
