Integrating Language Guidance into Vision-based Deep Metric Learning

Karsten Roth; Oriol Vinyals; Zeynep Akata

arXiv:2203.08543 · cs.CV · March 17, 2022

TL;DR

This paper introduces a language guidance objective for deep metric learning. It uses language embeddings of class names to improve the semantic consistency and generalization of visual similarity spaces, and reports state-of-the-art results.

Contribution

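The paper proposes a language guidance approach that incorporates language embeddings of expert- and pseudo-classnames into deep metric learning, so that the learned visual embedding space better reflects semantic relations between classes and transfers better to classes not seen during training.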

Findings

01 Significant improvements across all benchmarks.

02 Model-agnostic approach, effective for various DML methods.

03 State-of-the-art performance on multiple datasets.

Abstract

Deep Metric Learning (DML) proposes to learn metric spaces which encode semantic similarities as embedding space distances. These spaces should be transferable to classes beyond those seen during training. Commonly, DML methods task networks to solve contrastive ranking tasks defined over binary class assignments. However, such approaches ignore higher-level semantic relations between the actual classes. This causes learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relation between classes, impacting the generalizability of the learned metric space. To tackle this issue, we propose a language guidance objective for visual similarity learning. Leveraging language embeddings of expert- and pseudo-classnames, we contextualize and realign visual representation spaces corresponding to meaningful language semantics for better semantic consistency.…
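The idea of realigning a visual similarity space with language semantics can be illustrated with a small sketch. The code below is an assumption-laden illustration, not the authors' implementation: it assumes the guidance term is a row-wise KL divergence between softmax-normalized cosine similarities of language embeddings (target) and visual embeddings, and the function names `cosine_similarity_matrix`, `softmax`, and `language_guidance_loss` as well as the `temperature` parameter are hypothetical. It uses only the Python standard library so it can be run without any framework.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarities between the rows of a list of vectors."""
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # guard against zero vectors
        normed.append([x / norm for x in vec])
    return [[sum(a * b for a, b in zip(u, w)) for w in normed] for u in normed]


def softmax(row, temperature=1.0):
    """Numerically stable softmax over one row of similarities."""
    peak = max(row)
    exps = [math.exp((x - peak) / temperature) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def language_guidance_loss(visual_embeddings, language_embeddings, temperature=1.0):
    """Mean row-wise KL(language || visual) between similarity distributions.

    Assumption: row i of both inputs refers to the same sample, where the
    language row is the embedding of that sample's class name.
    """
    vis_sim = cosine_similarity_matrix(visual_embeddings)
    lang_sim = cosine_similarity_matrix(language_embeddings)
    total = 0.0
    for vis_row, lang_row in zip(vis_sim, lang_sim):
        target = softmax(lang_row, temperature)    # language-derived relations
        current = softmax(vis_row, temperature)    # relations in the visual space
        total += sum(p * math.log(p / q) for p, q in zip(target, current))
    return total / len(vis_sim)


if __name__ == "__main__":
    # Toy data: three samples with 2-D embeddings (values are made up).
    visual = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    language = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(language_guidance_loss(visual, language))
```

In this sketch the loss is zero when the visual similarities already match the language similarities and grows as the two disagree. In practice such a term would presumably be added to a standard contrastive ranking loss rather than used alone, which is consistent with the model-agnostic framing above.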

Code & Models

Repositories

explainableml/languageguidance_for_dml
PyTorch · Official
