Adapt and Align to Improve Zero-Shot Sketch-Based Image Retrieval

Shiyin Dong; Mingrui Zhu; Nannan Wang; Xinbo Gao

arXiv:2305.05144 · cs.CV · August 10, 2023 · 1 citation

Open Access

TL;DR

This paper introduces an 'Adapt and Align' method for zero-shot sketch-based image retrieval, using lightweight domain adapters and semantic alignment with text embeddings to improve cross-domain and unseen class retrieval performance.

Contribution

The paper proposes a novel 'Adapt and Align' approach that incorporates lightweight domain adapters and explicit semantic alignment to enhance zero-shot sketch-based image retrieval.
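To make the phrase "lightweight domain adapter" concrete, here is a minimal sketch of one common adapter design: a residual bottleneck applied to backbone features. This page does not describe the paper's actual adapter architecture, so the bottleneck shape, the ReLU, the zero initialisation, and every name and size below (`init_adapter`, `adapter_forward`, `dim=768`, `bottleneck=64`) are assumptions made for illustration.

```python
import numpy as np


def init_adapter(dim, bottleneck, rng):
    """Create parameters for a hypothetical residual bottleneck adapter."""
    return {
        "w_down": rng.normal(0.0, 0.02, size=(dim, bottleneck)),
        "b_down": np.zeros(bottleneck),
        # Zero-initialised up-projection (an assumption): the adapter
        # starts as an identity map, so the backbone output is unchanged
        # before any training has happened.
        "w_up": np.zeros((bottleneck, dim)),
        "b_up": np.zeros(dim),
    }


def adapter_forward(x, params):
    """Down-project, apply ReLU, up-project, then add the residual."""
    hidden = np.maximum(x @ params["w_down"] + params["b_down"], 0.0)
    return x + hidden @ params["w_up"] + params["b_up"]


rng = np.random.default_rng(0)
dim, bottleneck = 768, 64  # illustrative sizes, not taken from the paper
params = init_adapter(dim, bottleneck, rng)

features = rng.normal(size=(4, dim))  # stand-in for backbone features
adapted = adapter_forward(features, params)
n_adapter_params = sum(p.size for p in params.values())
```

With these illustrative sizes the adapter adds about 99k parameters per insertion point, which is small next to a typical vision backbone; that is the sense in which such a module is "lightweight".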

Findings

01. Outperforms previous methods on three benchmark datasets.

02. Improves cross-domain representation capabilities.

03. Demonstrates flexibility with different backbones.

Abstract

Zero-shot sketch-based image retrieval (ZS-SBIR) is challenging due to the cross-domain nature of sketches and photos, as well as the semantic gap between seen and unseen image distributions. Previous methods fine-tune pre-trained models with various side information and learning strategies to learn a compact feature space that is shared between the sketch and photo domains and bridges seen and unseen classes. However, these efforts are inadequate in adapting domains and transferring knowledge from seen to unseen classes. In this paper, we present an effective 'Adapt and Align' approach to address the key challenges. Specifically, we insert simple and lightweight domain adapters to learn new abstract concepts of the sketch domain and improve cross-domain representation capabilities. Inspired by recent advances in image-text foundation models (e.g., CLIP) on zero-shot scenarios, we…
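The abstract is cut off before it describes the alignment step, so the following is only a rough sketch of what aligning visual features with class text embeddings could look like, together with the retrieval step that ZS-SBIR evaluates. The CLIP-style cross-entropy over cosine similarities, the temperature of 0.07, the toy embeddings, and all function names are assumptions, not the paper's stated formulation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def alignment_loss(visual, text, labels, temperature=0.07):
    """Cross-entropy over cosine similarities to class text embeddings.

    visual: (n, d) sketch or photo features
    text:   (c, d) one text embedding per class
    labels: (n,)   class index of each visual feature
    """
    logits = l2_normalize(visual) @ l2_normalize(text).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def retrieve(sketch, photos):
    """Return gallery indices sorted by decreasing cosine similarity."""
    sims = l2_normalize(photos) @ l2_normalize(sketch)
    return np.argsort(-sims)


# Toy data: three classes with orthogonal "text embeddings" in 8 dimensions.
text = np.eye(3, 8)
labels = np.array([0, 1, 2, 0])

aligned = text[labels]               # features that match their class text
misaligned = text[(labels + 1) % 3]  # features that match the wrong class

loss_aligned = alignment_loss(aligned, text, labels)
loss_misaligned = alignment_loss(misaligned, text, labels)

# Retrieval: the gallery holds one photo feature per class, in shuffled order.
gallery = text[[2, 0, 1]]
ranking = retrieve(text[0], gallery)
```

In this toy setup the loss is near zero when features already sit on their class text embedding and large when they sit on another class's, and the class-0 query ranks the class-0 gallery item (index 1) first. Once sketches and photos are both pulled toward a shared text space, retrieval reduces to the nearest-neighbour search in `retrieve`.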

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: ALIGN