Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval

Peng Xu; Qiyue Yin; Yongye Huang; Yi-Zhe Song; Zhanyu Ma; Liang Wang; Tao Xiang; W. Bastiaan Kleijn; Jun Guo

arXiv:1705.09888·cs.CV·May 30, 2017·6 cites

TL;DR

This paper evaluates cross-modal subspace learning methods for fine-grained sketch-based image retrieval, demonstrating their effectiveness in bridging the domain gap between sketches and photos through extensive benchmarking.

Contribution

It introduces and compares state-of-the-art cross-modal subspace learning techniques for fine-grained SBIR and benchmarks them on two recently released fine-grained SBIR datasets.

Findings

01

Subspace learning effectively models the sketch-photo domain gap.

02

Cross-modal methods outperform traditional SBIR approaches.

03

Benchmark results guide future research directions.

Abstract

Sketch-based image retrieval (SBIR) is challenging due to the inherent domain gap between sketch and photo. Compared with the pixel-perfect depictions in photos, sketches are highly abstract, iconic renderings of the real world. Therefore, matching sketch and photo directly using low-level visual cues is insufficient, since a common low-level subspace that traverses semantically across the two modalities is non-trivial to establish. Most existing SBIR studies do not directly tackle this cross-modal problem. This naturally motivates us to explore the effectiveness of cross-modal retrieval methods in SBIR, which have been applied successfully in image-text matching. In this paper, we introduce and compare a series of state-of-the-art cross-modal subspace learning methods and benchmark them on two recently released fine-grained SBIR datasets. Through thorough examination of the…
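
To make the idea of a shared subspace concrete, the following is a minimal sketch of cross-modal retrieval with regularized linear canonical correlation analysis (CCA). CCA is chosen here only as a familiar example of subspace learning; the abstract above does not say which methods the paper benchmarks, so this should not be read as the authors' method. The function names `fit_cca` and `rank_photos`, the regularization value, and the synthetic "sketch" and "photo" feature vectors are all assumptions made for illustration.

```python
import numpy as np


def fit_cca(X, Y, dim, reg=1e-3):
    """Fit regularized linear CCA on paired rows of X (sketches) and Y (photos).

    Returns (mean_x, mean_y, Wx, Wy, corr), where Wx and Wy project each
    modality into a shared `dim`-dimensional subspace.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized within-modality covariances and the cross-covariance.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten with Cholesky factors: T = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    # Singular vectors of T give the canonical directions in whitened space.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :dim])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :dim])
    return mx, my, Wx, Wy, s[:dim]


def rank_photos(sketch_feats, photo_feats, model):
    """For each sketch, return photo indices sorted by cosine similarity
    in the shared subspace (best match first)."""
    mx, my, Wx, Wy, _ = model
    S = (sketch_feats - mx) @ Wx
    P = (photo_feats - my) @ Wy
    S = S / (np.linalg.norm(S, axis=1, keepdims=True) + 1e-12)
    P = P / (np.linalg.norm(P, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(S @ P.T), axis=1)


def demo(seed=0):
    """Synthetic stand-in for paired sketch/photo features (not real data)."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(450, 5))           # shared "content" per pair
    to_sketch = rng.normal(size=(5, 20))         # hypothetical sketch view
    to_photo = rng.normal(size=(5, 30))          # hypothetical photo view
    sketches = latent @ to_sketch + 0.05 * rng.normal(size=(450, 20))
    photos = latent @ to_photo + 0.05 * rng.normal(size=(450, 30))
    model = fit_cca(sketches[:400], photos[:400], dim=5)
    ranks = rank_photos(sketches[400:], photos[400:], model)
    # acc@1: fraction of held-out sketches whose true photo is ranked first.
    return float(np.mean(ranks[:, 0] == np.arange(50)))


if __name__ == "__main__":
    print("acc@1 on synthetic pairs:", demo())
```

In a real SBIR setting the rows of `X` and `Y` would be feature descriptors extracted from paired sketches and photos; the synthetic data here only demonstrates that matching happens after both modalities are projected into the common space rather than on the raw features.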

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques