Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval
Peng Xu, Qiyue Yin, Yongye Huang, Yi-Zhe Song, Zhanyu Ma, Liang Wang, Tao Xiang, W. Bastiaan Kleijn, Jun Guo

TL;DR
This paper evaluates cross-modal subspace learning methods for fine-grained sketch-based image retrieval, demonstrating their effectiveness in bridging the domain gap between sketches and photos through extensive benchmarking.
Contribution
It introduces and compares a series of state-of-the-art cross-modal subspace learning methods for fine-grained SBIR and benchmarks them on two recently released fine-grained SBIR datasets.
Findings
Subspace learning effectively models the sketch-photo domain gap.
Cross-modal methods outperform traditional SBIR approaches.
Benchmark results guide future research directions.
Abstract
Sketch-based image retrieval (SBIR) is challenging due to the inherent domain gap between sketch and photo. Compared with the pixel-perfect depictions in photos, sketches are highly abstract, iconic renderings of the real world. Therefore, matching sketch and photo directly using low-level visual cues is insufficient, since a common low-level subspace that traverses semantically across the two modalities is non-trivial to establish. Most existing SBIR studies do not directly tackle this cross-modal problem. This naturally motivates us to explore the effectiveness of cross-modal retrieval methods in SBIR, which have been applied successfully in image-text matching. In this paper, we introduce and compare a series of state-of-the-art cross-modal subspace learning methods and benchmark them on two recently released fine-grained SBIR datasets. Through thorough examination of the…
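To make the idea of a shared subspace concrete, here is a minimal sketch using canonical correlation analysis (CCA), a classic cross-modal subspace method. This excerpt does not list which methods the paper benchmarks, so CCA is only an assumed representative, and the features below are synthetic stand-ins rather than real sketch or photo descriptors. All names (`fit_cca`, `project`, `rank_photos`) are hypothetical.

```python
import numpy as np

def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs / np.sqrt(vals)) @ vecs.T

def fit_cca(X, Y, k, reg=1e-3):
    """Learn projections Wx, Wy that map paired rows of X and Y into a
    k-dimensional subspace where the two views are maximally correlated."""
    mean_x, mean_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mean_x, Y - mean_y
    n = X.shape[0]
    # Regularised covariances keep the whitening step well conditioned.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Sx, Sy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # SVD of the whitened cross-covariance gives the canonical directions.
    U, _, Vt = np.linalg.svd(Sx @ Cxy @ Sy)
    return mean_x, mean_y, Sx @ U[:, :k], Sy @ Vt.T[:, :k]

def project(F, mean, W):
    """Centre features with the training mean and map them into the subspace."""
    return (F - mean) @ W

def rank_photos(sketch_emb, photo_emb):
    """For each sketch, return photo indices sorted by Euclidean distance."""
    diff = sketch_emb[:, None, :] - photo_emb[None, :, :]
    return np.argsort(np.linalg.norm(diff, axis=2), axis=1)

# Synthetic paired data (an assumption for illustration): both "modalities"
# are noisy linear views of the same 4-dimensional latent content.
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 4))
sketch = latent @ rng.normal(size=(4, 10)) + 0.05 * rng.normal(size=(400, 10))
photo = latent @ rng.normal(size=(4, 12)) + 0.05 * rng.normal(size=(400, 12))

train, test = slice(0, 300), slice(300, 400)
mean_s, mean_p, W_s, W_p = fit_cca(sketch[train], photo[train], k=4)

# Retrieval on held-out pairs: sketch i should rank photo i first.
ranks = rank_photos(project(sketch[test], mean_s, W_s),
                    project(photo[test], mean_p, W_p))
acc = float(np.mean(ranks[:, 0] == np.arange(100)))
print(f"top-1 retrieval accuracy: {acc:.2f}")
```

Sketch and photo features live in spaces of different dimensionality here (10 and 12), so they cannot be compared directly; after projection both are 4-dimensional and a plain nearest-neighbour search does the retrieval. Real fine-grained SBIR features would be far less cleanly related than this toy linear model.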
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
