Deep Sketch Hashing: Fast Free-hand Sketch-Based Image Retrieval
Li Liu, Fumin Shen, Yuming Shen, Xianglong Liu, and Ling Shao

TL;DR
This paper introduces Deep Sketch Hashing, a novel deep learning-based binary coding method that significantly accelerates free-hand sketch-based image retrieval by capturing cross-view similarities and semantic correlations with reduced computation.
Contribution
The paper presents the first end-to-end deep hashing framework specifically designed for category-level sketch-based image retrieval, incorporating auxiliary sketch-tokens to handle geometric distortions.
Findings
DSH achieves superior accuracy over state-of-the-art methods.
Replacing continuous-valued distance computation with compact binary codes significantly reduces retrieval time and memory usage.
Evaluations on large-scale datasets demonstrate its effectiveness.
Abstract
Free-hand sketch-based image retrieval (SBIR) is a specific cross-view retrieval task, in which queries are abstract and ambiguous sketches while the retrieval database is formed with natural images. Work in this area mainly focuses on extracting representative and shared features for sketches and natural images. However, these can neither cope well with the geometric distortion between sketches and images nor be feasible for large-scale SBIR due to the heavy continuous-valued distance computation. In this paper, we speed up SBIR by introducing a novel binary coding method, named Deep Sketch Hashing (DSH), where a semi-heterogeneous deep architecture is proposed and incorporated into an end-to-end binary coding framework. Specifically, three convolutional neural networks are utilized to encode free-hand sketches, natural images and, especially, the auxiliary sketch-tokens which…
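To illustrate why binary codes make large-scale retrieval cheap, the following is a minimal sketch of Hamming-distance ranking over packed binary codes. It is not the DSH network or its training objective: the codes here are random bits standing in for learned hash codes, and the names `pack_codes`, `hamming_rank`, the 64-bit code length, and the 10,000-image database are all assumptions chosen for illustration.

```python
import numpy as np

# Lookup table: number of set bits in every possible byte value.
POPCOUNT = np.array([bin(v).count("1") for v in range(256)], dtype=np.uint8)


def pack_codes(bits):
    """Pack an (n, n_bits) array of 0/1 values into uint8 bytes (8 bits per byte)."""
    return np.packbits(np.asarray(bits, dtype=np.uint8), axis=1)


def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to one packed query code.

    Returns (order, dists): indices sorted by ascending distance, and the
    distance from the query to every database item.
    """
    xor = np.bitwise_xor(db_codes, query_code)         # differing bits, byte by byte
    dists = POPCOUNT[xor].sum(axis=1, dtype=np.int64)  # count differing bits per item
    order = np.argsort(dists, kind="stable")
    return order, dists


# Hypothetical data: random bits stand in for hash codes of natural images.
rng = np.random.default_rng(0)
n_images, n_bits = 10_000, 64
image_bits = rng.integers(0, 2, size=(n_images, n_bits))
db_codes = pack_codes(image_bits)

# Hypothetical sketch query: the code of image 123 with its first 3 bits flipped.
query_bits = image_bits[123].copy()
query_bits[:3] ^= 1
query_code = pack_codes(query_bits[None, :])[0]

order, dists = hamming_rank(query_code, db_codes)
```

Under these assumptions each image occupies 8 bytes, so the whole 10,000-image database fits in 80,000 bytes, and ranking needs only XOR and bit-count operations rather than floating-point distance computation over high-dimensional features.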
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Multimodal Machine Learning Applications
