Deep Cross-Modal Hashing
Qing-Yuan Jiang, Wu-Jun Li

TL;DR
This paper introduces Deep Cross-Modal Hashing (DCMH), an end-to-end deep learning framework that jointly learns features and hash codes for improved multimedia retrieval performance.
Contribution
It presents a novel deep neural network-based approach that integrates feature learning and hash-code learning into a unified framework for cross-modal hashing.
Findings
DCMH outperforms the baseline methods on two real text-image datasets.
It achieves state-of-the-art accuracy in cross-modal retrieval.
End-to-end training improves feature and hash code quality.
Abstract
Due to its low storage cost and fast query speed, cross-modal hashing (CMH) has been widely used for similarity search in multimedia retrieval applications. However, almost all existing CMH methods are based on hand-crafted features, which might not be optimally compatible with the hash-code learning procedure. As a result, existing CMH methods with hand-crafted features may not achieve satisfactory performance. In this paper, we propose a novel cross-modal hashing method, called deep cross-modal hashing (DCMH), by integrating feature learning and hash-code learning into the same framework. DCMH is an end-to-end learning framework with deep neural networks, one for each modality, to perform feature learning from scratch. Experiments on two real datasets with text-image modalities show that DCMH can outperform other baselines to achieve the state-of-the-art performance in cross-modal…
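The low storage cost and fast query speed mentioned in the abstract come from comparing short binary codes instead of real-valued features. The following is a minimal sketch of that retrieval step only, not of DCMH's training procedure. It assumes each modality's network has already produced a real-valued vector, that codes are obtained by taking the sign of each entry (an assumption, since the abstract does not specify the binarization), and that items are ranked by Hamming distance. The function names and the example vectors are hypothetical.

```python
from typing import List, Sequence


def sign_to_bits(real_values: Sequence[float]) -> int:
    """Binarize a real-valued vector by sign and pack the bits into one int.

    Assumption: non-negative entries map to bit 1, negative entries to bit 0.
    """
    code = 0
    for value in real_values:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions in which two packed codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code: int, database_codes: Sequence[int]) -> List[int]:
    """Return database indices sorted by increasing Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )


# Hypothetical network outputs: a text query and three database images.
text_query_output = [0.9, -0.2, 0.4, -0.7]
image_outputs = [
    [-0.5, 0.3, -0.1, 0.8],   # opposite sign in every position
    [0.6, -0.4, 0.2, -0.9],   # same signs as the query
    [0.7, -0.1, -0.3, -0.2],  # differs from the query in one position
]

query_code = sign_to_bits(text_query_output)
database_codes = [sign_to_bits(v) for v in image_outputs]
ranking = rank_by_hamming(query_code, database_codes)
print(ranking)  # [1, 2, 0]
```

Each item here occupies 4 bits, and a comparison is one XOR plus a bit count, which is why hashing-based retrieval scales to large databases. Real systems use longer codes, but the ranking logic stays the same.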
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization · Video Surveillance and Tracking Methods
