Multimodal similarity-preserving hashing
Jonathan Masci, Michael M. Bronstein, Alexander A. Bronstein, and Jürgen Schmidhuber

TL;DR
This paper presents a novel multimodal hashing framework using a coupled siamese neural network that effectively learns intra- and inter-modality similarities, outperforming existing methods in multimedia retrieval.
Contribution
It introduces a flexible, neural network-based hashing approach capable of complex representations for multimodal data, surpassing prior linear projection methods.
Findings
Significantly outperforms state-of-the-art hashing methods
Effective for multimedia retrieval tasks
Supports complex, non-linear hashing functions
Abstract
We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches, our hashing functions are not limited to binarized linear projections and can assume arbitrarily complex forms. We show experimentally that our method significantly outperforms state-of-the-art hashing approaches on multimedia retrieval tasks.
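The core idea in the abstract, one hashing network per modality mapping into a shared binary space where codes are compared directly, can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the 32-bit code length, the feature dimensions, and the helper names (`make_mlp`, `hash_codes`, `hamming`) are all assumptions made for illustration, and the weights are random rather than trained with the paper's coupled siamese objective.

```python
import numpy as np

def make_mlp(in_dim, hidden_dim, code_bits, rng):
    """Random (untrained) weights for a one-hidden-layer network (hypothetical helper)."""
    return {
        "W1": rng.standard_normal((in_dim, hidden_dim)) / np.sqrt(in_dim),
        "W2": rng.standard_normal((hidden_dim, code_bits)) / np.sqrt(hidden_dim),
    }

def hash_codes(params, x):
    """Non-linear hash: two tanh layers, then thresholding at zero to get bits in {0, 1}."""
    hidden = np.tanh(x @ params["W1"])
    out = np.tanh(hidden @ params["W2"])
    return (out > 0).astype(np.uint8)

def hamming(a, b):
    """Pairwise Hamming distances between the rows of a and the rows of b."""
    return (a[:, None, :] != b[None, :, :]).sum(axis=-1)

rng = np.random.default_rng(0)
code_bits = 32  # assumed code length

# One network per modality; input dimensions are arbitrary placeholders.
net_x = make_mlp(in_dim=128, hidden_dim=64, code_bits=code_bits, rng=rng)  # e.g. image features
net_y = make_mlp(in_dim=50, hidden_dim=64, code_bits=code_bits, rng=rng)   # e.g. text features

images = rng.standard_normal((5, 128))
texts = rng.standard_normal((7, 50))

codes_x = hash_codes(net_x, images)
codes_y = hash_codes(net_y, texts)

# Codes from both modalities live in the same Hamming space,
# so the same distance serves inter- and intra-modality comparison.
cross = hamming(codes_x, codes_y)  # inter-modality, shape (5, 7)
intra = hamming(codes_x, codes_x)  # intra-modality, shape (5, 5)
```

Because both networks emit codes of the same length, cross-modal retrieval reduces to ranking by Hamming distance. In a trained system the weights would be learned so that similar items, within and across modalities, receive nearby codes.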
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization · Image Retrieval and Classification Techniques
