Learning Discriminative Hashing Codes for Cross-Modal Retrieval based on Multi-view Features

Jun Yu, Xiao-Jun Wu, and Josef Kittler

arXiv:1808.04152·cs.LG·January 7, 2020



TL;DR

This paper introduces a multi-view hashing framework that leverages complementary features from images and texts to generate discriminative codes, significantly improving cross-modal retrieval performance.

Contribution

The proposed method uniquely combines multi-view feature fusion with a joint classifier and subspace learning framework for more effective hashing.

Findings

01. Outperforms state-of-the-art methods on multiple datasets
02. Effectively fuses multi-view features for richer representations
03. Achieves superior retrieval accuracy in cross-modal tasks

Abstract

Hashing techniques have been applied broadly in retrieval tasks due to their low storage requirements and high speed of processing. Many hashing methods based on a single view have been extensively studied for information retrieval. However, the representation capacity of a single view is insufficient and some discriminative information is not captured, which results in limited improvement. In this paper, we employ multiple views to represent images and texts for enriching the feature information. Our framework exploits the complementary information among multiple views to better learn the discriminative compact hash codes. A discrete hashing learning framework that jointly performs classifier learning and subspace learning is proposed to complete multiple search tasks simultaneously. Our framework includes two stages, namely a kernelization process and a quantization process.…
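The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the RBF kernel, the concatenation of per-view kernel features, and the random projection standing in for a learned one are all assumptions made for illustration, and every function name here (`rbf_features`, `kernelize_views`, `quantize`, `hamming`) is hypothetical.

```python
import math
import random


def rbf_features(x, anchors, sigma=1.0):
    """Kernelization: RBF similarity of x to each anchor point."""
    feats = []
    for a in anchors:
        sq_dist = sum((xi - ai) ** 2 for xi, ai in zip(x, a))
        feats.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return feats


def kernelize_views(views, anchors_per_view, sigma=1.0):
    """Kernelize each view separately, then concatenate (an assumed fusion rule)."""
    fused = []
    for x, anchors in zip(views, anchors_per_view):
        fused.extend(rbf_features(x, anchors, sigma))
    return fused


def quantize(phi, projection):
    """Quantization: sign of a linear projection, giving bits in {-1, +1}."""
    return [1 if sum(w * p for w, p in zip(row, phi)) >= 0 else -1
            for row in projection]


def hamming(a, b):
    """Number of positions where two codes differ."""
    return sum(1 for u, v in zip(a, b) if u != v)


rng = random.Random(0)
n_bits = 8

# Two hypothetical views of one item (e.g. two different descriptors).
anchors_per_view = [
    [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)],  # view 1: 3 anchors, dim 4
    [[rng.gauss(0, 1) for _ in range(6)] for _ in range(3)],  # view 2: 3 anchors, dim 6
]
item = [[rng.gauss(0, 1) for _ in range(4)],
        [rng.gauss(0, 1) for _ in range(6)]]

# A random projection stands in for a learned one; it is NOT the paper's solver.
projection = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(n_bits)]

phi = kernelize_views(item, anchors_per_view)   # 3 + 3 = 6 kernel features
code = quantize(phi, projection)                # 8-bit code in {-1, +1}
print(code, hamming(code, code))
```

At retrieval time, a query from one modality would be encoded the same way and database items from the other modality ranked by `hamming` distance to the query code, which is what makes binary codes cheap to store and compare.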


Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings