Deep Binary Reconstruction for Cross-modal Hashing

Xuelong Li; Di Hu; Feiping Nie

arXiv:1708.05127·cs.CV·August 25, 2017

TL;DR

This paper introduces Deep Binary Reconstruction (DBRC), a novel unsupervised deep learning approach for cross-modal hashing that directly learns binary codes using an adaptive activation function, improving retrieval performance.

Contribution

The paper proposes DBRC with the Adaptive Tanh activation for direct binary code learning, addressing limitations of previous relaxation-based methods.
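This summary does not state the form of the Adaptive Tanh activation. As a rough illustration only, the sketch below assumes a tanh with a scale parameter `alpha` and shows how such a function approaches the sign function as `alpha` grows, which is the general idea behind producing near-binary outputs without a separate thresholding step. The names `scaled_tanh`, `sign`, and `max_gap_to_sign` are hypothetical, and `alpha` is fixed by hand here rather than learned.

```python
import math

def scaled_tanh(x, alpha):
    # Assumed form: tanh(alpha * x). The paper's exact definition may differ.
    return math.tanh(alpha * x)

def sign(x):
    # Binary target in {-1, +1}; zero is mapped to +1 by convention here.
    return 1.0 if x >= 0 else -1.0

def max_gap_to_sign(values, alpha):
    # Largest distance between the smooth activation and the hard binary code.
    return max(abs(scaled_tanh(v, alpha) - sign(v)) for v in values)

# Hypothetical pre-activation values for a 4-bit code.
activations = [-2.0, -0.5, 0.3, 1.5]

# The gap shrinks as alpha increases, so the outputs become nearly binary.
gaps = {alpha: max_gap_to_sign(activations, alpha) for alpha in (1.0, 5.0, 50.0)}
for alpha, gap in gaps.items():
    print(f"alpha={alpha:>5}: max gap to sign = {gap:.6f}")
```

With a small `alpha` the outputs are far from binary, while a large `alpha` makes them nearly indistinguishable from hard codes, at the cost of vanishing gradients away from zero.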

Findings

01  DBRC outperforms state-of-the-art methods on benchmark datasets.

02  The Adaptive Tanh function effectively learns binary codes during training.

03  DBRC improves cross-modal retrieval accuracy in image2text and text2image tasks.
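To make the image2text task concrete, here is a minimal sketch of retrieval over binary codes: an image's code is the query, and text items are ranked by Hamming distance to it. The codes, item names, and the `hamming` and `retrieve` helpers are hypothetical and are not taken from the paper or its repository.

```python
def hamming(a, b):
    # Number of positions where two equal-length binary codes differ.
    return sum(x != y for x, y in zip(a, b))

def retrieve(query_code, database, top_k=2):
    # Rank database items by Hamming distance to the query, closest first.
    ranked = sorted(database, key=lambda item: hamming(query_code, item[1]))
    return [name for name, _ in ranked[:top_k]]

# Hypothetical 4-bit text codes in {-1, +1}.
text_db = [
    ("caption_a", (1, -1, 1, 1)),
    ("caption_b", (-1, 1, -1, 1)),
    ("caption_c", (1, 1, -1, -1)),
]

# Hypothetical code produced for an image query.
image_query = (1, -1, 1, -1)

results = retrieve(image_query, text_db)
print(results)
```

The text2image direction is the same procedure with the roles swapped: a text code is the query and the database holds image codes.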

Abstract

With the increasing demand for massive multimodal data storage and organization, cross-modal retrieval based on hashing techniques has drawn much attention. It takes the binary codes of one modality as the query to retrieve the relevant hashing codes of another modality. However, the binary constraint makes it difficult to find the optimal cross-modal hashing function. Most approaches relax the constraint and apply a thresholding strategy to the real-valued representation instead of directly solving the original objective. In this paper, we first provide a concrete analysis of the effectiveness of multimodal networks in preserving inter- and intra-modal consistency. Based on the analysis, we propose a Deep Binary Reconstruction (DBRC) network that can directly learn the binary hashing codes in an unsupervised fashion. The superiority comes from…
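The relax-then-threshold strategy mentioned in the abstract can be sketched as follows: a model is trained on real-valued codes, and binary codes are obtained afterwards by thresholding. The example measures the information lost in that last step as a mean squared quantization error. The values and the `threshold` and `quantization_error` helpers are hypothetical, chosen only to illustrate why codes near zero are costly to binarize.

```python
def threshold(real_codes):
    # Post hoc binarization: map each real value to -1 or +1 by its sign.
    return [1 if v >= 0 else -1 for v in real_codes]

def quantization_error(real_codes):
    # Mean squared gap between the relaxed codes and their binarized versions.
    binary = threshold(real_codes)
    return sum((v - b) ** 2 for v, b in zip(real_codes, binary)) / len(real_codes)

# Hypothetical relaxed (real-valued) code learned without the binary constraint.
relaxed = [0.9, -0.1, 0.05, -0.8]

binary = threshold(relaxed)
error = quantization_error(relaxed)
print(binary, round(error, 6))
```

Entries such as -0.1 and 0.05 carry little confidence about their sign, yet thresholding commits them to a full bit, which is the kind of mismatch that direct binary learning aims to avoid.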

Code & Models

Repositories

yolo2233/cross-modal-hasing-playground
tf
