Fusion-supervised Deep Cross-modal Hashing

Li Wang; Lei Zhu; En Yu; Jiande Sun; Huaxiang Zhang

arXiv:1904.11171 · cs.IR · April 2, 2020 · 1 cites

Open Access

TL;DR

This paper introduces FDCH, a novel deep hashing method that learns unified binary codes for cross-modal retrieval, effectively capturing multi-modal correlations and semantic information to improve retrieval accuracy.

Contribution

FDCH proposes a fusion hash network that enhances multi-modal correlation modeling and supervises modality-specific hash networks using high-quality unified codes.

Findings

01. Achieves state-of-the-art performance on benchmark datasets

02. Effectively models heterogeneous multi-modal correlations

03. Preserves semantic consistency in cross-modal retrieval

Abstract

Deep hashing has recently received attention in cross-modal retrieval for its impressive advantages. However, existing hashing methods for cross-modal retrieval cannot fully capture the heterogeneous multi-modal correlation and exploit the semantic information. In this paper, we propose a novel Fusion-supervised Deep Cross-modal Hashing (FDCH) approach. First, FDCH learns unified binary codes through a fusion hash network with paired samples as input, which effectively enhances the modeling of the correlation of heterogeneous multi-modal data. Then, these high-quality unified hash codes further supervise the training of the modality-specific hash networks for encoding out-of-sample queries. Meanwhile, both pair-wise similarity information and classification information are embedded in the hash networks under a one-stream framework, which simultaneously preserves cross-modal…
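The two-stage idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the linear-plus-tanh "networks", the negative log-likelihood pairwise loss, the mean-squared supervision loss, and every name below (unified_codes, pairwise_similarity_loss, supervision_loss, hamming_distance, W_fuse, W_img) are assumptions made for illustration only, and the classification term mentioned in the abstract is omitted.

```python
import numpy as np

def unified_codes(img_feat, txt_feat, W_fuse):
    """Hypothetical fusion step: concatenate paired features, project, binarize."""
    fused = np.concatenate([img_feat, txt_feat], axis=1)
    relaxed = np.tanh(fused @ W_fuse)            # continuous outputs in (-1, 1)
    codes = np.where(relaxed >= 0, 1.0, -1.0)    # binary codes in {-1, +1}
    return codes, relaxed

def pairwise_similarity_loss(H, S):
    """Assumed pairwise loss: negative log-likelihood of the similarity matrix S."""
    theta = 0.5 * H @ H.T
    return np.mean(np.logaddexp(0.0, theta) - S * theta)

def supervision_loss(modality_out, codes):
    """Assumed supervision term: pull a modality-specific output toward the unified codes."""
    return np.mean((modality_out - codes) ** 2)

def hamming_distance(query_codes, db_codes):
    """Hamming distance between {-1, +1} codes, computed from inner products."""
    k = query_codes.shape[1]
    return 0.5 * (k - query_codes @ db_codes.T)

# Toy paired data (all sizes are arbitrary, for illustration only).
rng = np.random.default_rng(0)
n, d_img, d_txt, k = 6, 8, 5, 4
img = rng.standard_normal((n, d_img))
txt = rng.standard_normal((n, d_txt))
labels = rng.integers(0, 2, size=(n, 3))
S = (labels @ labels.T > 0).astype(float)        # 1 if two samples share a label

# Stage 1: the fusion "network" yields unified codes for the paired samples.
W_fuse = rng.standard_normal((d_img + d_txt, k))
B, relaxed = unified_codes(img, txt, W_fuse)

# Stage 2: an image-only "network" is scored against the unified codes.
W_img = rng.standard_normal((d_img, k))
img_out = np.tanh(img @ W_img)

print("pairwise loss:", pairwise_similarity_loss(relaxed, S))
print("supervision loss:", supervision_loss(img_out, B))
print("Hamming distances:\n", hamming_distance(B, B))
```

In a trained system, the weights would be updated to reduce these losses, and an out-of-sample query from a single modality would be encoded by its modality-specific network and ranked against database codes by Hamming distance.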

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Video Surveillance and Tracking Methods