Semi-supervised Multimodal Hashing

Dayong Tian; Maoguo Gong; Deyun Zhou; Jiao Shi; Yu Lei

arXiv:1712.03404·cs.IR·December 12, 2017·2 cites


TL;DR

This paper introduces a semi-supervised multimodal hashing method that effectively uses partial labels to generate binary codes for cross-modal data retrieval, reducing labeling effort while maintaining high performance.

Contribution

It proposes a novel semi-supervised approach that leverages fuzzy logic and label estimation to improve multimodal hashing with limited labeled data.

Findings

01 Achieves near-supervised performance with 50% of the labels

02 Outperforms some supervised methods with only 10% of the labels

03 Reduces the need for extensive manual labeling in multimodal retrieval

Abstract

Retrieving nearest neighbors across correlated data in multiple modalities, such as image-text pairs on Facebook and video-tag pairs on YouTube, has become a challenging task due to the huge amount of data. Multimodal hashing methods that embed data into binary codes can boost retrieval speed and reduce storage requirements. Unsupervised multimodal hashing methods are usually inferior to supervised ones, while supervised ones require too much manually labeled data, so the method proposed in this paper uses a subset of the labels to build a semi-supervised multimodal hashing method. It first computes the transformation matrices for the data matrices and the label matrix. Then, with these transformation matrices, fuzzy logic is introduced to estimate a label matrix for the unlabeled data. Finally, it uses the estimated label matrix to learn hashing functions for data in each modality to…
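
The three stages named in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' algorithm: the abstract is truncated and gives no formulas, so every step here is a simple stand-in chosen for illustration. Ridge regression stands in for the transformation matrices, a softmax over class scores stands in for the fuzzy-logic label estimate, and a random projection of the labels stands in for the code-learning objective. All function names (`ridge`, `soft_labels`, `fit_hashers`, `encode`, `hamming_rank`) and parameters (`lam`, `n_bits`, `n_labeled`) are hypothetical.

```python
import numpy as np

def ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression: W minimizing ||XW - Y||^2 + lam * ||W||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def soft_labels(scores):
    """Turn raw class scores into memberships in [0, 1] that sum to 1 per row.
    (A softmax, used here only as a stand-in for a fuzzy membership function.)"""
    z = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_hashers(X_img, X_txt, Y_lab, n_labeled, n_bits=16, seed=0):
    """Assumes the first n_labeled rows of each modality carry the labels Y_lab."""
    rng = np.random.default_rng(seed)
    # Stage 1 (stand-in): per-modality linear maps from features to labels.
    W_img = ridge(X_img[:n_labeled], Y_lab)
    W_txt = ridge(X_txt[:n_labeled], Y_lab)
    # Stage 2 (stand-in): soft label estimates for unlabeled rows,
    # averaged over the two modalities.
    Y_est = 0.5 * (soft_labels(X_img[n_labeled:] @ W_img)
                   + soft_labels(X_txt[n_labeled:] @ W_txt))
    Y_all = np.vstack([Y_lab, Y_est])
    # Stage 3 (stand-in): shared target codes from a random projection of the
    # centered labels, then one linear hash function per modality.
    P = rng.standard_normal((Y_all.shape[1], n_bits))
    B = np.sign((Y_all - Y_all.mean(axis=0)) @ P)
    B[B == 0] = 1
    return ridge(X_img, B), ridge(X_txt, B)

def encode(X, H):
    """Binary codes (0/1) from a linear hash function."""
    return (X @ H > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Database indices sorted by Hamming distance to the query code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")
```

With codes in hand, a cross-modal query amounts to encoding, say, an image with `encode(x_img, H_img)` and ranking the text codes with `hamming_rank`. Comparing short binary codes rather than real-valued features is where the speed and storage savings mentioned in the abstract come from.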


Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Image Retrieval and Classification Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings