Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
Chao Li, Cheng Deng, Ning Li, Wei Liu, Xinbo Gao, and Dacheng Tao

TL;DR
This paper introduces SSAH, a self-supervised adversarial hashing method that improves cross-modal retrieval by aligning semantic representations across different data modalities using adversarial networks and semantic supervision.
Contribution
It is among the first to integrate adversarial learning into self-supervised cross-modal hashing, enhancing semantic correlation and retrieval accuracy.
Findings
Outperforms state-of-the-art methods on three benchmark datasets.
Effectively bridges the modality gap in cross-modal retrieval.
Preserves semantic relationships in both semantic and Hamming spaces.
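To make the Hamming-space retrieval step concrete, here is a minimal sketch of how hashing-based cross-modal retrieval is generally carried out once each modality has been mapped to a real-valued feature vector. It is not the SSAH architecture: the adversarial and self-supervised networks are omitted, and the feature values, the function names (`binarize`, `hamming_distance`, `retrieve`), and the 4-bit code length are all assumptions chosen for illustration.

```python
# Illustrative sketch only: generic hashing-based retrieval, not the SSAH networks.
# All feature values below are made up for the example.

def binarize(features):
    """Map real-valued features to a {-1, +1} hash code via the sign function."""
    return [1 if x >= 0 else -1 for x in features]


def hamming_distance(code_a, code_b):
    """Count the positions where two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


def retrieve(query_code, database_codes, top_k=2):
    """Return indices of the top_k database codes closest to the query."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return ranked[:top_k]


# Hypothetical output of an image network (the query).
image_features = [0.9, -0.2, 0.4, -0.7]

# Hypothetical outputs of a text network (the database to search).
text_features = [
    [0.8, -0.1, 0.3, -0.5],   # same sign pattern as the query
    [-0.6, 0.4, 0.2, -0.9],   # differs from the query in two positions
    [-0.3, 0.7, -0.5, 0.6],   # differs from the query in all four positions
]

query = binarize(image_features)
database = [binarize(f) for f in text_features]
result = retrieve(query, database, top_k=2)
print(result)
```

Because both modalities are compared through codes in a shared Hamming space, an image query can rank text items directly. How well that ranking reflects semantics depends on how the codes are learned, which is the part the paper addresses.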
Abstract
Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (SSAH) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal hashing in a self-supervised fashion. The primary contribution of this work is that two adversarial networks are leveraged to maximize the semantic correlation and consistency of the representations between different modalities. In addition, we harness a self-supervised semantic network to discover high-level semantic information in the form of multi-label annotations. Such information guides the feature learning process and preserves the modality relationships in both the common semantic…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Video Surveillance and Tracking Methods
