Deep Reversible Consistency Learning for Cross-modal Retrieval
Ruitao Pu, Yang Qin, Dezhong Peng, Xiaomin Song, Huiming Zheng

TL;DR
This paper introduces Deep Reversible Consistency Learning (DRCL), a novel approach for cross-modal retrieval that enhances semantic alignment and flexibility by recasting modality-invariant representations guided by labels.
Contribution
DRCL combines selective prior learning and reversible semantic consistency to improve cross-modal retrieval performance.
Findings
DRCL outperforms 15 state-of-the-art methods on five datasets.
The feature augmentation mechanism increases diversity and robustness.
Extensive experiments validate the effectiveness of DRCL.
Abstract
Cross-modal retrieval (CMR) typically involves learning common representations to directly measure similarities between multimodal samples. Most existing CMR methods assume that multimodal samples come in pairs and employ joint training to learn common representations, which limits the flexibility of CMR. Although some methods adopt independent training strategies for each modality to improve flexibility, they use randomly initialized orthogonal matrices to guide representation learning. This is suboptimal because it assumes that inter-class samples are independent of each other, which limits the potential for semantic alignment between sample representations and ground-truth labels. To address these issues, we propose a novel method termed Deep Reversible Consistency Learning (DRCL) for cross-modal retrieval. DRCL includes two core modules, i.e., Selective Prior Learning (SPL) and…
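The limitation the abstract attributes to earlier independent-training methods can be made concrete with a small sketch. It is not DRCL itself and not any specific prior method: the function names, the dimensions, and the use of Gram-Schmidt to build the orthogonal matrix are assumptions made for illustration, since the abstract does not say how such matrices are generated. The sketch only shows that when class anchors are rows of an orthogonal matrix, every pair of distinct classes has cosine similarity zero, so no class can be encoded as closer to one class than to another.

```python
import math
import random


def random_orthogonal_anchors(num_classes, dim, seed=0):
    """Return num_classes orthonormal vectors of length dim.

    Hypothetical helper: Gram-Schmidt on Gaussian draws is one common way
    to obtain a random orthogonal prior; the paper may use another.
    """
    if num_classes > dim:
        raise ValueError("need dim >= num_classes for orthogonal anchors")
    rng = random.Random(seed)
    anchors = []
    while len(anchors) < num_classes:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along the anchors found so far.
        for a in anchors:
            proj = sum(x * y for x, y in zip(v, a))
            v = [x - proj * y for x, y in zip(v, a)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-8:  # skip a degenerate draw
            anchors.append([x / norm for x in v])
    return anchors


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)


# One anchor per class; sizes are arbitrary illustration values.
anchors = random_orthogonal_anchors(num_classes=4, dim=8)

# Pairwise class-to-class similarity: identity up to rounding error.
gram = [[cosine(a, b) for b in anchors] for a in anchors]
```

Because `gram` is the identity matrix up to floating-point error, a prior of this kind treats every pair of classes as equally unrelated, which is the independence assumption the abstract describes as suboptimal.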
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: ADaptive gradient method with the OPTimal convergence rate · Semi-Pseudo-Label
