Learning Discriminative Representations for Semantic Cross Media Retrieval
Aiwen Jiang, Hanxi Li, Yi Li, Mingwen Wang

TL;DR
This paper introduces SDSRL, a method that learns a shared discriminative semantic space for cross-modal retrieval. Heterogeneous data are represented in high-dimensional Hilbert spaces, where a linear semantic projection makes content from different modalities directly comparable.
Contribution
The paper proposes a linear semantic projection approach in Hilbert space for cross-modal retrieval, so that content from different modalities can be compared directly.
Findings
Outperforms state-of-the-art methods on public datasets
Effective in both within- and inter-modal retrieval
Demonstrates robustness across multiple scenarios
Abstract
The heterogeneity gap among different modalities has emerged as one of the critical issues in modern AI problems. Unlike traditional uni-modal cases, where raw features are extracted and directly measured, the heterogeneous nature of cross-modal tasks requires the intrinsic semantic representations to be compared in a unified framework. This paper studies the learning of representations that can be retrieved across content of different modalities. A novel approach for mining cross-modal representations is proposed by incorporating explicit linear semantic projection in Hilbert space. The insight is that the discriminative structures of different modality data can be linearly represented in appropriate high-dimensional Hilbert spaces, where linear operations can be used to approximate nonlinear decisions in the original spaces. As a result, an efficient linear semantic down mapping is jointly…
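The insight in the abstract (lift each modality into a high-dimensional space, then use linear maps into a shared semantic space) can be illustrated with a toy sketch. This is not the authors' SDSRL implementation. It assumes random Fourier features as a stand-in for the explicit Hilbert-space mapping, ridge regression onto one-hot class labels as a stand-in for the learned linear semantic projection, and synthetic "image" and "text" features. All function names and parameter values are hypothetical.

```python
import numpy as np

def rff_map(x, omega, phase):
    """Explicit random Fourier feature map approximating an RBF kernel."""
    n_features = omega.shape[1]
    return np.sqrt(2.0 / n_features) * np.cos(x @ omega + phase)

def fit_linear_projection(phi, targets, reg=1e-2):
    """Ridge-regression map from lifted features to the semantic space."""
    dim = phi.shape[1]
    return np.linalg.solve(phi.T @ phi + reg * np.eye(dim), phi.T @ targets)

def cosine_scores(queries, gallery):
    """Pairwise cosine similarity between query rows and gallery rows."""
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + 1e-12)
    g = gallery / (np.linalg.norm(gallery, axis=1, keepdims=True) + 1e-12)
    return q @ g.T

rng = np.random.default_rng(0)
n_classes, n_per_class, n_rff, sigma = 3, 40, 256, 4.0  # assumed toy settings
labels = np.repeat(np.arange(n_classes), n_per_class)
onehot = np.eye(n_classes)[labels]  # shared semantic targets (assumption)

def make_modality(dim):
    """Synthetic features: one Gaussian cluster per semantic class."""
    centers = 3.0 * rng.normal(size=(n_classes, dim))
    return centers[labels] + 0.3 * rng.normal(size=(labels.size, dim))

def lift(x):
    """Lift one modality with its own random Fourier feature map."""
    omega = rng.normal(scale=1.0 / sigma, size=(x.shape[1], n_rff))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_rff)
    return rff_map(x, omega, phase)

# Two heterogeneous modalities with different raw dimensionality.
img, txt = make_modality(20), make_modality(8)
phi_img, phi_txt = lift(img), lift(txt)

# Fit one linear projection per modality on the training half.
train = np.arange(labels.size) % 2 == 0
test = ~train
w_img = fit_linear_projection(phi_img[train], onehot[train])
w_txt = fit_linear_projection(phi_txt[train], onehot[train])

# Embed held-out items of both modalities into the shared space.
img_emb = phi_img[test] @ w_img
txt_emb = phi_txt[test] @ w_txt

# Text-to-image retrieval: rank images by cosine similarity to each text query.
scores = cosine_scores(txt_emb, img_emb)
top1 = scores.argmax(axis=1)
p_at_1 = float(np.mean(labels[test][top1] == labels[test]))
print(f"text-to-image precision@1 on held-out toy data: {p_at_1:.2f}")
```

Because both modalities end up in the same low-dimensional space, the same `cosine_scores` call also covers within-modal retrieval (for example, image queries against an image gallery).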
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
