Learning Discriminative Representations for Semantic Cross Media Retrieval

Aiwen Jiang; Hanxi Li; Yi Li; Mingwen Wang

arXiv:1511.05659 · cs.IR · November 19, 2015 · 1 citation

Open Access

TL;DR

This paper introduces SDSRL, a method that learns a shared discriminative semantic space for cross-modal retrieval. Heterogeneous data are represented in high-dimensional Hilbert spaces, where linear semantic projections approximate nonlinear decisions in the original feature spaces.

Contribution

The paper proposes an explicit linear semantic projection in Hilbert space for cross-modal retrieval, so that content from different modalities can be compared in a unified framework.

Findings

01 Outperforms state-of-the-art methods on public datasets

02 Effective in both within- and inter-modal retrieval

03 Demonstrates robustness across multiple scenarios

Abstract

The heterogeneous gap among different modalities has emerged as one of the critical issues in modern AI problems. Unlike traditional uni-modal cases, where raw features are extracted and directly measured, the heterogeneous nature of cross-modal tasks requires the intrinsic semantic representations to be compared in a unified framework. This paper studies the learning of different representations that can be retrieved across different modality contents. A novel approach for mining cross-modal representations is proposed by incorporating explicit linear semantic projection in Hilbert space. The insight is that the discriminative structures of different modality data can be linearly represented in appropriate high-dimensional Hilbert spaces, where linear operations can be used to approximate nonlinear decisions in the original spaces. As a result, an efficient linear semantic down mapping is jointly…
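The central idea in the abstract, lifting each modality into a high-dimensional space where a linear map can capture nonlinear class structure and then projecting linearly into a shared semantic space, can be illustrated on toy data. The sketch below is not the paper's SDSRL algorithm. It rests on several assumptions that do not come from the source: random Fourier features stand in for the Hilbert-space representation, ridge regression onto one-hot class labels stands in for the linear semantic projection, the "image" and "text" modalities are synthetic (classes lie on concentric spheres, so they are not linearly separable in the raw space), and all function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_modality(labels, dim, noise=0.1):
    """Toy modality (assumption): class c lies near a sphere of radius c + 1."""
    direction = rng.normal(size=(labels.size, dim))
    direction /= np.linalg.norm(direction, axis=1, keepdims=True)
    radius = labels + 1.0 + noise * rng.normal(size=labels.size)
    return direction * radius[:, None]

def make_feature_map(dim, n_features=300, gamma=1.0):
    """Random Fourier features approximating the RBF kernel exp(-gamma * ||x - y||^2)."""
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return lambda X: np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def fit_linear_projection(Z, Y, lam=1e-3):
    """Ridge regression from lifted features Z to one-hot semantic targets Y."""
    A = Z.T @ Z + lam * np.eye(Z.shape[1])
    return np.linalg.solve(A, Z.T @ Y)

def normalize(S):
    """Scale each row to unit length so dot products are cosine similarities."""
    return S / np.linalg.norm(S, axis=1, keepdims=True)

n_classes = 3
train_labels = rng.integers(0, n_classes, size=600)
test_labels = rng.integers(0, n_classes, size=200)
Y_train = np.eye(n_classes)[train_labels]

# Each modality gets its own lifting and its own linear projection,
# but both projections target the same n_classes-dimensional semantic space.
semantic = {}
for name, dim in [("image", 2), ("text", 1)]:
    X_train = sample_modality(train_labels, dim)
    X_test = sample_modality(test_labels, dim)
    lift = make_feature_map(dim)
    P = fit_linear_projection(lift(X_train), Y_train)
    semantic[name] = lift(X_test) @ P

# Cross-modal retrieval: rank test "texts" for every test "image" query.
sims = normalize(semantic["image"]) @ normalize(semantic["text"]).T
top1 = sims.argmax(axis=1)
precision_at_1 = float(np.mean(test_labels[top1] == test_labels))
print(f"image-to-text precision@1: {precision_at_1:.2f}")
```

Because both modalities end up in the same label-aligned space, a 2-dimensional "image" and a 1-dimensional "text" become directly comparable by cosine similarity, even though their raw features have different dimensionality and no shared coordinates.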

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications