Semantic Modeling of Textual Relationships in Cross-Modal Retrieval

Jing Yu; Chenghao Yang; Zengchang Qin; Zhuoqian Yang; Yue Hu; Weifeng Zhang

arXiv:1810.13151 · cs.MM · June 13, 2019 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a novel cross-modal retrieval model that leverages a featured graph for textual relationships and dual-path neural networks to improve semantic similarity measurement between texts and images.

Contribution

It proposes a relation-aware text representation using GCNs and a joint learning framework for multi-modal features, outperforming existing models.

Findings

01

Achieved 3.4% and 6.3% higher accuracy than state-of-the-art models on two benchmark datasets.

02

Effectively models semantic, co-occurrence, and prior relations in text.

03

Outperforms state-of-the-art models in cross-modal retrieval.

Abstract

Feature modeling of different modalities is a basic problem in current research on cross-modal information retrieval. Existing models typically project texts and images into one embedding space, in which semantically similar information will have a shorter distance. Semantic modeling of textual relationships is notoriously difficult. In this paper, we propose an approach to model texts using a featured graph by integrating multi-view textual relationships including semantic relations, statistical co-occurrence, and prior relations in the knowledge base. A dual-path neural network is adopted to jointly learn multi-modal representations of information and a cross-modal similarity measure. We use a Graph Convolutional Network (GCN) for generating relation-aware text representations, and use a Convolutional Neural Network (CNN) with non-linearities for image representations. The cross-modal…
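
The abstract names a Graph Convolutional Network as the text encoder. As a rough illustration only, the sketch below shows a single graph-convolution step in plain Python, assuming the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W). The paper's exact layer and graph construction are not given in this excerpt, so the function names, the toy word graph, and the weight values are all hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degrees = [sum(row) for row in a_hat]
    inv_sqrt = [1.0 / math.sqrt(d) for d in degrees]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    a_norm = normalize_adjacency(adj)
    propagated = matmul(a_norm, features)     # mix features of neighbouring nodes
    projected = matmul(propagated, weights)   # linear projection (learned in practice)
    return [[max(0.0, v) for v in row] for row in projected]


# Hypothetical toy graph: 3 word nodes with edges 0-1 and 1-2.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0],   # one 2-d feature vector per word node
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[1.0, -1.0],   # assumed 2x2 weight matrix
           [0.5, 0.5]]

out = gcn_layer(adj, features, weights)
print(out)
```

Each output row is a relation-aware vector for one node: it blends the node's own features with those of its graph neighbours before the projection and non-linearity are applied.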

Code & Models

Repositories

yzhq97/SCKR · tf · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques