A Universal Model for Cross Modality Mapping by Relational Reasoning

Zun Li; Congyan Lang; Liqian Liang; Tao Wang; Songhe Feng; Jun Wu; Yidong Li

arXiv:2102.13360·cs.CV·March 1, 2021·1 cites

Open Access

TL;DR

This paper introduces a universal graph-based relational reasoning network for cross modality mapping. It models relations among instances within a single modality (intra relations) and between paired instances from different modalities (inter relations), and is applied to diverse tasks such as image classification, social recommendation, and sound recognition.

Contribution

The paper proposes a GCN-based Relational Reasoning Network (RR-Net) that explicitly models intra and inter relations for cross modality mapping, addressing a limitation of earlier methods that rely only on a similarity measure between embedded instance features.

Findings

01 Outperforms existing methods on multiple tasks

02 Demonstrates universality across different modalities

03 Effectively models complex relational structures

Abstract

With the aim of matching a pair of instances from two different modalities, cross modality mapping has attracted growing attention in the computer vision community. Existing methods usually formulate the mapping function as the similarity measure between the pair of instance features, which are embedded to a common space. However, we observe that the relationships among the instances within a single modality (intra relations) and those between the pair of heterogeneous instances (inter relations) are insufficiently explored in previous approaches. Motivated by this, we redefine the mapping function with relational reasoning via graph modeling, and further propose a GCN-based Relational Reasoning Network (RR-Net) in which inter and intra relations are efficiently computed to universally resolve the cross modality mapping problem. Concretely, we first construct two kinds of graph, i.e.,…
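The contrast the abstract draws can be illustrated with a minimal sketch. It is not the authors' RR-Net, because the abstract is truncated before the graph construction is described. The first function scores cross-modal pairs by cosine similarity in a shared space, which is the baseline formulation the abstract mentions. The second applies one generic graph-convolution step with the commonly used symmetric normalization to an intra-modality graph. The feature sizes, the similarity threshold used to build the graph, and all function names are assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(x, y):
    """Baseline mapping: cosine similarity of every (x_i, y_j) pair in a shared space."""
    x_n = x / np.linalg.norm(x, axis=1, keepdims=True)
    y_n = y / np.linalg.norm(y, axis=1, keepdims=True)
    return x_n @ y_n.T


def gcn_layer(adj, feats, weight):
    """One generic graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))  # degrees are >= 1 after self-loops
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # 4 instances of one modality (hypothetical embeddings)
y = rng.normal(size=(3, 8))  # 3 instances of the other modality

# Similarity-only mapping between the two modalities.
sim = cosine_similarity_matrix(x, y)

# Intra-modality graph: link instances whose similarity is positive (assumed rule).
adj_x = (cosine_similarity_matrix(x, x) > 0.0).astype(float)
np.fill_diagonal(adj_x, 0.0)

# Each instance feature is refined using its neighbours in the intra graph.
w = rng.normal(size=(8, 8)) * 0.1
x_refined = gcn_layer(adj_x, x, w)
```

In this sketch the refined features `x_refined` depend on neighbouring instances in the same modality, whereas `sim` depends only on each isolated pair. That difference is the gap between similarity-based mapping and relation-aware mapping that the abstract points to.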

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Graph Neural Networks · Advanced Image and Video Retrieval Techniques