Multimodal Prediction based on Graph Representations

Icaro Cavalcante Dourado; Salvatore Tabbone; Ricardo da Silva Torres

arXiv:1912.10314·cs.CV·July 6, 2020·1 cites

TL;DR

This paper introduces a graph-based learning model for multimodal prediction tasks that effectively captures relationships between different data modalities, improving accuracy over existing fusion methods.

Contribution

The paper presents a novel rank-fusion graph approach that encodes multiple descriptors into a graph and projects it into a vector space for improved multimodal prediction.

Findings

01. Outperforms early and late fusion methods in various datasets

02. Effective across visual, textual, and multimodal features

03. Demonstrates superior accuracy compared to state-of-the-art techniques

Abstract

This paper proposes a learning model, based on rank-fusion graphs, for general applicability in multimodal prediction tasks, such as multimodal regression and image classification. Rank-fusion graphs encode information from multiple descriptors and retrieval models, thus being able to capture underlying relationships between modalities, samples, and the collection itself. The solution is based on the encoding of multiple ranks for a query (or test sample), defined according to different criteria, into a graph. Later, we project the generated graph into an induced vector space, creating fusion vectors, targeting broader generality and efficiency. A fusion vector estimator is then built to infer whether a multimodal input object refers to a class or not. Our method is capable of promoting a fusion model better than early-fusion and late-fusion alternatives. Performed experiments in the…
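The pipeline described in the abstract (several ranks for one query, merged into a graph, then projected into a vector) can be illustrated with a small sketch. This is not the authors' formulation: the reciprocal-rank node weights, the co-occurrence edge weights, the cutoff `k`, and the names `build_fusion_graph` and `to_fusion_vector` are all assumptions chosen for illustration, and the ranked lists are made-up sample identifiers.

```python
from itertools import combinations


def build_fusion_graph(ranked_lists, k=3):
    """Merge the top-k items of several ranked lists into one weighted graph.

    Assumed weighting (illustrative only): a node accumulates 1/position
    from every list it appears in; an edge counts how many lists contain
    both of its endpoints in their top-k. Items in a list are assumed unique.
    """
    nodes, edges = {}, {}
    for ranking in ranked_lists:
        top = ranking[:k]
        for pos, item in enumerate(top, start=1):
            nodes[item] = nodes.get(item, 0.0) + 1.0 / pos
        # Sort so that each undirected edge has a single canonical key.
        for a, b in combinations(sorted(top), 2):
            edges[(a, b)] = edges.get((a, b), 0.0) + 1.0
    return nodes, edges


def to_fusion_vector(nodes, edges, node_vocab, edge_vocab):
    """Project a graph onto a fixed vocabulary of nodes and edges."""
    node_part = [nodes.get(n, 0.0) for n in node_vocab]
    edge_part = [edges.get(e, 0.0) for e in edge_vocab]
    return node_part + edge_part


# Hypothetical ranks for one query, from two different descriptors.
visual_rank = ["a", "b", "c"]
textual_rank = ["b", "a", "d"]

nodes, edges = build_fusion_graph([visual_rank, textual_rank], k=3)

# The vocabulary would normally be fixed from the training collection.
node_vocab = ["a", "b", "c", "d"]
edge_vocab = [("a", "b"), ("a", "c")]
vector = to_fusion_vector(nodes, edges, node_vocab, edge_vocab)
print(vector)
```

Under these assumptions, every query yields a vector of the same length, so any standard estimator could be trained on the resulting fusion vectors; that last step is left out of the sketch.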

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Advanced Text Analysis Techniques

Methods: Test