Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis

Wei Han; Hui Chen; Soujanya Poria

arXiv:2109.00412 · cs.CL · September 17, 2021

TL;DR

This paper introduces MultiModal InfoMax (MMIM), a framework that enhances multimodal sentiment analysis by hierarchically maximizing mutual information to preserve task-relevant features during fusion.

Contribution

The work proposes a mutual information (MI) maximization framework for multimodal fusion that preserves task-related information through the fusion step, improving sentiment analysis performance.

Findings

01 Improved sentiment analysis accuracy on benchmark datasets

02 Effective preservation of task-relevant information during fusion

03 Demonstrated superiority over existing fusion methods

Abstract

In multimodal sentiment analysis (MSA), the performance of a model highly depends on the quality of synthesized embeddings. These embeddings are generated from the upstream process called multimodal fusion, which aims to extract and combine the input unimodal raw data to produce a richer multimodal representation. Previous work either back-propagates the task loss or manipulates the geometric property of feature spaces to produce favorable fusion results, which neglects the preservation of critical task-related information that flows from input to the fusion results. In this work, we propose a framework named MultiModal InfoMax (MMIM), which hierarchically maximizes the Mutual Information (MI) in unimodal input pairs (inter-modality) and between multimodal fusion result and unimodal input in order to maintain task-related information through multimodal fusion. The framework is jointly…
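
The two-level objective described in the abstract can be illustrated with a small sketch. The visible part of the abstract does not say which MI estimator is used, so this example assumes an InfoNCE-style contrastive lower bound at both levels. It also assumes that every representation has already been projected to a common dimension, and that the inter-modality level pairs text with each of the other two modalities. The function names (`info_nce`, `hierarchical_mi_loss`) are hypothetical and chosen only for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchors, positives):
    """InfoNCE loss over a batch.

    anchors[i] is paired with positives[i]; every other row of `positives`
    serves as a negative. Minimizing this loss maximizes a lower bound on
    the MI between the two views (the bound is log(batch_size) - loss).
    """
    n = len(anchors)
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i in range(n):
        scores = [dot(a[i], p[j]) for j in range(n)]
        m = max(scores)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[i]
    return total / n


def hierarchical_mi_loss(text, audio, video, fused):
    """Return (inter-modality loss, fusion-level loss).

    Assumption: the inter-modality level pairs text with audio and text
    with video; the fusion level pairs the fused vector with each
    unimodal representation.
    """
    inter = info_nce(text, audio) + info_nce(text, video)
    fusion = sum(info_nce(fused, m) for m in (text, audio, video))
    return inter, fusion


# Toy batch of two samples with 2-dimensional representations.
batch = [[1.0, 0.0], [0.0, 1.0]]
aligned = info_nce(batch, batch)          # matching pairs
shuffled = info_nce(batch, batch[::-1])   # mismatched pairs
print(aligned < shuffled)
```

In a trained model these two terms would be added to the task loss with weighting coefficients, which are left out here because the abstract is cut off before describing the joint objective.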
