InterMulti: Multi-view Multimodal Interactions with Text-dominated Hierarchical High-order Fusion for Emotion Analysis

Feng Qiu; Wanzeng Kong; Yu Ding

arXiv:2212.10030·cs.AI·December 21, 2022·1 cites

Open Access

TL;DR

InterMulti is a multimodal emotion analysis framework that models interactions among speech content, voice tone, and facial expressions from multiple views and combines them through text-dominated hierarchical high-order fusion. The authors report improved emotion recognition on MOSEI, MOSI, and IEMOCAP.

Contribution

The paper introduces a text-dominated hierarchical high-order fusion module that integrates the framework's multimodal interaction representations for emotion analysis and is reported to outperform existing methods.

Findings

01. Outperforms state-of-the-art methods on the MOSEI, MOSI, and IEMOCAP datasets.

02. Captures complex interactions between modalities.

03. Balances the contributions of the different modalities to improve emotion recognition.

Abstract

Humans are sophisticated at reading interlocutors' emotions from multimodal signals, such as speech contents, voice tones and facial expressions. However, machines might struggle to understand various emotions due to the difficulty of effectively decoding emotions from the complex interactions between multimodal signals. In this paper, we propose a multimodal emotion analysis framework, InterMulti, to capture complex multimodal interactions from different views and identify emotions from multimodal signals. Our proposed framework decomposes signals of different modalities into three kinds of multimodal interaction representations, including a modality-full interaction representation, a modality-shared interaction representation, and three modality-specific interaction representations. Additionally, to balance the contribution of different modalities and learn a more informative latent…
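
The decomposition named in the abstract (one modality-full, one modality-shared, and three modality-specific interaction representations) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how each representation is computed, so the concatenation, the tied-weight encoder, the averaging step, the feature size `DIM`, and every function name below are assumptions made purely for illustration, with random linear maps standing in for learned encoders.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical feature size per modality

# Hypothetical utterance-level features for the three modalities.
features = {
    "text": rng.normal(size=DIM),
    "audio": rng.normal(size=DIM),
    "video": rng.normal(size=DIM),
}


def linear(in_dim, out_dim):
    """Return a random linear map standing in for a learned encoder."""
    weight = rng.normal(scale=0.1, size=(out_dim, in_dim))
    return lambda x: weight @ x


# Modality-full (assumed): one encoder sees all three modalities at once.
full_encoder = linear(3 * DIM, DIM)
# Modality-shared (assumed): one encoder with tied weights applied to each modality.
shared_encoder = linear(DIM, DIM)
# Modality-specific (assumed): a private encoder per modality.
specific_encoders = {name: linear(DIM, DIM) for name in features}


def decompose(feats):
    """Split the inputs into full, shared, and per-modality representations."""
    stacked = np.concatenate([feats[m] for m in ("text", "audio", "video")])
    full = full_encoder(stacked)
    shared = np.mean([shared_encoder(feats[m]) for m in feats], axis=0)
    specific = {m: specific_encoders[m](feats[m]) for m in feats}
    return {"full": full, "shared": shared, "specific": specific}


representations = decompose(features)
```

The sketch yields five vectors in total, matching the count in the abstract: one full, one shared, and one specific vector each for text, audio, and video. How the paper then fuses them is not covered here.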

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining