Multi-attention Recurrent Network for Human Communication Comprehension

Amir Zadeh; Paul Pu Liang; Soujanya Poria; Prateek Vij; Erik Cambria; Louis-Philippe Morency

arXiv:1802.00923·cs.AI·February 6, 2018

TL;DR

This paper introduces MARN, a neural network architecture that models interactions between the language, vision, and acoustic modalities over time to understand human communication, achieving state-of-the-art results on six multimodal datasets.

Contribution

The paper proposes the Multi-attention Recurrent Network (MARN), a novel neural architecture that captures multimodal interactions through time using multi-attention and hybrid memory components.

Findings

01

MARN achieves state-of-the-art performance on six multimodal datasets.

02

The model effectively captures interactions between modalities over time.

03

MARN outperforms existing methods in sentiment, emotion, and trait recognition.

Abstract

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication; however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape human communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We…
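The abstract describes the MAB only at a high level, so the following is a rough sketch of the general idea rather than the authors' implementation. It assumes that, at one time step, several attention distributions are computed over the concatenated per-modality hidden states and that the attended values are compressed into a small code that a recurrent memory could store. The function name `multi_attention_step`, the weight shapes, the number of attentions, and the random weights are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def multi_attention_step(hidden_states, att_weights, out_weights):
    """One hypothetical multi-attention step (an assumption, not the paper's code).

    hidden_states: one list of floats per modality.
    att_weights:   one D x D matrix per attention (D = total hidden size).
    out_weights:   code_dim x (K * D) matrix that compresses the attended values.
    """
    # Concatenate the per-modality hidden states into a single vector.
    h = [x for state in hidden_states for x in state]
    attentions, attended = [], []
    for w in att_weights:
        a = softmax(matvec(w, h))  # one distribution over all dimensions
        attentions.append(a)
        attended.extend(ai * hi for ai, hi in zip(a, h))
    # Compress all attended copies into a small code vector.
    code = [math.tanh(v) for v in matvec(out_weights, attended)]
    return attentions, code


def rand_matrix(rows, cols):
    """Random weights standing in for learned parameters."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
h_language = [0.5, -1.0, 0.3, 0.8]  # made-up hidden states
h_vision = [0.1, 0.7, -0.4]
h_acoustic = [-0.2, 0.9]
D, K, code_dim = 9, 3, 5

att_weights = [rand_matrix(D, D) for _ in range(K)]
out_weights = rand_matrix(code_dim, K * D)
attentions, code = multi_attention_step(
    [h_language, h_vision, h_acoustic], att_weights, out_weights
)
print(len(attentions), len(attentions[0]), len(code))  # 3 9 5
```

Using several attentions instead of one lets each distribution emphasize a different subset of dimensions across modalities. In a trained model the weights would be learned, and the resulting code would be fed back into the recurrent component at the next time step.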
