Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines
Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Erik Cambria, Alexander Gelbukh, Amir Hussain

TL;DR
This paper establishes baseline models and discusses key issues in multimodal sentiment analysis, providing a comprehensive framework and benchmarks for future research in this evolving field.
Contribution
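It explores three deep-learning architectures for multimodal sentiment classification, each improving upon the previous, and highlights frequently ignored research considerations: the role of speaker-exclusive models, the importance of different modalities, and generalizability.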
Findings
Improved accuracy with successive architectures
Evaluation across multiple datasets with fixed splits
Identification of key issues like modality importance and generalizability
Abstract
We compile baselines, along with dataset split, for multimodal sentiment analysis. In this paper, we explore three different deep-learning based architectures for multimodal sentiment classification, each improving upon the previous. Further, we evaluate these architectures with multiple datasets with fixed train/test partition. We also discuss some major issues, frequently ignored in multimodal sentiment analysis research, e.g., role of speaker-exclusive models, importance of different modalities, and generalizability. This framework illustrates the different facets of analysis to be considered while performing multimodal sentiment analysis and, hence, serves as a new benchmark for future research in this emerging field.
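The abstract mentions speaker-exclusive models and fixed train/test partitions without describing how the split is built. The sketch below is one plausible reading, not the authors' procedure: whole speakers are assigned to either the training or the test set, so no speaker appears in both, and a fixed seed makes the partition reproducible. The function name `speaker_exclusive_split`, the `(speaker_id, item)` input format, and the default `test_fraction` are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def speaker_exclusive_split(samples, test_fraction=0.3, seed=0):
    """Split (speaker_id, item) pairs so that no speaker is in both sets.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Group items by the speaker who produced them.
    by_speaker = defaultdict(list)
    for speaker_id, item in samples:
        by_speaker[speaker_id].append(item)

    # Shuffle speakers (not utterances) with a fixed seed so the
    # partition is the same on every run.
    speakers = sorted(by_speaker)
    random.Random(seed).shuffle(speakers)

    # Hold out at least one whole speaker for testing.
    n_test = max(1, round(len(speakers) * test_fraction))
    test_speakers = speakers[:n_test]
    train_speakers = speakers[n_test:]

    train = [(s, x) for s in train_speakers for x in by_speaker[s]]
    test = [(s, x) for s in test_speakers for x in by_speaker[s]]
    return train, test


if __name__ == "__main__":
    # Toy data: 20 utterance ids spread over 5 made-up speakers.
    data = [(f"spk{i % 5}", i) for i in range(20)]
    train, test = speaker_exclusive_split(data)
    print(len(train), len(test))
```

Splitting at the speaker level rather than the utterance level means test accuracy reflects performance on unseen speakers, which is the generalizability concern the abstract raises.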
