A Novel Context-Aware Multimodal Framework for Persian Sentiment Analysis
Kia Dashtipour, Mandar Gogate, Erik Cambria, Amir Hussain

TL;DR
This paper introduces a new Persian multimodal sentiment analysis framework that combines audio, visual, and text data, along with a benchmark dataset, to improve sentiment detection accuracy in videos.
Contribution
It presents the first Persian multimodal dataset and a novel context-aware framework that effectively integrates multiple modalities for sentiment analysis.
Findings
The multimodal approach outperforms unimodal methods, achieving 91.39% accuracy.
The framework effectively combines textual, acoustic, and visual cues.
Experimental results validate the superiority of contextual multimodal integration.
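To make the idea of combining textual, acoustic, and visual cues concrete, here is a minimal sketch of one common fusion strategy, feature-level fusion by concatenation. It is an illustration only and not the paper's architecture: the function names, the toy feature vectors, the hand-picked weights, and the binary positive/negative labelling are all assumptions made for this example.

```python
from typing import List, Sequence


def fuse_features(text: Sequence[float],
                  audio: Sequence[float],
                  visual: Sequence[float]) -> List[float]:
    """Concatenate per-modality feature vectors into one fused vector."""
    return [*text, *audio, *visual]


def linear_score(features: Sequence[float],
                 weights: Sequence[float],
                 bias: float = 0.0) -> float:
    """Score the fused vector with a plain linear model (hypothetical weights)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights)) + bias


def predict_sentiment(score: float) -> str:
    """Map a score to a label; the binary label set is an assumption."""
    return "positive" if score >= 0 else "negative"


# Toy per-utterance features (made up for illustration, not from the dataset).
text_vec = [0.8, -0.1]
audio_vec = [0.3]
visual_vec = [-0.2, 0.5]

fused = fuse_features(text_vec, audio_vec, visual_vec)
weights = [1.0, 1.0, 0.5, 0.5, 1.0]  # hypothetical, not learned
score = linear_score(fused, weights)
label = predict_sentiment(score)
```

In practice each modality's vector would come from a learned feature extractor and the classifier would be trained rather than hand-weighted; the sketch only shows where the modalities meet.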
Abstract
Most recent works on sentiment analysis have exploited the text modality. However, millions of hours of video recordings posted on social media platforms every day hold vital unstructured information that can be exploited to more effectively gauge public perception. Multimodal sentiment analysis offers an innovative solution to computationally understand and harvest sentiments from videos by contextually exploiting audio, visual and textual cues. In this paper, we first present a first-of-its-kind Persian multimodal dataset comprising more than 800 utterances, as a benchmark resource for researchers to evaluate multimodal sentiment analysis approaches in the Persian language. Second, we present a novel context-aware multimodal sentiment analysis framework that simultaneously exploits acoustic, visual and textual cues to more accurately determine the expressed sentiment. We employ both…
