Emotion Based Hate Speech Detection using Multimodal Learning
Aneri Rana, Sonali Jha

TL;DR
This paper introduces a multimodal deep learning framework that combines emotional and semantic features to improve hate speech detection in multimedia content, addressing a gap in existing text-only approaches.
Contribution
It presents the first multimodal deep learning model for hate speech detection using emotional and semantic features, along with a new dataset for multimedia hate speech analysis.
Findings
Incorporating emotional features improves detection accuracy.
Multimodal approach outperforms text-only models.
New dataset enables research in multimedia hate speech detection.
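As a rough illustration of what "combining emotional and semantic features" can mean in practice, the sketch below concatenates a semantic embedding with an emotion feature vector and scores the result with a single logistic layer. This is not the authors' architecture, which this summary does not describe; the function names, vector sizes, and weights are all hypothetical and chosen only to keep the example small.

```python
import math

def fuse_features(semantic_vec, emotion_vec):
    """Early fusion: concatenate semantic and emotion features into one vector."""
    return list(semantic_vec) + list(emotion_vec)

def predict_proba(fused, weights, bias=0.0):
    """Logistic score in (0, 1) from a linear layer over the fused features."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    z = sum(f * w for f, w in zip(fused, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy inputs (made-up values, not taken from the paper or its dataset).
semantic = [0.2, -0.1, 0.7]   # e.g. a tiny text embedding
emotion = [0.9, 0.05]         # e.g. scores for two emotion categories

fused = fuse_features(semantic, emotion)

# Untrained placeholder weights; a real model would learn these from labeled data.
weights = [0.0] * len(fused)
score = predict_proba(fused, weights)
```

With all-zero weights the score carries no information; the point is only the shape of the pipeline, where emotion features sit alongside semantic ones in a single input to the classifier.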
Abstract
In recent years, monitoring hate speech and offensive language on social media platforms has become paramount because these platforms are widely used across all age groups, races, and ethnicities. Consequently, there have been substantial research efforts towards automated detection of such content using Natural Language Processing (NLP). While these efforts successfully filter textual data, no research has focused on detecting hateful content in multimedia data. With increased ease of data storage and the exponential growth of social media platforms, multimedia content is now as prevalent on the internet as text data. Nevertheless, it escapes the automatic filtering systems. Hate speech and offensiveness can be detected in multimedia primarily via three modalities, i.e., visual, acoustic, and verbal. Our preliminary study concluded that the most essential features in classifying hate speech would be the…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting
