Multilingual and Multimodal Abuse Detection
Rini Sharon, Heet Shah, Debdoot Mukherjee, Vikram Gupta

TL;DR
This paper introduces MADA, a multimodal approach for multilingual abuse detection in social media audio content, leveraging emotions and semantics to improve accuracy over audio-only methods.
Contribution
It proposes a novel multimodal framework, MADA, that combines audio, emotion, and textual semantics for improved abuse detection across multiple languages.
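As a rough illustration of what combining audio, emotion, and textual semantics can look like, the sketch below concatenates one embedding per modality and scores the result with a logistic classifier. This summary does not say how MADA actually fuses its modalities, so the concatenation step, the function names `fuse` and `abuse_probability`, and every number here are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def fuse(audio: Sequence[float],
         emotion: Sequence[float],
         text: Sequence[float]) -> list[float]:
    """Late fusion by simple concatenation (an assumption, not MADA's documented method)."""
    return [*audio, *emotion, *text]


def abuse_probability(features: Sequence[float],
                      weights: Sequence[float],
                      bias: float = 0.0) -> float:
    """Score a fused feature vector with a logistic (sigmoid) classifier."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy embeddings standing in for the outputs of three modality encoders.
audio_emb = [0.2, -0.1]        # hypothetical audio features
emotion_emb = [0.7]            # hypothetical emotion feature
text_emb = [0.4, 0.0, -0.3]    # hypothetical transcript features

fused = fuse(audio_emb, emotion_emb, text_emb)

# Hypothetical classifier weights; in practice these would be learned.
weights = [0.5, -0.5, 1.0, 0.8, 0.1, -0.2]
p = abuse_probability(fused, weights)
print(f"abuse probability: {p:.3f}")
```

Concatenation is the simplest fusion choice. An audio-only baseline corresponds to passing only `audio_emb` with a matching weight vector, which is the kind of comparison the findings below describe.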
Findings
MADA outperforms audio-only approaches on the ADIMA dataset.
Leveraging multiple modalities yields 0.6%-5.2% accuracy gains across 10 languages.
The study reports a strong correlation between expressed emotions and abusive behavior.
Abstract
The presence of abusive content on social media platforms is undesirable, as it severely impedes healthy and safe social media interactions. While automatic abuse detection has been widely explored in the textual domain, audio abuse detection remains unexplored. In this paper, we attempt abuse detection in conversational audio from a multimodal perspective in a multilingual social media setting. Our key hypothesis is that, along with modelling the audio, incorporating discriminative information from other modalities can be highly beneficial for this task. Our proposed method, MADA, explicitly focuses on two modalities other than the audio itself: the underlying emotions expressed in the abusive audio and the semantic information encapsulated in the corresponding textual form. Our observations show that MADA achieves gains over audio-only approaches on the ADIMA dataset. We…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Stalking, Cyberstalking, and Harassment · Advanced Malware Detection Techniques
