Abusive Speech Detection in Indic Languages Using Acoustic Features
Anika A. Spiesberger, Andreas Triantafyllopoulos, Iosif Tsangko, Björn W. Schuller

TL;DR
This paper explores detecting abusive speech in Indic languages using acoustic and prosodic features, demonstrating that language-independent audio cues can effectively identify abusive content across multiple languages.
Contribution
It introduces a novel approach using acoustic features for multilingual abusive speech detection, bypassing the need for language-specific training.
Findings
Acoustic features can effectively classify abusive speech across ten Indic languages.
Multilingual and cross-lingual models perform comparably to language-specific models.
Key acoustic and prosodic features influencing classification are identified.
Abstract
Abusive content in online social networks is a well-known problem that can cause serious psychological harm and incite hatred. The ability to upload audio data increases the importance of developing methods to detect abusive content in speech recordings. However, simply transferring the mechanisms from written abuse detection would ignore relevant information such as emotion and tone. In addition, many current algorithms require training in the specific language for which they are being used. This paper proposes to use acoustic and prosodic features to classify abusive content. We used the ADIMA data set, which contains recordings from ten Indic languages, and trained different models in multilingual and cross-lingual settings. Our results show that it is possible to classify abusive and non-abusive content using only acoustic and prosodic features. The most important and influential…
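The cross-lingual setting mentioned in the abstract can be illustrated with a leave-one-language-out loop: train on every language except one, then test on the held-out language. The sketch below is only an illustration under loud assumptions. The feature vectors are synthetic random numbers rather than ADIMA recordings, the language names are placeholders, and the nearest-centroid classifier is a stand-in chosen for brevity, not the models the authors trained. All function names are hypothetical.

```python
import math
import random


def fit_centroids(X, y):
    """Compute one mean feature vector per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))


def leave_one_language_out(data):
    """data maps language -> (features, labels); returns accuracy per held-out language."""
    scores = {}
    for held_out, (test_X, test_y) in data.items():
        train_X, train_y = [], []
        for lang, (X, y) in data.items():
            if lang != held_out:  # the held-out language is never seen in training
                train_X += X
                train_y += y
        centroids = fit_centroids(train_X, train_y)
        correct = sum(predict(centroids, x) == t for x, t in zip(test_X, test_y))
        scores[held_out] = correct / len(test_y)
    return scores


def make_synthetic_data(languages, n_per_class=50, n_features=4, seed=0):
    """Assumption: stand-in for per-clip acoustic/prosodic feature vectors.

    Label 0 = non-abusive, label 1 = abusive; the two classes are drawn from
    Gaussians with different means so the toy problem is separable.
    """
    rng = random.Random(seed)
    data = {}
    for lang in languages:
        X, y = [], []
        for label, mean in ((0, 0.0), (1, 3.0)):
            for _ in range(n_per_class):
                X.append([rng.gauss(mean, 1.0) for _ in range(n_features)])
                y.append(label)
        data[lang] = (X, y)
    return data


scores = leave_one_language_out(make_synthetic_data(["lang_a", "lang_b", "lang_c"]))
for lang, acc in scores.items():
    print(f"held out {lang}: accuracy {acc:.2f}")
```

Because the synthetic classes share the same distribution in every placeholder language, the held-out accuracy here is high by construction; with real recordings, the gap between this setting and a language-specific model is the quantity of interest.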
