Multilingual Abusiveness Identification on Code-Mixed Social Media Text
Ekagra Ranjan, Naman Poddar

TL;DR
This paper presents a method for detecting abusive content in multilingual, code-mixed social media text, specifically focusing on Indic languages, addressing challenges like transliteration and script variation.
Contribution
It introduces an approach tailored for multilingual, code-mixed social media data, which can be extended to other languages, filling a gap in non-English content analysis.
Findings
Effective abusiveness detection on the Moj dataset of Indic languages
Addresses challenges of code-mixing, transliteration, and script variation
Method can be generalized to other multilingual social media contexts
Abstract
Social media platforms have been seeing adoption and growth in their usage over time. This growth has been further accelerated by the lockdown in the past year, when people's interaction, conversation, and expression were limited physically. It is becoming increasingly important to keep these platforms safe from abusive content for a better user experience. Much work has been done on English social media content, but text analysis on non-English social media is relatively underexplored. Non-English social media content has the additional challenges of code-mixing, transliteration, and the use of different scripts in the same sentence. In this work, we propose an approach for abusiveness identification on the multilingual Moj dataset, which comprises Indic languages. Our approach tackles the common challenges of non-English social media content and can be extended to other languages as well.
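To make the script-variation challenge concrete, here is a minimal Python sketch that flags text mixing more than one writing system. It is an illustration only, not the authors' method: the function names `scripts_in` and `is_mixed_script` are hypothetical, and the script label is a rough heuristic that takes the first word of each character's Unicode name.

```python
import unicodedata


def scripts_in(text):
    """Return the set of script labels seen among the letters of `text`.

    Heuristic (an assumption of this sketch): the first word of a
    character's Unicode name, e.g. 'LATIN' or 'DEVANAGARI', stands in
    for its script. Non-letters (digits, spaces, combining marks) are skipped.
    """
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def is_mixed_script(text):
    """True when the letters of `text` come from more than one script."""
    return len(scripts_in(text)) > 1
```

For example, a comment such as "yeh movie बहुत acchi thi" contains both romanized (transliterated) Hindi and Devanagari, so `is_mixed_script` returns `True` for it, while a purely Latin-script sentence returns `False`. A production system would more likely rely on the Unicode Script property than on character names.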
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Text Readability and Simplification · Natural Language Processing Techniques
