Pegasus@Dravidian-CodeMix-HASOC2021: Analyzing Social Media Content for Detection of Offensive Text
Pawan Kalyan Jada, Konthala Yasaswini, Karthik Puranik, Anbukkarasi Sampath, Sathiyaraj Thangasamy, Kingston Pal Thamburaj

TL;DR
This paper presents two Transformer-based methods for detecting offensive social media content in Tamil and Malayalam. The methods target informal, unstructured, and code-mixed text, and they placed in the top 8 for every task of the HASOC - DravidianCodeMix FIRE 2021 shared task.
Contribution
Two Transformer-based approaches to offensive content detection in two Dravidian languages, Tamil and Malayalam, with the code made available and competitive shared-task performance.
Findings
Achieved top 8 placement in all shared task categories.
Effectively handled informal, unstructured, and code-mixed social media text.
Demonstrated the effectiveness of Transformer models for multilingual offensive content detection.
Abstract
To tackle the problem of detecting offensive comments and posts that are considerably informal, unstructured, misspelled, and code-mixed, we introduce two new methods in this paper. Offensive comments and posts on social media platforms can affect individuals, groups, and minors alike. To classify comments and posts in two popular Dravidian languages, Tamil and Malayalam, as part of the HASOC - DravidianCodeMix FIRE 2021 shared task, we employ two Transformer-based prototypes, which placed in the top 8 for all tasks. The code for our approach is available to view and use.
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Advanced Malware Detection Techniques · Spam and Phishing Detection
