Pegasus@Dravidian-CodeMix-HASOC2021: Analyzing Social Media Content for Detection of Offensive Text

Pawan Kalyan Jada; Konthala Yasaswini; Karthik Puranik; Anbukkarasi Sampath; Sathiyaraj Thangasamy; Kingston Pal Thamburaj

arXiv:2111.09836·cs.CL·November 19, 2021

TL;DR

This paper presents two Transformer-based methods for detecting offensive social media content in Tamil and Malayalam, addressing challenges of informal, unstructured, and code-mixed text, achieving top-tier results in a shared task.

Contribution

Introduces two Transformer-based approaches for offensive content detection in the Dravidian languages Tamil and Malayalam, with publicly available code and competitive shared-task performance.

Findings

01

Achieved top 8 placement in all shared task categories.

02

Effectively handled informal, unstructured, and code-mixed social media text.

03

Demonstrated the effectiveness of Transformer models for multilingual offensive content detection.

Abstract

To tackle the problem of detecting offensive comments and posts that are highly informal, unstructured, misspelled, and code-mixed, we introduce two methods in this paper. Offensive comments and posts on social media platforms can affect individuals, groups, and minors alike. To classify comments and posts in two popular Dravidian languages, Tamil and Malayalam, as part of the HASOC - DravidianCodeMix FIRE 2021 shared task, we employ two Transformer-based models, which placed in the top 8 for all tasks. The code for our approach is available to view and use.
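To make the term "code-mixed" concrete, here is a minimal sketch that flags text mixing Tamil or Malayalam script with Latin script, based on Unicode block ranges. It is an illustration only and not the authors' pipeline; the function names `script_of`, `scripts_in`, and `is_script_mixed` are hypothetical.

```python
# Illustrative sketch only: not the method used in the paper.
# Unicode block ranges for the two Dravidian scripts discussed.
TAMIL = (0x0B80, 0x0BFF)
MALAYALAM = (0x0D00, 0x0D7F)


def script_of(ch):
    """Return the script name for a character, or None if it is not a letter we track."""
    cp = ord(ch)
    if TAMIL[0] <= cp <= TAMIL[1]:
        return "Tamil"
    if MALAYALAM[0] <= cp <= MALAYALAM[1]:
        return "Malayalam"
    if ch.isascii() and ch.isalpha():
        return "Latin"
    return None  # digits, punctuation, whitespace, other scripts


def scripts_in(text):
    """Collect the set of tracked scripts that appear in the text."""
    return {s for s in map(script_of, text) if s is not None}


def is_script_mixed(text):
    """True when the text uses more than one of the tracked scripts."""
    return len(scripts_in(text)) > 1


if __name__ == "__main__":
    for sample in ["படம் super", "സിനിമ super", "padam super"]:
        print(sample, sorted(scripts_in(sample)), is_script_mixed(sample))
```

A check like this only catches mixing across scripts. Romanized text such as "padam super" uses Latin script throughout, so it is reported as single-script even though it may mix languages, which is part of why such content is hard to classify with simple rules.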

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Advanced Malware Detection Techniques · Spam and Phishing Detection