Optimize_Prime@DravidianLangTech-ACL2022: Abusive Comment Detection in Tamil

Shantanu Patankar; Omkar Gokhale; Onkar Litake; Aditya Mandke; Dipali Kadam

arXiv:2204.09675 · cs.CL · April 22, 2022 · 1 citation

TL;DR

This paper addresses abusive comment detection in Tamil and code-mixed Tamil-English social media comments using ensemble, RNN, and Transformer models, reaching macro-averaged F1 scores of 0.43 on the Tamil data and 0.45 on the code-mixed data.

Contribution

It presents the team's shared-task approach to abusive comment detection in low-resource Tamil and code-mixed Tamil-English data, comparing ensemble models, RNNs, and Transformers; the Transformer models gave the best reported macro-averaged F1 scores.

Findings

01

MuRIL and XLM-RoBERTa achieved a macro-averaged F1 score of 0.43 on the Tamil data.

02

MuRIL and M-BERT achieved a macro-averaged F1 score of 0.45 on the code-mixed data.

03

Ensemble and Transformer models improve detection accuracy.

Abstract

This paper addresses the problem of abusive comment detection in low-resource Indic languages. Abusive comments are statements that are offensive to a person or a group of people. These comments target individuals on the basis of ethnicity, gender, caste, race, sexuality, etc. Abusive comment detection is a significant problem, especially with the recent rise in social media users. This paper presents the approach used by our team, Optimize_Prime, in the ACL 2022 shared task "Abusive Comment Detection in Tamil." The task is to detect and classify YouTube comments in Tamil and code-mixed Tamil-English into multiple categories. We used three methods to optimize our results: ensemble models, recurrent neural networks, and Transformers. On the Tamil data, MuRIL and XLM-RoBERTa were our best-performing models, with a macro-averaged F1 score of 0.43.…
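The scores quoted above are macro-averaged F1, which weights every class equally regardless of how many comments it contains. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function name `macro_f1` and the two labels in the example are hypothetical; the shared task's actual category names are not listed here.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one sample is required")

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    # F1 = 2*TP / (2*TP + FP + FN), computed per class, then averaged.
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: four "none" comments, two "abusive" ones.
y_true = ["none", "none", "none", "none", "abusive", "abusive"]
y_pred = ["none", "none", "none", "none", "none", "abusive"]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy example, accuracy is 5/6, but the missed minority-class comment pulls the macro average down to 7/9. This is why macro-averaging is a common choice for imbalanced multi-class tasks. With scikit-learn installed, `sklearn.metrics.f1_score(y_true, y_pred, average="macro")` should give the same value on these inputs.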

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining