SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification

Sai Muralidhar Jayanthi; Akshat Gupta

arXiv:2102.01051 · cs.CL · March 15, 2021 · 21 cites

TL;DR

This paper introduces an ensemble approach using task-adaptive pre-training of multilingual BERT models for offensive language identification in Dravidian languages, achieving top rankings in the shared task.

Contribution

It demonstrates the effectiveness of task-adaptive pre-training combined with ensemble methods for multilingual offensive language detection.
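
One common way to combine several classifiers is to average their class probabilities, sketched below. This is an illustration only: the page does not say how the ensemble merges predictions, so probability averaging is an assumption, and the label names, logits, and the `ensemble_predict` helper are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits, labels):
    """Average per-model class probabilities and return (best_label, averaged_probs).

    per_model_logits: one list of logits per model, each of length len(labels).
    Averaging is an assumed combination rule, not one confirmed by the source.
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    avg = [sum(p[i] for p in probs) / n_models for i in range(len(labels))]
    best = max(range(len(labels)), key=lambda i: avg[i])
    return labels[best], avg


# Hypothetical logits from two fine-tuned models for a single comment.
labels = ["not_offensive", "offensive"]
label, avg = ensemble_predict([[2.0, 0.0], [0.0, 1.0]], labels)
print(label, [round(p, 3) for p in avg])
```

Averaging probabilities rather than raw logits keeps each model's contribution on the same scale, which matters when the models are calibrated differently.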

Findings

01 · Ranked 1st for Kannada

02 · Ranked 2nd for Malayalam

03 · Ranked 3rd for Tamil

Abstract

In this paper we present our submission for the EACL 2021-Shared Task on Offensive Language Identification in Dravidian languages. Our final system is an ensemble of mBERT and XLM-RoBERTa models which leverage task-adaptive pre-training of multilingual BERT models with a masked language modeling objective. Our system was ranked 1st for Kannada, 2nd for Malayalam and 3rd for Tamil.
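
The masked language modeling objective mentioned in the abstract can be illustrated with a small token-corruption sketch. It assumes the standard BERT recipe (select 15% of tokens; of those, 80% become `[MASK]`, 10% become a random token, 10% stay unchanged), which the abstract does not spell out; the `mask_tokens` helper and the example tokens are hypothetical. In task-adaptive pre-training, this corruption would be applied to unlabeled text from the task's own data before fine-tuning the classifier.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (corrupted, labels) for masked language modeling.

    labels[i] holds the original token at positions selected for prediction
    and None elsewhere. The 80/10/10 split follows the usual BERT recipe,
    assumed here rather than taken from the source.
    """
    rng = rng or random.Random()
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover this token
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)  # replace with the mask symbol
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # replace with a random token
            else:
                corrupted.append(tok)  # keep the token unchanged
        else:
            labels.append(None)  # position is not scored
            corrupted.append(tok)
    return corrupted, labels


# Hypothetical pre-tokenized comment; a real pipeline would use WordPiece or
# SentencePiece sub-word tokens from the model's own tokenizer.
tokens = "this movie trailer is really good".split()
corrupted, labels = mask_tokens(tokens, vocab=tokens, rng=random.Random(0))
print(corrupted)
print(labels)
```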

Code & Models

Repositories

murali1996/eacl2021-OffensEval-Dravidian
PyTorch · Official

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Natural Language Processing Techniques · Interpreting and Communication in Healthcare

Methods: Linear Layer · mBERT · Softmax · Attention Is All You Need · Dense Connections · Residual Connection · WordPiece · Attention Dropout · Adam