Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction

Manikandan Ravikiran; Bharathi Raja Chakravarthi

arXiv:2205.06119 · cs.CL · May 13, 2022

Open Access · 1 Repo

TL;DR

This paper explores zero-shot offensive span detection in code-mixed Tamil using transformer models and rationale extraction methods, achieving significant improvements with data augmentation and multilabel training.

Contribution

It introduces the application of LIME and Integrated Gradients for zero-shot span identification in code-mixed language, demonstrating their effectiveness with data augmentation and multilabel training.
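To make the attribution idea concrete, here is a minimal sketch of Integrated Gradients on a toy logistic classifier, written in plain Python. It is not the paper's implementation: the model, the weights `W` and `B`, the all-zero baseline, and the step count are all assumptions chosen for illustration, with one input feature standing in for each token.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, w, b):
    """Toy classifier: probability of the 'offensive' class."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def gradient(x, w, b):
    """Analytic gradient of the toy model with respect to each input feature."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        g = gradient(point, w, b)
        avg_grad = [a + gi / steps for a, gi in zip(avg_grad, g)]
    # Scale the averaged gradient by the input-baseline difference.
    return [(xi - b0) * a for xi, b0, a in zip(x, baseline, avg_grad)]

# Hypothetical weights and input (assumptions, not values from the paper).
W = [2.0, -0.5, 0.1]
B = -1.0
X = [1.0, 1.0, 1.0]
BASELINE = [0.0, 0.0, 0.0]

ATTRIBUTIONS = integrated_gradients(X, BASELINE, W, B)
```

In this sketch the attributions sum approximately to `model(X, W, B) - model(BASELINE, W, B)`, which is the completeness property of Integrated Gradients. The feature with the largest positive attribution would be the natural candidate for an offensive token.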

Findings

01

LIME and IG reach baseline F1 scores of 26.35% and 44.83%, respectively.

02

Masked Data Augmentation and Multilabel Training raise F1 to 50.23% for LIME and 47.38% for IG.

03

Both methods improve over their zero-shot baselines: LIME gains 23.88 F1 points and IG gains 2.55.
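The page reports F1 without saying how it is computed. The sketch below assumes a character-offset F1, a common choice for span identification tasks, so the reported numbers can be read against a concrete formula. The function name `span_f1` and the handling of empty spans are assumptions, not details taken from the paper.

```python
def span_f1(predicted, gold):
    """F1 between two collections of character offsets (assumed metric)."""
    pred, ref = set(predicted), set(gold)
    if not pred and not ref:
        return 1.0  # Assumption: an empty prediction for an empty gold span is perfect.
    if not pred or not ref:
        return 0.0
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting offsets 14 to 17 when the gold span covers 16 to 19 gives two shared characters out of four on each side, so precision, recall, and F1 are all 0.5.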

Abstract

This paper investigates the effectiveness of sentence-level transformers for zero-shot offensive span identification on a code-mixed Tamil dataset. More specifically, we evaluate two rationale extraction methods, Local Interpretable Model-Agnostic Explanations (LIME) (Ribeiro et al., 2016) and Integrated Gradients (IG) (Sundararajan et al., 2017), for adapting transformer-based offensive language classification models to zero-shot offensive span identification. We find that LIME and IG show baseline F1 of 26.35% and 44.83%, respectively. We also study the effect of dataset size and training process on the overall accuracy of span identification. As a result, we find that both LIME and IG improve significantly with Masked Data Augmentation and Multilabel Training, with F1 of 50.23% and 47.38%, respectively. Disclaimer: This…
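One way to picture the adaptation step described in the abstract is as a thresholding of per-token attribution scores from a sentence-level classifier, with the surviving tokens mapped back to character offsets. The sketch below illustrates that idea only. The whitespace tokenization, the 0.5 threshold, the placeholder sentence, and the scores are all assumptions, and the paper may select spans differently.

```python
def scores_to_char_offsets(text, token_scores, threshold=0.5):
    """Return character offsets of whitespace tokens scoring at or above the threshold."""
    tokens = text.split()
    if len(tokens) != len(token_scores):
        raise ValueError("expected one score per whitespace token")
    offsets = []
    cursor = 0
    for token, score in zip(tokens, token_scores):
        # Locate the token in the original string so offsets survive extra spaces.
        start = text.index(token, cursor)
        end = start + len(token)
        if score >= threshold:
            offsets.extend(range(start, end))
        cursor = end
    return offsets

# Hypothetical sentence and attribution scores (e.g. from LIME or IG).
EXAMPLE_TEXT = "this movie is BADWORD really"
EXAMPLE_SCORES = [0.05, 0.10, 0.02, 0.91, 0.20]
EXAMPLE_OFFSETS = scores_to_char_offsets(EXAMPLE_TEXT, EXAMPLE_SCORES)
```

Here only the fourth token passes the threshold, so the predicted span is the run of characters covering `BADWORD`. No span-level labels are used at any point, which is what makes the setting zero-shot.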

Code & Models

Repositories

manikandan-ravikiran/zero-shot-offensive-span
none · Official

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Local Interpretable Model-Agnostic Explanations