Developing Linguistic Patterns to Mitigate Inherent Human Bias in Offensive Language Detection
Toygar Tanyel, Besher Alkurdi, Serkan Ayvaz

TL;DR
This paper introduces a linguistic data augmentation method to reduce human bias in offensive language datasets, aiming to enhance the fairness and accuracy of deep learning models for detecting offensive content across multiple languages.
Contribution
It proposes a novel linguistic augmentation approach that mitigates human bias in offensive language datasets, improving model fairness and effectiveness.
Findings
Reduces bias in offensive language datasets
Improves detection accuracy across languages
Enhances fairness in offensive language classification
Abstract
With the proliferation of social media, there has been a sharp increase in offensive content, particularly targeting vulnerable groups, exacerbating social problems such as hatred, racism, and sexism. Detecting offensive language is crucial to prevent it from being widely shared on social media. However, the accurate detection of irony, implication, and various forms of hate speech on social media remains a challenge. Natural language-based deep learning models require extensive training with large, comprehensive, and labeled datasets. Unfortunately, manually creating such datasets is both costly and error-prone. Additionally, the presence of human bias in offensive language datasets is a major concern for deep learning models. In this paper, we propose a linguistic data augmentation approach to reduce bias in labeling processes, which aims to mitigate the influence…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
