Examining Temporal Bias in Abusive Language Detection
Mali Jin, Yida Mu, Diana Maynard, Kalina Bontcheva

TL;DR
This paper investigates how temporal bias affects abusive language detection models across different languages, revealing significant performance drops over time and analyzing language evolution to inform mitigation strategies.
Contribution
It provides a comprehensive analysis of temporal bias in abusive language detection across multiple languages and explores methods to mitigate this bias.
Findings
Models trained on historical data perform worse when evaluated on data from later time periods.
Temporal bias significantly degrades abusive language detection performance.
Linguistic analysis reveals language evolution contributes to performance decline.
Abstract
The use of abusive language online has become an increasingly pervasive problem that damages both individuals and society, with effects ranging from psychological harm right through to escalation to real-life violence and even death. Machine learning models have been developed to automatically detect abusive language, but these models can suffer from temporal bias, the phenomenon in which topics, language use or social norms change over time. This study aims to investigate the nature and impact of temporal bias in abusive language detection across various languages and explore mitigation methods. We evaluate the performance of models on abusive data sets from different time periods. Our results demonstrate that temporal bias is a significant challenge for abusive language detection, with models trained on historical data showing a significant drop in performance over time. We also…
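The evaluation idea described in the abstract (train on older data, then measure performance on later time periods) can be illustrated with a minimal sketch. This is not the authors' experimental setup: the keyword classifier, the function names (`chronological_split`, `train_keyword_classifier`, `f1_by_year`), the cutoff date, and the toy records with placeholder tokens such as `slur_a` are all assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import date


def chronological_split(records, cutoff):
    """Split (date, text, label) records into those before the cutoff and the rest."""
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


def train_keyword_classifier(train):
    """Toy stand-in for a real model: flag words seen only in abusive training posts."""
    abusive_words, benign_words = set(), set()
    for _, text, label in train:
        target = abusive_words if label == 1 else benign_words
        target.update(text.lower().split())
    markers = abusive_words - benign_words
    return lambda text: int(any(w in markers for w in text.lower().split()))


def f1_score(gold, pred):
    """Binary F1 for the positive (abusive = 1) class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_by_year(records, predict):
    """Group held-out records by year and score the classifier on each group."""
    by_year = defaultdict(list)
    for day, text, label in records:
        by_year[day.year].append((text, label))
    return {
        year: f1_score([lab for _, lab in items], [predict(t) for t, _ in items])
        for year, items in sorted(by_year.items())
    }


# Hypothetical toy data: a new abusive term ("slur_b") appears only in 2021.
records = [
    (date(2019, 1, 5), "you are a slur_a", 1),
    (date(2019, 3, 2), "have a nice day", 0),
    (date(2019, 6, 9), "slur_a go away", 1),
    (date(2019, 8, 1), "you are nice", 0),
    (date(2020, 2, 1), "such a slur_a", 1),
    (date(2020, 5, 1), "nice weather today", 0),
    (date(2021, 2, 1), "total slur_b", 1),
    (date(2021, 4, 1), "slur_a again", 1),
    (date(2021, 7, 1), "good morning", 0),
]

train, test = chronological_split(records, cutoff=date(2020, 1, 1))
predict = train_keyword_classifier(train)
scores = f1_by_year(test, predict)
print(scores)
```

In this toy run the classifier misses the post containing the term it never saw during training, so its F1 on the later year is lower than on the year closest to the training period. A real study would replace the keyword rule with a trained model and use labeled data sets with real timestamps.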
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Authorship Attribution and Profiling
