From BERT to Qwen: Hate Detection across architectures
Ariadna Mon, Saúl Fenollosa, Jon Lecumberri

TL;DR
This paper compares the effectiveness of traditional bidirectional transformer encoders and large autoregressive LLMs in detecting hate speech on online platforms, assessing whether increased scale improves real-world performance.
Contribution
It provides a benchmark analysis of classic transformers versus next-generation LLMs for hate speech detection on curated online interaction datasets.
Findings
Large autoregressive LLMs do not significantly outperform traditional encoders in hate speech detection.
Model architecture impacts detection accuracy more than scale alone.
Benchmark results highlight strengths and limitations of each model family.
Abstract
Online platforms struggle to curb hate speech without over-censoring legitimate discourse. Early bidirectional transformer encoders made big strides, but the arrival of ultra-large autoregressive LLMs promises deeper context-awareness. Whether this extra scale actually improves practical hate-speech detection on real-world text remains unverified. Our study puts this question to the test by benchmarking both model families, classic encoders and next-generation LLMs, on curated corpora of online interactions for hate-speech detection (Hate or No Hate).
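As an illustration of how such a two-family comparison on binary labels (Hate or No Hate) might be scored, here is a minimal sketch. The abstract does not say which metric the study reports, so macro-averaged F1 is an assumption here, and the function names, the toy labels, and both prediction lists are hypothetical, not results from the paper.

```python
from typing import Sequence

# The two classes named in the abstract.
LABELS = ("Hate", "No Hate")


def f1_for_label(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 score for one class, treating `label` as the positive class."""
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores (an assumed metric)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(f1_for_label(gold, pred, label) for label in LABELS) / len(LABELS)


# Hypothetical toy data: four posts and the outputs of two imaginary models.
gold = ["Hate", "No Hate", "Hate", "No Hate"]
encoder_pred = ["Hate", "No Hate", "No Hate", "No Hate"]  # misses one hateful post
llm_pred = ["Hate", "Hate", "Hate", "No Hate"]            # flags one benign post

encoder_score = macro_f1(gold, encoder_pred)
llm_score = macro_f1(gold, llm_pred)
```

In this toy case both models get the same macro-F1 even though one under-flags and the other over-flags, which is why a benchmark of this kind would usually also report per-class precision and recall.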
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
