DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

Angus R. Williams, Hannah Rose Kirk, Liam Burke, Yi-Ling Chung, Ivan Debono, Pica Johansson, Francesca Stevens, Jonathan Bright, and Scott A. Hale

arXiv:2307.16811 · cs.CL · April 26, 2024

TL;DR

This paper investigates how language models trained on abuse detection data can transfer knowledge across different domains and demographics, aiming to improve generalisability and reduce labeling costs.

Contribution

It introduces a novel DODO dataset for abuse detection across domains and demographics and analyzes transferability and dataset similarity effects on model performance.

Findings

01. Small amounts of diverse data improve generalisation.

02. Models transfer more easily across demographics than across domains.

03. Models trained on cross-domain data are more generalisable.

Abstract

Public figures receive a disproportionate amount of abuse on social media, impacting their active participation in public life. Automated systems can identify abuse at scale but labelling training data is expensive, complex and potentially harmful. So, it is desirable that systems are efficient and generalisable, handling both shared and specific aspects of online abuse. We explore the dynamics of cross-group text classification in order to understand how well classifiers trained on one domain or demographic can transfer to others, with a view to building more generalisable abuse classifiers. We fine-tune language models to classify tweets targeted at public figures across DOmains (sport and politics) and DemOgraphics (women and men) using our novel DODO dataset, containing 28,000 labelled entries, split equally across four domain-demographic pairs. We find that (i) small amounts of…
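The cross-group setup described in the abstract can be sketched as a small grid of train and test groups. The following is a minimal illustration only, not the authors' code: the function names and the transfer labels ("in-group", "cross-demographic", and so on) are hypothetical, and the per-pair count simply follows from 28,000 entries being split equally across four domain-demographic pairs.

```python
from itertools import product

# Groups named in the abstract.
DOMAINS = ("sport", "politics")
DEMOGRAPHICS = ("women", "men")
TOTAL_ENTRIES = 28_000  # dataset size stated in the abstract


def group_pairs():
    """Return the four (domain, demographic) pairs."""
    return list(product(DOMAINS, DEMOGRAPHICS))


def entries_per_pair(total=TOTAL_ENTRIES):
    """Equal split of the dataset across the pairs."""
    pairs = group_pairs()
    return {pair: total // len(pairs) for pair in pairs}


def transfer_type(train, test):
    """Label a train -> test combination (labels are illustrative assumptions)."""
    same_domain = train[0] == test[0]
    same_demographic = train[1] == test[1]
    if same_domain and same_demographic:
        return "in-group"
    if same_domain:
        return "cross-demographic"
    if same_demographic:
        return "cross-domain"
    return "cross-domain and cross-demographic"


def transfer_grid():
    """Enumerate every train/test combination of the four pairs."""
    pairs = group_pairs()
    return {(tr, te): transfer_type(tr, te) for tr in pairs for te in pairs}


if __name__ == "__main__":
    for pair, n in entries_per_pair().items():
        print(pair, n)
    for (train, test), kind in transfer_grid().items():
        print(f"train={train} test={test}: {kind}")
```

Each cell of such a grid would correspond to fine-tuning a classifier on one pair and evaluating it on another; the training and evaluation steps themselves are omitted here because the page does not describe them.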

Code & Models

Repositories

turing-online-safety-codebase/dodo-learning
PyTorch · Official

Taxonomy

Topics: Hate Speech and Cyberbullying Detection