ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI

Amanda Cercas Curry; Gavin Abercrombie; Verena Rieser

arXiv:2109.09483·cs.CL·September 21, 2021·5 cites

TL;DR

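This paper introduces ConvAbuse, a dataset of abusive language directed at conversational AI systems, annotated with fine-grained notions of abuse. It shows that abuse towards these systems follows different patterns from those in other datasets, and it benchmarks existing models, which leave room for improvement.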

Contribution

It provides the first English corpus study of abuse towards conversational AI systems, with fine-grained annotations from multiple expert annotators, and benchmarks existing models on the new dataset.

Findings

01 The distribution of abuse differs markedly from that in other commonly used datasets.

02 Sexually tinted aggression towards the virtual persona of these systems is more prevalent.

03 Existing models achieve F1 scores below 90%, indicating substantial room for improvement.

Abstract

We present the first English corpus study on abusive language towards three conversational AI systems gathered "in the wild": an open-domain social bot, a rule-based chatbot, and a task-based system. To account for the complexity of the task, we take a more 'nuanced' approach where our ConvAI dataset reflects fine-grained notions of abuse, as well as views from multiple expert annotators. We find that the distribution of abuse is vastly different compared to other commonly used datasets, with more sexually tinted aggression towards the virtual persona of these systems. Finally, we report results from benchmarking existing models against this data. Unsurprisingly, we find that there is substantial room for improvement with F1 scores below 90%.
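For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a binary F1 score is computed from gold labels and model predictions. The labels below are invented toy data, not taken from ConvAbuse, and the function name `f1_score` is a hypothetical helper. The averaging scheme the paper uses (binary, macro, or other) is not stated on this page, so this should not be read as the authors' evaluation code.

```python
def f1_score(gold, pred, positive=1):
    """Binary F1 for the `positive` class (here 1 = abusive, an assumption)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy example: 1 = abusive, 0 = not abusive (illustrative values only).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = f1_score(gold, pred)
print(f"F1 = {score:.2f}")
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.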

Code & Models

Repositories

amandacurry/convabuse (Official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Sexuality, Behavior, and Technology · Cybercrime and Law Enforcement Studies