Critical Perspectives: A Benchmark Revealing Pitfalls in PerspectiveAPI

Lorena Piedras; Lucas Rosenblatt; Julia Wilkins

arXiv:2301.01874·cs.CL·January 6, 2023

TL;DR

This paper introduces SASS, a new benchmark to evaluate toxicity detection tools like PerspectiveAPI, revealing significant shortcomings and emphasizing the need to critically assess such tools to prevent harms.

Contribution

We propose SASS, a novel benchmark for evaluating toxicity detection models, and demonstrate that PerspectiveAPI has notable limitations on this new challenging dataset.

Findings

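01. PerspectiveAPI exhibits shortcomings across a number of the SASS toxicity categories.

02. SASS provides a way to evaluate performance on previously undetected toxic language.

03. Comparing PerspectiveAPI with zero-shot and few-shot GPT-3 prompt models in binary classification settings reveals performance gaps.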

Abstract

Detecting "toxic" language in internet content is a pressing social and technical challenge. In this work, we focus on PERSPECTIVE from Jigsaw, a state-of-the-art tool that promises to score the "toxicity" of text, with a recent model update that claims impressive results (Lees et al., 2022). We seek to challenge certain normative claims about toxic language by proposing a new benchmark, Selected Adversarial SemanticS, or SASS. We evaluate PERSPECTIVE on SASS, and compare to low-effort alternatives, like zero-shot and few-shot GPT-3 prompt models, in binary classification settings. We find that PERSPECTIVE exhibits troubling shortcomings across a number of our toxicity categories. SASS provides a new tool for evaluating performance on previously undetected toxic language that avoids common normative pitfalls. Our work leads us to emphasize the importance of questioning assumptions made…
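The binary classification setting mentioned in the abstract can be illustrated with a minimal sketch: a continuous toxicity score is thresholded into a toxic/non-toxic prediction, and accuracy is reported per category. Everything here is an assumption for illustration, not the authors' evaluation code: the function name `per_category_accuracy`, the 0.5 threshold, the category names, and the toy scores are all hypothetical, and no real PERSPECTIVE or GPT-3 output is used.

```python
from collections import defaultdict


def per_category_accuracy(examples, threshold=0.5):
    """Threshold scores into binary predictions and score them per category.

    examples: iterable of (category, score, is_toxic) tuples, where score is
    a float in [0, 1] and is_toxic is the gold boolean label.
    The threshold value is an assumption, not taken from the paper.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, score, is_toxic in examples:
        predicted_toxic = score >= threshold
        correct[category] += int(predicted_toxic == is_toxic)
        total[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Toy data with hypothetical categories and made-up scores.
toy_examples = [
    ("category_a", 0.91, True),
    ("category_a", 0.12, True),   # toxic example scored low: a miss
    ("category_b", 0.08, False),
    ("category_b", 0.77, True),
]

accuracy = per_category_accuracy(toy_examples)
print(accuracy)
```

Reporting accuracy per category rather than as one aggregate number is what would make category-specific shortcomings visible in a setup like this.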

Code & Models

Repositories

lurosenb/sass (Official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning · Software Engineering Research

Methods: Multi-Head Attention · Attention Is All You Need · Cosine Annealing · Linear Warmup With Cosine Annealing · Linear Layer · Layer Normalization · Softmax · Adam