Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application

Chris J. Kennedy; Geoff Bacon; Alexander Sahn; Claudia von Vacano

arXiv:2009.10277 · cs.CL · September 23, 2020 · 35 cites

TL;DR

This paper introduces a novel approach combining faceted Rasch measurement with multitask deep learning to produce debiased, continuous hate speech scores from social media comments, improving interpretability and accuracy.

Contribution

The paper develops a method that integrates item response theory (IRT) with deep learning to construct continuous variables, and applies it to hate speech measurement with improved interpretability.

Findings

01 Achieved significant accuracy improvements over the Perspective API.

02 Effectively debiased and calibrated hate speech scores.

03 Provided an interpretable model linking predictions to their constituent components.

Abstract

We propose a general method for measuring complex variables on a continuous, interval spectrum by combining supervised deep learning with the Constructing Measures approach to faceted Rasch item response theory (IRT). We decompose the target construct, hate speech in our case, into multiple constituent components that are labeled as ordinal survey items. Those survey responses are transformed via IRT into a debiased, continuous outcome measure. Our method estimates the survey interpretation bias of the human labelers and eliminates that influence on the generated continuous measure. We further estimate the response quality of each labeler using faceted IRT, allowing responses from low-quality labelers to be removed. Our faceted Rasch scaling procedure integrates naturally with a multitask deep learning architecture for automated prediction on new data. The ratings on the theorized…
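As a rough illustration of the faceted Rasch idea described in the abstract, the sketch below computes the category probabilities of one ordinal survey response under a standard many-facet rating scale formulation, in which the log-odds of adjacent categories are modeled as comment location minus item difficulty minus labeler severity minus a step threshold. This is an assumption about the general form only; the paper's exact parameterization is not shown in this excerpt, and the function names and example values are hypothetical.

```python
import math


def facet_category_probs(theta, item_difficulty, rater_severity, thresholds):
    """Category probabilities for one ordinal response.

    Assumed model (standard many-facet rating scale form, not taken from
    the paper): log(P_k / P_{k-1}) = theta - item_difficulty
    - rater_severity - thresholds[k-1], for k = 1..len(thresholds).
    """
    location = theta - item_difficulty - rater_severity
    # Log-numerator of category 0 is 0; each later category adds one step.
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + location - tau)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def expected_rating(probs):
    """Expected ordinal rating given category probabilities."""
    return sum(k * p for k, p in enumerate(probs))


if __name__ == "__main__":
    # Hypothetical values: the same comment and item, two different labelers.
    steps = [-1.0, 0.0, 1.0]
    lenient = facet_category_probs(1.0, 0.0, -0.5, steps)
    severe = facet_category_probs(1.0, 0.0, 0.5, steps)
    print("lenient labeler:", round(expected_rating(lenient), 3))
    print("severe labeler: ", round(expected_rating(severe), 3))
```

In this form the labeler term shifts the expected rating for the same comment, which is why estimating it separately lets the comment's location be scored without that labeler-specific influence.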

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Topic Modeling · Spam and Phishing Detection

Methods: Linear Layer · Adam · Softmax · Dense Connections · Weight Decay · Dropout · Linear Warmup With Linear Decay · RoBERTa · Attention Dropout