BERT-Beta: A Proactive Probabilistic Approach to Text Moderation

Fei Tan; Yifan Hu; Kevin Yen; Changwei Hu

arXiv:2109.08805 · cs.CL · September 21, 2021 · 1 citation

Open Access

TL;DR

This paper introduces BERT-Beta, a probabilistic model that predicts how likely a text is to attract toxic comments. It supports proactive moderation with an explanation method for model decisions and a scaling mechanism for the linear model.

Contribution

It proposes the novel concept of text toxicity propensity and applies beta regression for probabilistic forecasting in content moderation.

Findings

01. Beta regression effectively models toxicity propensity.

02. The explanation method improves the interpretability of moderation decisions.

03. The scaling mechanism provides additional insights into the linear model.

Abstract

Text moderation for user-generated content, which helps to promote healthy interaction among users, has been widely studied and many machine learning models have been proposed. In this work, we explore an alternative perspective by augmenting reactive reviews with proactive forecasting. Specifically, we propose a new concept, text toxicity propensity, to characterize the extent to which a text tends to attract toxic comments. Beta regression is then introduced to do the probabilistic modeling, which is demonstrated to function well in comprehensive experiments. We also propose an explanation method to communicate the model decision clearly. Both propensity scoring and interpretation benefit text moderation in a novel manner. Finally, the proposed scaling mechanism for the linear model offers useful insights beyond this work.
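To make the beta regression idea in the abstract concrete, here is a minimal sketch of its loss function. It assumes the standard mean-precision parameterization, in which a target y in (0, 1) follows Beta(mu * phi, (1 - mu) * phi) and the mean mu is a sigmoid of a linear score. The abstract does not state the parameterization, features, or training setup used in the paper, so the function names, the fixed precision `phi`, and the plain-list feature vectors below are all illustrative assumptions.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def beta_log_pdf(y, mu, phi):
    """Log density of Beta(mu*phi, (1-mu)*phi) at y, with 0 < y < 1.

    mu is the mean and phi > 0 is the precision (assumed parameterization).
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    return (
        math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
        + (a - 1.0) * math.log(y)
        + (b - 1.0) * math.log(1.0 - y)
    )


def beta_nll(weights, bias, phi, features, targets):
    """Average negative log-likelihood of a linear beta regression.

    features: list of feature vectors (hypothetical text representations)
    targets:  list of propensity scores strictly between 0 and 1
    """
    total = 0.0
    for x, y in zip(features, targets):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        mu = sigmoid(score)
        total -= beta_log_pdf(y, mu, phi)
    return total / len(targets)
```

Minimizing `beta_nll` over the weights fits the predicted mean to the observed scores while treating each score as a draw from a beta distribution, which is what makes the forecast probabilistic and not just a point estimate. Targets equal to exactly 0 or 1 fall outside the beta support and would need separate handling.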

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Software Engineering Research · Spam and Phishing Detection