HABERTOR: An Efficient and Effective Deep Hatespeech Detector

Thanh Tran; Yifan Hu; Changwei Hu; Kevin Yen; Fei Tan; Kyumin Lee; Serim Park

arXiv:2010.08865·cs.CL·October 20, 2020

TL;DR

HABERTOR is a novel deep hatespeech detection model that improves accuracy, efficiency, and robustness by customizing BERT with quaternion components, multi-source ensemble heads, and adversarial training on large-scale datasets.

Contribution

It introduces a BERT-based hatespeech detector with quaternion factorization, multi-source ensemble heads, and adversarial training, achieving superior performance and efficiency over existing methods.

Findings

01

Outperforms 15 state-of-the-art hatespeech detection methods.

02

4-5 times faster training and inference than BERT.

03

Pre-trained on less than 1% of the number of words used to pre-train BERT.

Abstract

We present our HABERTOR model for detecting hatespeech in large-scale user-generated content. Inspired by the recent success of the BERT model, we propose several modifications to BERT to enhance the performance on the downstream hatespeech classification task. HABERTOR inherits BERT's architecture, but differs in four aspects: (i) it generates its own vocabulary and is pre-trained from scratch on the largest-scale hatespeech dataset; (ii) it consists of Quaternion-based factorized components, resulting in far fewer parameters, faster training and inference, and lower memory usage; (iii) it uses our proposed multi-source ensemble heads with a pooling layer for separate input sources, to further enhance its effectiveness; and (iv) it uses a regularized adversarial training with our proposed fine-grained and adaptive noise magnitude to enhance its…
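Point (ii) of the abstract credits the parameter savings to Quaternion-based components. The sketch below is not the authors' implementation. It only illustrates the general idea of a quaternion linear layer, in which four small weight blocks are reused through the Hamilton product, so the layer stores about a quarter of the parameters of a dense layer with the same input and output widths. The function names `quaternion_linear` and `init_quaternion_weights`, the initialization scale, and the 768-wide example are assumptions made for illustration; the paper's exact factorization may differ.

```python
import numpy as np


def quaternion_linear(x, w_r, w_i, w_j, w_k):
    """Hypothetical quaternion linear map (illustrative, not from the paper).

    x has shape (batch, d_in) with d_in divisible by 4; its four equal
    chunks are treated as the r, i, j, k components of a quaternion vector.
    Each weight block has shape (d_in // 4, d_out // 4).
    """
    r, i, j, k = np.split(x, 4, axis=-1)
    # Hamilton product W (x) x, written component by component.
    out_r = r @ w_r - i @ w_i - j @ w_j - k @ w_k
    out_i = i @ w_r + r @ w_i + k @ w_j - j @ w_k
    out_j = j @ w_r - k @ w_i + r @ w_j + i @ w_k
    out_k = k @ w_r + j @ w_i - i @ w_j + r @ w_k
    return np.concatenate([out_r, out_i, out_j, out_k], axis=-1)


def init_quaternion_weights(d_in, d_out, seed=0):
    """Create four small weight blocks (the initialization scale is an assumption)."""
    rng = np.random.default_rng(seed)
    shape = (d_in // 4, d_out // 4)
    return tuple(rng.normal(scale=0.02, size=shape) for _ in range(4))


# Example widths chosen to match BERT-base's hidden size of 768.
d_in, d_out = 768, 768
weights = init_quaternion_weights(d_in, d_out)
x = np.random.default_rng(1).normal(size=(2, d_in))
y = quaternion_linear(x, *weights)

quaternion_params = sum(w.size for w in weights)  # 4 * (192 * 192)
dense_params = d_in * d_out                       # 768 * 768
print(y.shape, dense_params // quaternion_params)
```

The four blocks are shared across all four output components, which is where the saving comes from: a dense layer would need sixteen independent blocks of the same size.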

Taxonomy

Methods: Linear Layer · WordPiece · Adam · Softmax · Layer Normalization · Dense Connections · Multi-Head Attention · Dropout · Linear Warmup With Linear Decay