Lightweight Toxicity Detection in Spoken Language: A Transformer-based Approach for Edge Devices

Ahlam Husni Abu Nada, Siddique Latif, and Junaid Qadir

arXiv:2304.11408 · cs.SD · April 25, 2023 · 1 citation

TL;DR

This paper introduces a lightweight, transformer-based speech toxicity detection model optimized for edge devices, achieving high accuracy while significantly reducing model size and computational requirements.

Contribution

It presents the first end-to-end speech-based toxicity detection model suitable for deployment on physical edge devices using a lightweight transformer architecture.

Findings

01 Achieves an average macro F1-score of 90.3% and a weighted accuracy of 88% on DeToxy-B and a public dataset.

02 Quantization reduces model size by almost 4x and RAM usage by 3.3%, with only a 1% decrease in F1-score.

03 Knowledge distillation decreases model size by 3.7x and inference time by 2x, with some accuracy reduction.
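The roughly 4x size reduction reported for quantization is what one would expect when 32-bit floating-point weights are stored as 8-bit integers. The following is a minimal sketch of that arithmetic using plain affine quantization on a toy weight list. It is not the authors' pipeline: the helper names `quantize_int8` and `dequantize` and the sample weights are hypothetical and chosen only for illustration.

```python
import struct

def quantize_int8(weights):
    """Affine (asymmetric) quantization of floats to signed 8-bit integers."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # fall back to 1.0 for a constant tensor
    zero_point = round(-lo / scale) - 128   # maps `lo` near the bottom of the int8 range
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 codes."""
    return [(v - zero_point) * scale for v in q]

# Hypothetical toy weights (not taken from the paper).
weights = [-0.5, -0.1, 0.0, 0.25, 0.5]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)

# Storage cost: 4 bytes per float32 weight versus 1 byte per int8 weight.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))
ratio = fp32_bytes / int8_bytes
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print(f"size ratio: {ratio:.1f}x, max reconstruction error: {max_err:.5f}")
```

In a real model the ratio typically falls slightly short of 4x, since scales, zero points, and any layers left in floating point still take space.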

Abstract

Toxicity is a prevalent social behavior that involves the use of hate speech, offensive language, bullying, and abusive speech. While text-based approaches for toxicity detection are common, there is limited research on processing speech signals in the physical world. Detecting toxicity in the physical world is challenging due to the difficulty of integrating AI-capable computers into the environment. We propose a lightweight transformer model based on wav2vec2.0 and optimize it using techniques such as quantization and knowledge distillation. Our model uses multitask learning and achieves an average macro F1-score of 90.3% and a weighted accuracy of 88%, outperforming state-of-the-art methods on DeToxy-B and a public dataset. Our results show that quantization reduces the model size by almost 4 times and RAM usage by 3.3%, with only a 1% F1 score decrease. Knowledge distillation…
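For readers unfamiliar with the headline metric, macro F1 averages the per-class F1 scores with equal weight, so a rare class such as toxic speech counts as much as the majority class. Below is a minimal sketch of that computation. The function name `macro_f1`, the labels `"toxic"` and `"clean"`, and the sample predictions are hypothetical and do not come from the paper's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical imbalanced example: 2 toxic clips, 4 clean clips.
y_true = ["toxic", "toxic", "clean", "clean", "clean", "clean"]
y_pred = ["toxic", "clean", "clean", "clean", "clean", "toxic"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy: {accuracy:.3f}, macro F1: {score:.3f}")
```

Here the per-class F1 scores are 0.75 for the clean class and 0.5 for the toxic class, so the macro average (0.625) sits below plain accuracy (about 0.667) because errors on the minority class are not diluted by the majority class.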

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning