Interpretable Multi-Modal Hate Speech Detection

Prashanth Vijayaraghavan; Hugo Larochelle; Deb Roy

arXiv:2103.01616·cs.CL·March 3, 2021·24 cites


Open Access

TL;DR

This paper introduces a deep neural multi-modal model for hate speech detection that incorporates textual and socio-cultural context, providing both improved accuracy and interpretability to address social and legal concerns.

Contribution

It presents a novel multi-modal approach that captures semantics and socio-cultural context, enhancing interpretability and outperforming existing hate speech detection methods.

Findings

01. Model outperforms state-of-the-art approaches

02. Socio-cultural features are crucial for detecting hate clusters

03. Interpretability aids in understanding model decisions

Abstract

With the growing role of social media in shaping public opinions and beliefs across the world, there has been increased attention to identifying and countering the problem of hate speech on social media. Hate speech in online spaces has serious manifestations, including social polarization and hate crimes. While prior works have proposed automated techniques to detect hate speech online, these techniques primarily fail to look beyond the textual content. Moreover, few attempts have been made to focus on the interpretability of such models, given the social and legal implications of incorrect predictions. In this work, we propose a deep neural multi-modal model that can: (a) detect hate speech by effectively capturing the semantics of the text along with the socio-cultural context in which a particular hate expression is made, and (b) provide interpretable insights into decisions of our…


Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Adversarial Robustness in Machine Learning · Internet Traffic Analysis and Secure E-voting