Exploring Hate Speech Detection in Multimodal Publications

Raul Gomez; Jaume Gibert; Lluis Gomez; Dimosthenis Karatzas

arXiv:1910.03814·cs.CV·October 10, 2019

TL;DR

This paper investigates hate speech detection in multimodal social media posts combining text and images, introducing a large dataset and analyzing the effectiveness of joint models versus unimodal approaches.

Contribution

It presents MMHS150K, a large-scale annotated dataset gathered from Twitter for multimodal hate speech detection, and compares models that jointly analyze text and images against unimodal ones.

Findings

01 Images carry useful information for the hate speech detection task

02 Current multimodal models nonetheless cannot outperform models that analyze only text

03 The paper analyzes the challenges of the task and opens the dataset for further research

Abstract

In this work we target the problem of hate speech detection in multimodal publications formed by a text and an image. We gather and annotate a large scale dataset from Twitter, MMHS150K, and propose different models that jointly analyze textual and visual information for hate speech detection, comparing them with unimodal detection. We provide quantitative and qualitative results and analyze the challenges of the proposed task. We find that, even though images are useful for the hate speech detection task, current multimodal models cannot outperform models analyzing only text. We discuss why and open the field and the dataset for further research.

Code & Models

Repositories

gombru/multi-modal-hate-speech (Official)