Attentive Fusion: A Transformer-based Approach to Multimodal Hate Speech Detection

Atanu Mandal; Gargi Roy; Amit Barman; Indranil Dutta; Sudip Kumar Naskar

arXiv:2401.10653 · cs.CL · January 22, 2024 · 2 cites

TL;DR

This paper introduces a Transformer-based multimodal approach with an Attentive Fusion layer that detects hate speech from both audio and text, reaching a macro F1 score of 0.927 on the test set and outperforming previous methods.

Contribution

The novel Attentive Fusion layer effectively combines audio and textual data within a Transformer framework for hate speech detection.
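To make the idea of attention-based fusion concrete, the sketch below combines one audio embedding and one text embedding into a single vector using softmax attention weights. It is a minimal illustration under stated assumptions, not the paper's Attentive Fusion layer, whose exact formulation is not given on this page. The function name `attentive_fusion`, the `query` vector (standing in for learned parameters), and the scaled dot-product scoring are all hypothetical choices made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fusion(audio_vec, text_vec, query):
    """Fuse two same-sized modality embeddings with attention weights.

    ASSUMPTION: a generic attention-weighted sum, not the authors' layer.
    `query` is a hypothetical stand-in for parameters that would be learned.
    """
    if not (len(audio_vec) == len(text_vec) == len(query)):
        raise ValueError("all vectors must have the same dimension")

    modalities = [audio_vec, text_vec]
    scale = math.sqrt(len(query))

    # Scaled dot-product score of each modality against the query.
    scores = [
        sum(q * x for q, x in zip(query, vec)) / scale for vec in modalities
    ]
    weights = softmax(scores)  # one weight per modality, summing to 1

    # Weighted sum of the modality embeddings, computed per dimension.
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, modalities))
        for i in range(len(query))
    ]
    return fused, weights


# Toy embeddings; real ones would come from audio and text encoders.
fused, weights = attentive_fusion([1.0, 0.0], [0.0, 1.0], query=[0.0, 0.0])
print(fused, weights)
```

With a zero query both modalities score equally, so each receives half the weight. A query aligned with one modality shifts the weight toward it, which is the behaviour an attention-based fusion relies on.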

Findings

01. Achieved a macro F1 score of 0.927 on the test set.
02. Outperformed previous state-of-the-art techniques.
03. Demonstrated the effectiveness of multimodal analysis in hate speech detection.
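For readers unfamiliar with the metric in the first finding, macro F1 is the unweighted mean of the per-class F1 scores, so the hate and non-hate classes count equally regardless of how many examples each has. The sketch below computes it from scratch. The function name `macro_f1` and the toy labels are illustrative assumptions and are unrelated to the paper's data or evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    labels = set(y_true) | set(y_pred)
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never
        # appears in either the references or the predictions.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels (1 = hate, 0 = not hate), invented for illustration only.
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
print(round(score, 4))
```

In this toy case the per-class F1 values are 2/3 for class 1 and 4/5 for class 0, so the macro average is 11/15.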

Abstract

With the recent exponential growth of social media usage, scrutinizing social media content for hateful material is of utmost importance. For the past decade, researchers have worked on distinguishing content that promotes hatred from content that does not. Traditionally, the main focus has been on analyzing textual content, although recent research has also begun to address audio-based content. Nevertheless, studies have shown that relying solely on audio or text may be ineffective, as individuals often employ sarcasm in their speech and writing. To overcome these challenges, we present an approach that uses both audio and textual representations to identify whether speech promotes hate. Our methodology is based on the Transformer framework that…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection

Methods: Attention Is All You Need · Absolute Position Encodings · Label Smoothing · Layer Normalization · Adam · Residual Connection · Dropout · Linear Layer · Multi-Head Attention · Byte Pair Encoding