Cross-lingual hate speech detection based on multilingual domain-specific word embeddings

Aymé Arango; Jorge Pérez; Barbara Poblete

arXiv:2104.14728·cs.CL·May 3, 2021·1 cites

Open Access

TL;DR

This paper introduces a novel multilingual hate speech detection method using domain-specific word embeddings, demonstrating improved cross-lingual classification without labeled data in target languages.

Contribution

It presents the first construction of multilingual domain-specific hate speech representations, outperforming previous general-purpose models in cross-lingual settings.

Findings

01. Domain-specific representations improve cross-lingual hate speech detection

02. Our model captures common hate speech patterns across languages

03. Outperforms previous approaches in most experimental setups

Abstract

Automatic hate speech detection in online social networks is an important open problem in Natural Language Processing (NLP). Hate speech is a multidimensional issue, strongly dependent on language and cultural factors. Despite its relevance, research on this topic has been almost exclusively devoted to English. Most supervised learning resources, such as labeled datasets and NLP tools, have been created for this same language. Considering that a large portion of users worldwide speak languages other than English, there is an important need for creating efficient approaches for multilingual hate speech detection. In this work, we propose to address the problem of multilingual hate speech detection from the perspective of transfer learning. Our goal is to determine if knowledge from one particular language can be used to classify another language, and to determine effective ways to…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection