Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage
Álvaro Huertas-García, Alejandro Martín, Javier Huertas-Tato, David Camacho

TL;DR
This paper introduces a multilingual Transformer-based model that detects camouflaged malicious content in social networks, together with a simulation tool that generates such content, to strengthen moderation against evolving evasion tactics.
Contribution
It presents a novel multilingual detection model and a simulation tool for content evasion techniques, addressing the challenge of linguistic camouflage in online moderation.
Findings
The multilingual NER model achieved an overall weighted F1 score of 0.8795.
The simulation tool 'pyleetspeak' effectively generates camouflaged content for testing.
The approach improves detection of diverse and mixed camouflage techniques.
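As a reading aid for the weighted F1 figure above: a weighted F1 score averages per-label F1 scores, each weighted by that label's share of the true examples. The sketch below computes it at token level on toy data. The label names (`O`, `CAMO`) and the token-level granularity are illustrative assumptions, not the paper's label set or evaluation protocol; NER results are often reported at entity level instead.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-label F1 scores (token level)."""
    support = Counter(y_true)  # number of true examples per label
    total = len(y_true)
    score = 0.0
    for label in support:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each label contributes in proportion to its support.
        score += f1 * support[label] / total
    return score


# Toy example with hypothetical labels: "O" (clean) and "CAMO" (camouflaged).
y_true = ["O", "O", "CAMO", "CAMO"]
y_pred = ["O", "CAMO", "CAMO", "CAMO"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow label frequency, frequent labels dominate the overall score, so a single weighted figure can hide weaker performance on rare camouflage types.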
Abstract
Content moderation is the process of screening and monitoring user-generated content online. On Online Social Platforms, it plays a crucial role in stopping content resulting from unacceptable behaviors such as hate speech, harassment, violence against specific groups, terrorism, racism, xenophobia, homophobia, or misogyny, to name a few. These platforms use a plethora of tools to detect and manage malicious information; however, malicious actors also improve their skills, developing strategies to bypass these barriers and continue spreading misleading information. Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems. In response to this ongoing issue, this paper presents an innovative approach to address this linguistic trend in social networks through the simulation of different content evasion…
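To make "twisting and camouflaging keywords" concrete, here is a minimal sketch of two simple camouflage transformations: character substitution (leetspeak) and punctuation injection. The substitution table, function names, and parameters are hypothetical and chosen for illustration; this is not the API of the 'pyleetspeak' tool, and the paper's simulator may cover techniques this sketch does not.

```python
import random

# Hypothetical substitution table; real camouflage uses many more variants.
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}


def leetspeak(word, prob=1.0, seed=None):
    """Replace mappable characters with look-alike digits.

    Each mappable character is replaced with probability `prob`;
    `seed` makes the random choices reproducible.
    """
    rng = random.Random(seed)
    out = []
    for ch in word:
        if ch.lower() in LEET_MAP and rng.random() < prob:
            out.append(LEET_MAP[ch.lower()])
        else:
            out.append(ch)
    return "".join(out)


def punct_camouflage(word, sep="."):
    """Insert a separator between every pair of characters."""
    return sep.join(word)


print(leetspeak("vaccine"))      # every mappable character replaced
print(punct_camouflage("hate"))  # separator injected between characters
```

A keyword filter that matches exact strings would miss both outputs, which is why the paper frames detection as a sequence-labeling problem rather than a dictionary lookup.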
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Advanced Malware Detection Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Linear Layer · Adam · Layer Normalization · Softmax · Byte Pair Encoding · Residual Connection · Label Smoothing
