IYKYK: Using language models to decode extremist cryptolects
Christine de Kock, Arij Riabi, Zeerak Talat, Michael Sejr Schlichtkrull, Pranava Madhyastha, Ed Hovy

TL;DR
This study evaluates the effectiveness of current language models in detecting and decoding extremist cryptolects, revealing limitations and potential improvements through domain adaptation and specialized prompting, supported by new datasets.
Contribution
Introduces new datasets and expert-validated lexicons for extremist cryptolects, and evaluates eight language models across six tasks to inform automated moderation.
Findings
General-purpose LLMs cannot consistently detect or decode extremist language.
Domain adaptation improves model performance significantly.
New datasets and lexicons are released for future research.
Abstract
Extremist groups develop complex in-group language, also referred to as cryptolects, to exclude or mislead outsiders. We investigate the ability of current language technologies to detect and interpret the cryptolects of two online extremist platforms. Evaluating eight models across six tasks, our results indicate that general purpose LLMs cannot consistently detect or decode extremist language. However, performance can be significantly improved by domain adaptation and specialised prompting techniques. These results provide important insights to inform the development and deployment of automated moderation technologies. We further develop and release novel labelled and unlabelled datasets, including 19.4M posts from extremist platforms and lexicons validated by human experts.
Taxonomy
Topics: Terrorism, Counterterrorism, and Political Violence · Hate Speech and Cyberbullying Detection · Authorship Attribution and Profiling
