Towards Dark Jargon Interpretation in Underground Forums
Dominic Seyler, Wei Liu, XiaoFeng Wang, ChengXiang Zhai

TL;DR
This paper introduces a method for automatically identifying and interpreting dark jargon in underground forums. It maps benign-looking dark terms to "clean" words with no hidden meaning, representing both as interpretable probability distributions over a shared vocabulary.
Contribution
The work formalizes dark jargon interpretation as a mapping from dark words to clean words and proposes a method that represents both as probability distributions over a shared vocabulary. The method outperforms a related method on simulated data and, under manual evaluation, detects dark jargons in a real-world underground forum dataset.
Findings
Effective dark jargon identification on simulated data
Detection of dark jargons in a real-world underground forum dataset, confirmed by manual evaluation
Outperforms a related method at dark jargon identification on simulated data
Abstract
Dark jargons are benign-looking words that have hidden, sinister meanings and are used by participants of underground forums for illicit behavior. For example, the dark term "rat" is often used in lieu of "Remote Access Trojan". In this work we present a novel method towards automatically identifying and interpreting dark jargons. We formalize the problem as a mapping from dark words to "clean" words with no hidden meaning. Our method makes use of interpretable representations of dark and clean words in the form of probability distributions over a shared vocabulary. In our experiments we show our method to be effective in terms of dark jargon identification, as it outperforms another related method on simulated data. Using manual evaluation, we show that our method is able to detect dark jargons in a real-world underground forum dataset.
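The mapping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each word is represented by a smoothed distribution over its context words, and that a dark word is mapped to the clean candidate with the lowest KL divergence from it. The toy corpora, the window size, the smoothing constant `alpha`, and all function names are hypothetical.

```python
import math
from collections import Counter


def context_distribution(target, sentences, vocab, window=2, alpha=0.1):
    """Smoothed distribution over `vocab` of words seen near `target`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in vocab:
                    counts[tokens[j]] += 1
    # Additive smoothing keeps every probability non-zero (assumed choice).
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def rank_clean_candidates(dark_word, dark_corpus, clean_corpus, candidates, vocab):
    """Order clean candidates by closeness to the dark word (assumed KL scoring)."""
    p = context_distribution(dark_word, dark_corpus, vocab)
    scores = {
        c: kl_divergence(p, context_distribution(c, clean_corpus, vocab))
        for c in candidates
    }
    return sorted(scores, key=scores.get)


# Toy data, invented for illustration only.
dark_corpus = [
    ["install", "the", "rat", "on", "victim", "machine"],
    ["rat", "gives", "remote", "control"],
]
clean_corpus = [
    ["install", "the", "trojan", "on", "victim", "machine"],
    ["trojan", "gives", "remote", "control"],
    ["a", "mouse", "ate", "cheese"],
]
vocab = ["install", "the", "on", "victim", "machine", "gives",
         "remote", "control", "a", "ate", "cheese"]

ranking = rank_clean_candidates("rat", dark_corpus, clean_corpus,
                                ["trojan", "mouse"], vocab)
print(ranking)
```

Because both distributions are defined over the same vocabulary, the words that contribute most to a low score can be inspected directly, which is the sense in which such a representation is interpretable.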
