Towards Dark Jargon Interpretation in Underground Forums

Dominic Seyler; Wei Liu; XiaoFeng Wang; ChengXiang Zhai

arXiv:2011.03011·cs.CR·January 12, 2021

TL;DR

This paper introduces a method for automatically identifying and interpreting dark jargon in underground forums. It maps each dark term to a "clean" word with no hidden meaning, representing both as interpretable probability distributions over a shared vocabulary.

Contribution

The work formalizes dark jargon interpretation as a mapping from dark words to clean words and proposes a method based on probability distributions over a shared vocabulary. The method outperforms another related method on simulated data and, under manual evaluation, detects dark jargons in a real-world underground forum dataset.

Findings

01

Effective dark jargon identification on simulated data

02

Successful detection of dark jargons in real-world underground forums

03

Outperforms another related method on simulated data

Abstract

Dark jargons are benign-looking words that have hidden, sinister meanings and are used by participants of underground forums for illicit behavior. For example, the dark term "rat" is often used in lieu of "Remote Access Trojan". In this work we present a novel method towards automatically identifying and interpreting dark jargons. We formalize the problem as a mapping from dark words to "clean" words with no hidden meaning. Our method makes use of interpretable representations of dark and clean words in the form of probability distributions over a shared vocabulary. In our experiments we show our method to be effective in terms of dark jargon identification, as it outperforms another related method on simulated data. Using manual evaluation, we show that our method is able to detect dark jargons in a real-world underground forum dataset.
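The mapping idea in the abstract can be illustrated with a small sketch. The abstract does not say how the distributions are built or compared, so everything below is an assumption made for illustration: distributions are estimated from co-occurrence counts in a fixed context window, smoothed additively, and compared with KL divergence. The function names (`context_distribution`, `kl_divergence`, `interpret`), the parameters `window` and `alpha`, and the toy corpora are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def context_distribution(word, sentences, vocab, window=2, alpha=0.01):
    """Smoothed probability distribution over `vocab` of words that
    co-occur with `word` within `window` tokens (assumed representation)."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != word:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in vocab:
                    counts[tokens[j]] += 1
    # Additive smoothing keeps every probability strictly positive.
    total = sum(counts.values()) + alpha * len(vocab)
    return {v: (counts[v] + alpha) / total for v in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[v] * math.log(p[v] / q[v]) for v in p)


def interpret(dark_word, dark_sentences, clean_sentences, candidates, vocab, k=1):
    """Return the k clean candidates whose context distribution in the
    clean corpus is closest to that of `dark_word` in the dark corpus."""
    p = context_distribution(dark_word, dark_sentences, vocab)
    scored = [
        (kl_divergence(p, context_distribution(c, clean_sentences, vocab)), c)
        for c in candidates
    ]
    return [c for _, c in sorted(scored)[:k]]


# Toy corpora, invented for this example only.
dark_corpus = [
    ["install", "the", "rat", "on", "victim", "machine"],
    ["rat", "gives", "remote", "access", "to", "victim"],
]
clean_corpus = [
    ["the", "trojan", "gives", "remote", "access", "to", "victim"],
    ["install", "trojan", "on", "machine"],
    ["the", "mouse", "eats", "cheese"],
    ["a", "mouse", "is", "a", "rodent"],
]

# Shared vocabulary: words that appear in both corpora.
shared_vocab = {t for s in dark_corpus for t in s} & {t for s in clean_corpus for t in s}

best = interpret("rat", dark_corpus, clean_corpus, ["trojan", "mouse"], shared_vocab)
print(best)
```

On this toy data, the contexts of "rat" in the dark corpus overlap far more with those of "trojan" than with those of "mouse" in the clean corpus, so "trojan" is ranked first. Because each representation is a distribution over ordinary words, the highest-probability entries can be read directly to see why a candidate was chosen, which is the sense in which such representations are interpretable.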
