IDTraffickers: An Authorship Attribution Dataset to link and connect Potential Human-Trafficking Operations on Text Escort Advertisements
Vageesh Saxena, Benjamin Bashpole, Gijs Van Dijck, Gerasimos Spanakis

TL;DR
This paper introduces IDTraffickers, a large dataset of escort ads and vendor labels, along with benchmarks for authorship attribution, to aid law enforcement in linking potential human trafficking operations online.
Contribution
The creation of the IDTraffickers dataset and the establishment of baseline models for authorship attribution in the context of human trafficking investigations.
Findings
The DeCLUTR-small model achieved a macro-F1 score of 0.8656 in closed-set classification.
Authorship verification achieved a mean R-precision of 0.8852 in open-set ranking.
Dataset and benchmarks will be publicly released for further research.
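
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how macro-F1 and R-precision are commonly defined. It is not the authors' evaluation code. The function names (`macro_f1`, `r_precision`, `cosine`) and the toy vendor labels and vectors are hypothetical, chosen only for illustration, and cosine similarity is an assumed choice of ranking score.

```python
import math


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def r_precision(query_vec, query_label, candidates):
    """Fraction of the top-R ranked candidates that share the query's label,
    where R is the number of candidates with that label.

    `candidates` is a list of (vector, label) pairs (hypothetical layout).
    Returns None when no candidate shares the query's label.
    """
    r = sum(1 for _, label in candidates if label == query_label)
    if r == 0:
        return None
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c[0]), reverse=True)
    hits = sum(1 for _, label in ranked[:r] if label == query_label)
    return hits / r


# Toy data: two hypothetical vendors, "v1" and "v2".
toy_true = ["v1", "v1", "v2", "v2"]
toy_pred = ["v1", "v2", "v2", "v2"]
toy_candidates = [
    ([0.9, 0.1], "v1"),
    ([0.0, 1.0], "v2"),
    ([0.8, 0.2], "v2"),
    ([0.1, 0.9], "v1"),
]
toy_macro_f1 = macro_f1(toy_true, toy_pred)
toy_r_precision = r_precision([1.0, 0.0], "v1", toy_candidates)
```

A mean R-precision, as reported in the findings, would average `r_precision` over all queries for which it is defined. Macro averaging weights every vendor equally regardless of how many ads that vendor posted, which matters when the class distribution is skewed.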
Abstract
Human trafficking (HT) is a pervasive global issue affecting vulnerable individuals, violating their fundamental human rights. Investigations reveal that a significant number of HT cases are associated with online advertisements (ads), particularly in escort markets. Consequently, identifying and connecting HT vendors has become increasingly challenging for Law Enforcement Agencies (LEAs). To address this issue, we introduce IDTraffickers, an extensive dataset consisting of 87,595 text ads and 5,244 vendor labels to enable the verification and identification of potential HT vendors on online escort markets. To establish a benchmark for authorship identification, we train a DeCLUTR-small model, achieving a macro-F1 score of 0.8656 in a closed-set classification environment. Next, we leverage the style representations extracted from the trained classifier to conduct authorship…
Taxonomy
Topics: Authorship Attribution and Profiling · Cybercrime and Law Enforcement Studies · Names, Identity, and Discrimination Research
