Dizzy: Large-Scale Crawling and Analysis of Onion Services
Yazan Boshmaf, Isuranga Perera, Udesh Kumarasinghe, Sajitha Liyanage, Husam Al Jawaheri

TL;DR
Dizzy is an open-source system that scales up the crawling and analysis of onion services, revealing insights into their content, reliability, and network structure, which are crucial for darkweb research and law enforcement.
Contribution
The paper introduces Dizzy, a novel large-scale crawling and analysis framework for onion services, enabling comprehensive darkweb research beyond previous limited datasets.
Findings
Onion services exhibit high churn rates and often host illicit content.
Cryptocurrency usage in onion services is increasing.
The onion web graph is tightly knit but topologically distinct from the regular web.
Abstract
With nearly 2.5 million users, onion services have become the prominent part of the darkweb. Over the last five years alone, the number of onion domains has increased 20x, reaching more than 700,000 unique domains in January 2022. As onion services host various types of illicit content, they have become a valuable resource for darkweb research and an integral part of e-crime investigation and threat intelligence. However, this content is largely unindexed by today's search engines, and researchers have to rely on outdated or manually collected datasets that are limited in scale, scope, or both. To tackle this problem, we built Dizzy: an open-source crawling and analysis system for onion services. Dizzy implements novel techniques to explore, update, check, and classify onion services at scale, without overwhelming the Tor network. We deployed Dizzy in April 2021 and used it to analyze more than…
Taxonomy
Topics: Advanced Malware Detection Techniques · Spam and Phishing Detection · Web Data Mining and Analysis
