CRATOR: a Dark Web Crawler

Daniel De Pascale; Giuseppe Cascavilla; Damian A. Tamburri; Willem-Jan Van Den Heuvel

arXiv:2405.06356 · cs.CR · May 13, 2024

Open Access

TL;DR

This paper presents CRATOR, a dark web crawler that efficiently extracts pages protected by security measures such as captchas while maintaining anonymity. It is intended for cybersecurity and threat intelligence applications.

Contribution

The study introduces a novel dark web crawler that handles security measures like captchas and employs techniques for anonymity and detection avoidance.

Findings

01. Effective extraction of pages with security protocols

02. Maintains anonymity through user-agent rotation and proxies

03. Demonstrates high coverage and robustness

Abstract

Dark web crawling is a complex process that involves specific methodologies and techniques to navigate the Tor network and extract data from hidden services. This study proposes a general-purpose dark web crawler designed to efficiently extract pages protected by security measures such as captchas. Our approach combines seed URL lists, link analysis, and scanning to discover new content. We also incorporate user-agent rotation and proxy usage to maintain anonymity and avoid detection. We evaluate the crawler using metrics such as coverage, performance, and robustness. Our results demonstrate that the crawler effectively extracts such pages while maintaining anonymity and avoiding detection. The proposed crawler can be used for various applications, including threat intelligence, cybersecurity, and online investigations.
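
The discovery loop outlined in the abstract (seed URLs, link analysis, user-agent rotation) can be pictured with a minimal Python sketch. This is not CRATOR's implementation: the names `crawl`, `LinkExtractor`, `is_onion`, `USER_AGENTS`, and the `.onion` addresses are hypothetical, and captcha handling is left out. The `fetch` callable is injected so the example runs offline; in a real deployment it would presumably send requests through a Tor SOCKS proxy (Tor listens on 127.0.0.1:9050 by default).

```python
from collections import deque
from html.parser import HTMLParser
from itertools import cycle
from urllib.parse import urljoin, urlparse

# Hypothetical user-agent pool; the paper does not list the strings it uses.
USER_AGENTS = cycle([
    "Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
])


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def is_onion(url):
    """True if the URL points at a Tor hidden service."""
    host = urlparse(url).hostname or ""
    return host.endswith(".onion")


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl from the seed list.

    fetch(url, headers) must return the page HTML as a string, or None on failure.
    """
    frontier = deque(seeds)
    seen = set(seeds)
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        # Rotate the User-Agent header on every request.
        headers = {"User-Agent": next(USER_AGENTS)}
        html = fetch(url, headers)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop fragments before deduplicating.
            link = urljoin(url, href).split("#", 1)[0]
            if is_onion(link) and link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages


# Offline demo: an in-memory stand-in for three hidden-service pages.
FAKE_SITE = {
    "http://exampleaaa.onion/": '<a href="/market">m</a><a href="http://examplebbb.onion/">b</a>',
    "http://exampleaaa.onion/market": '<a href="/">home</a>',
    "http://examplebbb.onion/": '<a href="https://clearnet.example.com/">out</a>',
}
used_agents = []


def fake_fetch(url, headers):
    used_agents.append(headers["User-Agent"])
    return FAKE_SITE.get(url)


pages = crawl(["http://exampleaaa.onion/"], fake_fetch)
```

Passing `fetch` as a parameter keeps the frontier logic independent of the transport, so the same loop can be exercised against a stub, as here, or against a proxy-backed HTTP client.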

Taxonomy

Topics: Web Data Mining and Analysis · Advanced Malware Detection Techniques