Introducing A Dark Web Archival Framework
Justin F. Brunelle, Ryan Farley, Grant Atkins, Trevor Bostic, Marites Hendrix, Zak Zebrowski

TL;DR
This paper introduces a framework for large-scale dark web archiving by adapting surface web tools and techniques, demonstrating its feasibility through a prototype that captures, stores, and replays dark web content.
Contribution
The paper presents a framework that adapts existing surface web archiving tools to the dark web, filling a gap in institutional archiving efforts.
Findings
The framework successfully captures dark web content using adapted tools.
Dark web content can be stored and replayed with existing archiving technologies.
The prototype demonstrates the practical viability of the proposed approach.
Abstract
We present a framework for web-scale archiving of the dark web. While commonly associated with illicit and illegal activity, the dark web provides a way to privately access web information. This is a valuable and socially beneficial tool to global citizens, such as those wishing to access information while under oppressive political regimes that work to limit information availability. However, little institutional archiving is performed on the dark web (limited to the Archive.is dark web presence, a page-at-a-time archiver). We use surface web tools, techniques, and procedures (TTPs) and adapt them for archiving the dark web. We demonstrate the viability of our framework in a proof-of-concept and narrowly scoped prototype, implemented with the following lightly adapted open source tools: the Brozzler crawler for capture, WARC file for storage, and pywb for replay. Using these tools, we…
Taxonomy
Topics: Advanced Malware Detection Techniques · Web Data Mining and Analysis · Digital and Cyber Forensics
