Threadle: A Memory-Efficient Network Storage and Query Engine for Large, Multilayer, and Mixed-mode Networks
Carl Nordlund, Yukun Jiao

TL;DR
Threadle is a memory-efficient, high-performance network storage and query engine for very large, multilayer, mixed-mode networks with billions of edges. Its pseudo-projection approach lets two-mode layers be queried as if they were projected into one-mode form, without materializing the memory-prohibitive projection.
Contribution
The paper introduces Threadle, a novel engine that efficiently manages two-mode networks at scale without materializing projections, supporting multilayer mixed-mode networks and extensive querying capabilities.
Findings
Stores a network of 20 million nodes, with layers equivalent to 8 trillion projected edges, in approximately 20 GB of RAM
Achieves a compression ratio exceeding 2000:1 compared with a materialized projection
Provides native support for multilayer, mixed-mode networks and extensive command-line tools
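As a rough sanity check on the figures above, the short calculation below works out what they imply about a materialized projection. It takes the rounded numbers at face value and assumes decimal gigabytes (1 GB = 10^9 bytes); the variable names are illustrative, and the per-edge cost is an implication of those figures rather than a number reported in the paper.

```python
# Back-of-envelope check using the rounded figures from the findings.
# Assumption: 1 GB = 10**9 bytes (decimal units).
ram_bytes = 20 * 10**9            # ~20 GB used by Threadle
compression_ratio = 2000          # "exceeding 2000:1", so treated as a lower bound
projected_edges = 8 * 10**12      # 8 trillion projected edges

# Size a materialized projection would need for the ratio to hold.
implied_materialized_bytes = ram_bytes * compression_ratio

# Storage cost per projected edge that this implies.
implied_bytes_per_edge = implied_materialized_bytes / projected_edges

print(f"Implied materialized size: {implied_materialized_bytes / 10**12:.0f} TB")
print(f"Implied cost per projected edge: {implied_bytes_per_edge:.1f} bytes")
```

Under these assumptions the materialized projection would occupy at least about 40 TB, or roughly 5 bytes per projected edge.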
Abstract
We present Threadle, an open-source, high-performance, and memory-efficient network storage and query engine written in C#. Designed for working with full-population networks derived from administrative register data, which represent very large, multilayer, mixed-mode networks with millions of nodes and billions of edges, Threadle addresses a fundamental limitation of existing network libraries: the inability to efficiently handle two-mode (bipartite) data at scale. Threadle's core innovation is a pseudo-projection approach that allows two-mode layers to be queried as if they were projected into one-mode form, without ever materializing the memory-prohibitive projection. We demonstrate that a network with 20 million nodes containing layers equivalent to 8 trillion projected edges can be stored in approximately 20 GB of RAM -- a compression ratio exceeding 2000:1 compared to materialized…
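One way to picture the pseudo-projection idea is the minimal Python sketch below. It is not Threadle's implementation: Threadle is written in C#, and this summary does not describe its internal data structures. The class `TwoModeLayer` and its method names are hypothetical. The sketch assumes a two-mode layer is stored only as node-to-group and group-to-node memberships, and that one-mode queries are answered at query time from those memberships.

```python
from collections import defaultdict


class TwoModeLayer:
    """Hypothetical two-mode layer (e.g. persons affiliated with workplaces).

    Only memberships are stored; the one-mode projection is never built.
    """

    def __init__(self):
        self.members = defaultdict(set)       # group id -> node ids
        self.affiliations = defaultdict(set)  # node id -> group ids

    def add_affiliation(self, node, group):
        self.members[group].add(node)
        self.affiliations[node].add(group)

    def has_edge(self, a, b):
        """True if a and b would be tied in the projection (share a group)."""
        if a == b:
            return False
        groups_a = self.affiliations.get(a, frozenset())
        groups_b = self.affiliations.get(b, frozenset())
        return not groups_a.isdisjoint(groups_b)

    def neighbors(self, node):
        """Projected neighbors of node: everyone sharing at least one group."""
        result = set()
        for group in self.affiliations.get(node, ()):
            result |= self.members[group]
        result.discard(node)
        return result

    def stored_memberships(self):
        """Number of (node, group) pairs actually held in memory."""
        return sum(len(nodes) for nodes in self.members.values())


layer = TwoModeLayer()
for node in (1, 2, 3):
    layer.add_affiliation(node, "workplace_1")
for node in (3, 4):
    layer.add_affiliation(node, "school_1")

print(layer.has_edge(1, 3))        # True: both belong to workplace_1
print(layer.has_edge(1, 4))        # False: no shared group
print(sorted(layer.neighbors(3)))  # [1, 2, 4]
print(layer.stored_memberships())  # 5
```

In this toy layer, five stored memberships answer the same questions as the four projected edges would. The gap widens quickly as groups grow, because a group with n members implies n(n-1)/2 projected pairs while storing only n memberships.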
Taxonomy
Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Software-Defined Networks and 5G
