Threadle: A Memory-Efficient Network Storage and Query Engine for Large, Multilayer, and Mixed-mode Networks

Carl Nordlund; Yukun Jiao

arXiv:2603.04446 · cs.NI · March 6, 2026

Open Access

TL;DR

Threadle is a memory-efficient, high-performance network storage and query engine for very large, multilayer, mixed-mode networks with billions of edges. Its pseudo-projection approach lets two-mode layers be queried as if they were one-mode networks, without materializing the memory-prohibitive projection.

Contribution

The paper introduces Threadle, an engine that manages two-mode networks at scale without materializing their one-mode projections. It also supports multilayer, mixed-mode networks and extensive querying capabilities.

Findings

01 Stores a 20-million-node network whose layers are equivalent to 8 trillion projected edges in approximately 20 GB of RAM

02 Achieves a compression ratio exceeding 2000:1 compared to a materialized projection

03 Provides native support for multilayer, mixed-mode networks and extensive command-line tools
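
As a rough consistency check on the first two findings, the quoted figures can be combined in a few lines of Python. This is back-of-envelope arithmetic, not a calculation taken from the paper. It assumes decimal gigabytes (10^9 bytes) and treats "~20 GB" and "over 2000:1" as exact, so the results are only approximate lower bounds.

```python
# Back-of-envelope check using only the figures quoted in the findings.
GB = 10**9  # assumption: decimal gigabytes

stored_bytes = 20 * GB           # "~20 GB RAM"
compression_ratio = 2000         # "over 2000:1", so a lower bound
projected_edges = 8 * 10**12     # "8 trillion projected edges"

# Size a materialized projection would need for the ratio to hold.
implied_materialized_bytes = stored_bytes * compression_ratio

# Storage per projected edge in each representation.
implied_bytes_per_edge = implied_materialized_bytes / projected_edges
stored_bytes_per_projected_edge = stored_bytes / projected_edges

print(f"implied materialized size: {implied_materialized_bytes / 10**12:.0f} TB")
print(f"implied bytes per materialized edge: {implied_bytes_per_edge:.1f}")
print(f"stored bytes per projected edge: {stored_bytes_per_projected_edge:.4f}")
```

Under these assumptions, the two findings imply a materialized projection on the order of 40 TB, or roughly 5 bytes per projected edge.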

Abstract

We present Threadle, an open-source, high-performance, and memory-efficient network storage and query engine written in C#. Designed for working with full-population networks derived from administrative register data, which represent very large, multilayer, mixed-mode networks with millions of nodes and billions of edges, Threadle addresses a fundamental limitation of existing network libraries: the inability to efficiently handle two-mode (bipartite) data at scale. Threadle's core innovation is a pseudo-projection approach that allows two-mode layers to be queried as if they were projected into one-mode form, without ever materializing the memory-prohibitive projection. We demonstrate that a network with 20 million nodes containing layers equivalent to 8 trillion projected edges can be stored in approximately 20 GB of RAM -- a compression ratio exceeding 2000:1 compared to materialized…
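
The pseudo-projection idea described in the abstract can be illustrated with a minimal Python sketch. This is not Threadle's implementation: Threadle is written in C#, and its internal data structures are not described on this page. The class `TwoModeLayer` and all of its method names are hypothetical. The sketch assumes a two-mode layer is stored as two membership indexes (node to affiliations, affiliation to members) and answers one-mode queries by traversing those indexes on demand.

```python
from collections import defaultdict


class TwoModeLayer:
    """Hypothetical two-mode (bipartite) layer: nodes belong to affiliations."""

    def __init__(self):
        self._affiliations_of = defaultdict(set)  # node -> set of affiliations
        self._members_of = defaultdict(set)       # affiliation -> set of nodes

    def add_membership(self, node, affiliation):
        self._affiliations_of[node].add(affiliation)
        self._members_of[affiliation].add(node)

    def has_edge(self, a, b):
        """True if a and b would be tied in the one-mode projection,
        i.e. they share at least one affiliation."""
        if a == b:
            return False
        mine = self._affiliations_of.get(a, set())
        theirs = self._affiliations_of.get(b, set())
        return not mine.isdisjoint(theirs)

    def neighbors(self, node):
        """Projected neighbors of a node, computed on demand."""
        result = set()
        for affiliation in self._affiliations_of.get(node, ()):
            result |= self._members_of[affiliation]
        result.discard(node)
        return result

    def stored_memberships(self):
        """Number of node-affiliation pairs actually held in memory."""
        return sum(len(members) for members in self._members_of.values())

    def projected_edge_count(self):
        """Undirected edges a materialized projection would contain.
        Computed here only for comparison; nothing is stored."""
        return sum(len(self.neighbors(n)) for n in self._affiliations_of) // 2


# Toy data: 1000 people share one workplace; person 0 also shares
# a second workplace with person 1000.
layer = TwoModeLayer()
for person in range(1000):
    layer.add_membership(person, "workplace_A")
layer.add_membership(0, "workplace_B")
layer.add_membership(1000, "workplace_B")

print(layer.stored_memberships(), "memberships stored")
print(layer.projected_edge_count(), "edges if the projection were materialized")
```

In this toy layer, a single affiliation with n members costs n stored memberships but would expand to n(n-1)/2 projected edges. That quadratic growth is why materializing projections of large affiliations becomes memory-prohibitive, while querying the memberships directly does not.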

Taxonomy

Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Software-Defined Networks and 5G