Icicle: Scalable Metadata Indexing and Real-Time Monitoring for HPC File Systems
Haochen Pan, Ryan Chard, Song Young Oh, Maxime Gonthier, Valérie Hayot-Sasson, Geoffrey Lentner, Joe Bottigliero, Rachana Ananthakrishnan, Kyle Chard, Ian Foster

TL;DR
Icicle is a scalable, real-time metadata indexing and monitoring framework for HPC file systems. Built on Apache Kafka and Apache Flink, it enables efficient queries and analytics over billions of files.
Contribution
Icicle introduces a continuous, fault-tolerant indexing system that synchronizes in real time with production HPC file systems such as Lustre and IBM Storage Scale, and it reports order-of-magnitude throughput improvements over existing approaches.
Findings
Order-of-magnitude throughput improvements over existing approaches
Supports both bulk and real-time metadata ingestion
Enables efficient queries and analytics on billions of files
Abstract
Modern HPC file systems can contain billions of files and hundreds of petabytes of data, making even simple questions increasingly intractable to answer. Traditional file system utilities such as find and du fail to scale to these sizes. While external indexing tools like GUFI and Brindexer improve query performance, they remain batch-oriented and unsuitable for heterogeneous, rapidly evolving environments. We present Icicle, a scalable framework for continuous file system metadata indexing and monitoring. Icicle maintains a unified, up-to-date, and queryable view of file system state while supporting both periodic snapshot-based ingestion for bulk metadata updates and event-based ingestion for real-time synchronization from production systems such as Lustre and IBM Storage Scale. Built on Apache Kafka and Apache Flink, Icicle provides high-throughput, fault-tolerant, and horizontally…
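To make the two ingestion modes concrete, the following is a minimal sketch of how periodic snapshots and individual change events could feed one queryable index. It is a plain in-memory Python stand-in, not Icicle's implementation: it does not use Kafka or Flink, and every name in it (FileMeta, MetadataIndex, apply_snapshot, apply_event, du) is hypothetical. The conflict rule (keep whichever record has the newer mtime) is also an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical metadata record for one file."""
    path: str
    size: int     # bytes
    mtime: float  # modification time, seconds since epoch


class MetadataIndex:
    """Toy in-memory index keyed by path (illustrative only)."""

    def __init__(self):
        self._by_path = {}

    def _upsert(self, meta):
        # Assumed conflict rule: never let an older record overwrite a newer one.
        current = self._by_path.get(meta.path)
        if current is None or meta.mtime >= current.mtime:
            self._by_path[meta.path] = meta

    def apply_snapshot(self, records):
        """Bulk ingestion: reconcile the index against a full scan."""
        records = list(records)
        seen = {r.path for r in records}
        # Paths missing from the scan are treated as deleted.
        for path in list(self._by_path):
            if path not in seen:
                del self._by_path[path]
        for r in records:
            self._upsert(r)

    def apply_event(self, kind, meta):
        """Event ingestion: apply one create/modify/delete change."""
        if kind == "delete":
            self._by_path.pop(meta.path, None)
        elif kind in ("create", "modify"):
            self._upsert(meta)
        else:
            raise ValueError(f"unknown event kind: {kind}")

    def du(self, prefix):
        """Total bytes under a directory prefix, answered from the index."""
        root = prefix.rstrip("/") + "/"
        return sum(m.size for p, m in self._by_path.items() if p.startswith(root))


idx = MetadataIndex()
idx.apply_snapshot([
    FileMeta("/proj/a.dat", 100, 1.0),
    FileMeta("/proj/b.dat", 50, 1.0),
])
idx.apply_event("modify", FileMeta("/proj/a.dat", 300, 2.0))
idx.apply_event("delete", FileMeta("/proj/b.dat", 0, 3.0))
print(idx.du("/proj"))
```

The point of the sketch is that a du-style question is answered from the index rather than by walking the file system, and that both ingestion paths converge on the same state. A real deployment would also need ordering, durability, and horizontal scaling, which the abstract attributes to Kafka and Flink.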
