Pebbles, Graphs, and a Pinch of Combinatorics: Towards Tight I/O Lower Bounds for Statically Analyzable Programs
Grzegorz Kwasniewski, Tal Ben-Nun, Lukas Gianinazzi, Alexandru Calotoiu, Timo Schneider, Alexandros Nikolaos Ziogas, Maciej Besta, Torsten Hoefler

TL;DR
This paper introduces a novel method for deriving precise I/O lower bounds for a broad class of programs, enabling better understanding of communication costs in scientific and machine learning applications.
Contribution
The paper presents a new approach using combinatorial methods and the red-blue pebble game to obtain tight I/O bounds for Simple Overlap Access Programs (SOAP), covering diverse algorithms.
Findings
Derived tight I/O bounds for linear algebra kernels like Cholesky decomposition.
Improved existing bounds for stencil applications by up to a factor of 14.
Analyzed 38 applications, including deep learning and physics simulations, with an open-source tool.
Abstract
Determining I/O lower bounds is a crucial step in obtaining communication-efficient parallel algorithms, both across the memory hierarchy and between processors. Current approaches either study specific algorithms individually, disallow programmatic motifs such as recomputation, or produce asymptotic bounds that exclude important constants. We propose a novel approach for obtaining precise I/O lower bounds on a general class of programs, which we call Simple Overlap Access Programs (SOAP). SOAP analysis covers a wide variety of algorithms, from ubiquitous computational kernels to full scientific computing applications. Using the red-blue pebble game and combinatorial methods, we are able to bound the I/O of the SOAP-induced Computational Directed Acyclic Graph (CDAG), taking into account multiple statements, input/output reuse, and optimal tiling. To deal with programs that are outside…
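As a rough illustration of the red-blue pebble game mentioned in the abstract, the sketch below replays a pebbling schedule on a tiny CDAG and counts its I/O operations. It is not the paper's tool or method: the function name `count_io`, the move names, and the four-vertex example graph are all hypothetical, and the rules follow the standard formulation of the game (red pebbles model fast memory of limited size, blue pebbles model slow memory, and only loads and stores cost I/O).

```python
def count_io(preds, inputs, outputs, fast_size, moves):
    """Replay a red-blue pebbling schedule and return its I/O cost.

    preds:     dict mapping each computed vertex to its predecessors
    inputs:    vertices that start with a blue pebble (in slow memory)
    outputs:   vertices that must hold a blue pebble at the end
    fast_size: maximum number of red pebbles (fast memory capacity)
    moves:     list of (operation, vertex) pairs
    """
    red, blue = set(), set(inputs)
    io = 0
    for op, v in moves:
        if op == "load":        # slow -> fast memory, costs one I/O
            if v not in blue:
                raise ValueError(f"cannot load {v}: no blue pebble")
            red.add(v)
            io += 1
        elif op == "store":     # fast -> slow memory, costs one I/O
            if v not in red:
                raise ValueError(f"cannot store {v}: no red pebble")
            blue.add(v)
            io += 1
        elif op == "compute":   # free, but all predecessors must be red
            if v not in preds or not all(p in red for p in preds[v]):
                raise ValueError(f"cannot compute {v}")
            red.add(v)
        elif op == "discard":   # free, releases a red pebble
            red.discard(v)
        else:
            raise ValueError(f"unknown operation {op}")
        if len(red) > fast_size:
            raise ValueError("fast memory capacity exceeded")
    if not set(outputs) <= blue:
        raise ValueError("not every output was stored")
    return io


# Hypothetical CDAG: c = f(a, b), d = g(c, a); a and b are inputs, d is the output.
preds = {"c": ("a", "b"), "d": ("c", "a")}
schedule = [
    ("load", "a"), ("load", "b"), ("compute", "c"),
    ("discard", "b"), ("compute", "d"), ("store", "d"),
]
print(count_io(preds, {"a", "b"}, {"d"}, fast_size=3, moves=schedule))  # prints 3
```

An I/O lower bound in this model is a statement about every valid schedule, not just one: no sequence of moves for the given CDAG and fast memory size can cost less than the bound. The checker above only evaluates a single schedule, so it can confirm an upper bound for a small graph but cannot by itself prove a lower bound.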
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Advanced Neural Network Applications
