Optimizing Noncontiguous Accesses in MPI-IO
Rajeev Thakur, William Gropp, Ewing Lusk

TL;DR
This paper demonstrates how MPI-IO's support for noncontiguous data access significantly improves I/O performance in parallel applications, through specific optimizations like data sieving and collective I/O, validated across multiple systems.
Contribution
It introduces a classification of MPI-IO access patterns, highlights the performance benefits of collective noncontiguous requests, and details portable optimizations in ROMIO for high-performance I/O.
Findings
Performance improves dramatically with level-3 requests
ROMIO's data sieving and collective I/O optimize noncontiguous access
High performance achieved across diverse systems
Abstract
The I/O access patterns of many parallel applications consist of accesses to a large number of small, noncontiguous pieces of data. If an application's I/O needs are met by making many small, distinct I/O requests, however, the I/O performance degrades drastically. To avoid this problem, MPI-IO allows users to access noncontiguous data with a single I/O function call, unlike in Unix I/O. In this paper, we explain how critical this feature of MPI-IO is for high performance and how it enables implementations to perform optimizations. We first provide a classification of the different ways of expressing an application's I/O needs in MPI-IO: we classify them into four levels, called level 0 through level 3. We demonstrate that, for applications with noncontiguous access patterns, the I/O performance improves dramatically if users write their applications to make level-3 requests…
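To make the cost of many small requests concrete, here is a minimal sketch of the idea behind data sieving, written with ordinary file I/O in Python and not with MPI-IO. It is an illustration under stated assumptions, not ROMIO's implementation: the function names `read_small_requests`, `read_with_sieving`, and `demo` are hypothetical, and the sieving version reads the whole extent in one request, whereas a real implementation would typically bound its buffer and work in chunks.

```python
import os
import tempfile


def read_small_requests(path, offsets, length):
    """Fetch each piece with its own seek + read (one request per piece)."""
    pieces = []
    requests = 0
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            pieces.append(f.read(length))
            requests += 1
    return b"".join(pieces), requests


def read_with_sieving(path, offsets, length):
    """Read one contiguous extent covering all pieces, then pick them out in memory.

    Simplification (assumption): the whole extent fits in memory and is read
    in a single request.
    """
    start = min(offsets)
    end = max(offsets) + length
    with open(path, "rb") as f:
        f.seek(start)
        buf = f.read(end - start)
    pieces = [buf[off - start: off - start + length] for off in offsets]
    return b"".join(pieces), 1


def demo():
    """Compare both strategies on a small temporary file."""
    data = bytes(range(256)) * 16  # 4096 bytes of test data
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        path = tmp.name
    try:
        offsets = list(range(0, 4096, 64))  # 64 pieces, 8 bytes each
        small, n_small = read_small_requests(path, offsets, 8)
        sieved, n_sieve = read_with_sieving(path, offsets, 8)
    finally:
        os.remove(path)
    return small == sieved, n_small, n_sieve
```

Both functions return the same bytes, but the first issues one request per piece while the second issues a single larger request and discards the unwanted gaps in memory. The trade-off is extra data transferred in exchange for far fewer requests, which tends to pay off when the pieces are small and closely spaced.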
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
