Parallel netCDF: A Scientific High-Performance I/O Interface
Jianwei Li, Wei-keng Liao, Alok Choudhary, Robert Ross, Rajeev Thakur, William Gropp, Rob Latham

TL;DR
This paper introduces a parallel netCDF interface that enables high-performance, collective I/O operations for scientific datasets, significantly improving data access efficiency while maintaining compatibility with the serial netCDF interface.
Contribution
It presents a new parallel netCDF interface derived with minimal changes from the serial version, optimized for high-performance parallel I/O using MPI-IO.
Findings
Significant performance improvements over serial netCDF.
Compatibility with existing netCDF applications.
Effective use of MPI-IO collective I/O optimizations.
Abstract
Dataset storage, exchange, and access play a critical role in scientific applications. For such purposes netCDF serves as a portable and efficient file format and programming interface, which is popular in numerous scientific application domains. However, the original interface does not provide an efficient mechanism for parallel data storage and access. In this work, we present a new parallel interface for writing and reading netCDF datasets. This interface is derived with minimum changes from the serial netCDF interface but defines semantics for parallel access and is tailored for high performance. The underlying parallel I/O is achieved through MPI-IO, allowing for dramatic performance gains through the use of collective I/O optimizations. We compare the implementation strategies with HDF5 and analyze both. Our tests indicate programming convenience and significant I/O performance…
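To make the idea of parallel access semantics concrete, here is a minimal sketch, not taken from the paper, of how each process might work out which part of a shared 2-D variable it owns before a collective write. It assumes a simple row-wise block decomposition; the function names `block_range` and `partition_rows` are hypothetical, and the ranks are simulated in a loop rather than run under MPI. In a real program, each rank would pass its own `start` and `count` vectors to a collective call such as `ncmpi_put_vara_float_all` from the PnetCDF library.

```python
def block_range(n, nprocs, rank):
    """Return (start, count) of the contiguous block of n items owned by rank."""
    base, extra = divmod(n, nprocs)
    # The first `extra` ranks each take one additional item.
    count = base + (1 if rank < extra else 0)
    start = rank * base + min(rank, extra)
    return start, count


def partition_rows(shape, nprocs):
    """Split a 2-D array by rows; return one (start, count) pair per rank.

    Each pair holds two vectors, one entry per dimension, which is the
    form a subarray ("vara"-style) access expects.
    """
    nrows, ncols = shape
    parts = []
    for rank in range(nprocs):
        row_start, row_count = block_range(nrows, nprocs, rank)
        # Every rank covers all columns but only its own block of rows.
        parts.append(((row_start, 0), (row_count, ncols)))
    return parts


if __name__ == "__main__":
    # Simulate 3 ranks sharing a 10 x 4 variable.
    for rank, (start, count) in enumerate(partition_rows((10, 4), 3)):
        print(f"rank {rank}: start={start} count={count}")
```

Because the blocks are disjoint and together cover the whole variable, all ranks can issue their writes in one collective operation, which is what lets an MPI-IO layer merge the requests into fewer, larger file accesses.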
Taxonomy
Topics: Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
