DAOS for Extreme-scale Systems in Scientific Applications
M. Scot Breitenfeld, Neil Fortner, Jordan Henderson, Jerome Soumagne, Mohamad Chaarawi, Johann Lombardi, Quincey Koziol

TL;DR
This paper discusses the implementation and performance of the HDF5 library over DAOS, a storage solution designed for exascale systems, enabling high-performance, fault-tolerant I/O for scientific applications.
Contribution
The paper introduces the integration of HDF5 with DAOS for exascale systems and evaluates its performance using three representative scientific application codes.
Findings
High bandwidth and IOPS achieved with DAOS-based HDF5
End-to-end data integrity and fault tolerance demonstrated
Performance benchmarks show suitability for exascale scientific workloads
Abstract
Exascale I/O initiatives will require new and fully integrated I/O models that are capable of providing straightforward functionality, fault tolerance and efficiency. One solution is the Distributed Asynchronous Object Storage (DAOS) technology, which is primarily designed to handle the next-generation NVRAM and NVMe technologies envisioned for providing a high bandwidth/IOPS storage tier close to the compute nodes in an HPC system. In conjunction with DAOS, the HDF5 library, an I/O library for scientific applications, will support end-to-end data integrity, fault tolerance, object mapping, index building and querying. This paper details the implementation and performance of the HDF5 library built over DAOS by using three representative scientific application codes.
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
