DIAL: Decentralized I/O AutoTuning via Learned Client-side Local Metrics for Parallel File System
Md Hasanur Rashid, Xinyi Li, Youbiao He, Forrest Sheng Bao, Dong Dai

TL;DR
DIAL introduces a decentralized, machine learning-based approach for client-side I/O autotuning in parallel file systems, reducing overhead and improving global I/O performance by relying solely on local metrics.
Contribution
DIAL's novel decentralized approach enables effective I/O autotuning using only local metrics, avoiding heavy global metric overheads and enhancing performance.
Findings
Achieves better I/O performance through decentralized tuning.
Reduces overhead by relying on local metrics.
Enables timely, collective decision-making among clients.
Abstract
Enabling efficient, high-performance data access in parallel file systems (PFS) is critical for today's high-performance computing systems. PFS client-side I/O heavily impacts the final I/O performance delivered to individual applications and to the entire system. Autotuning the key client-side I/O behaviors has been extensively studied and shows promising results. However, existing work has relied heavily on monitoring an extensive number of global runtime metrics and on accurately modeling applications' I/O patterns. Such heavy overheads significantly limit the ability to enable fine-grained, dynamic tuning in practical systems. In this study, we propose DIAL (Decentralized I/O AutoTuning via Learned Client-side Local Metrics), which takes a drastically different approach. Instead of trying to extract the global I/O patterns of applications, DIAL takes a decentralized approach, treating each…
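To make the idea of tuning from local metrics alone more concrete, here is a minimal sketch of one client adjusting a single I/O knob using only the throughput it observes itself. Everything in it is an assumption for illustration: the `LocalTuner` class, the candidate knob values, and the `fake_throughput` function are hypothetical, and the simple hill-climbing rule is a stand-in for DIAL's learned model, which this summary does not describe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocalTuner:
    """Hypothetical per-client tuner that sees only this client's own measurements."""
    candidates: List[int]          # hypothetical knob values, e.g. a request size in KiB
    index: int = 0                 # position of the currently applied value
    direction: int = 1             # +1 moves to larger values, -1 to smaller ones
    last_throughput: float = 0.0   # throughput observed in the previous interval

    @property
    def current(self) -> int:
        return self.candidates[self.index]

    def step(self, throughput: float) -> int:
        """Pick the next knob value from one locally observed throughput sample."""
        # Hill climbing (a stand-in for a learned model): reverse direction
        # if the previous move made local throughput worse.
        if throughput < self.last_throughput:
            self.direction = -self.direction
        self.last_throughput = throughput
        nxt = self.index + self.direction
        # Bounce off the ends of the candidate list instead of leaving it.
        if not 0 <= nxt < len(self.candidates):
            self.direction = -self.direction
            nxt = self.index + self.direction
        self.index = max(0, min(len(self.candidates) - 1, nxt))
        return self.current


def fake_throughput(value: int) -> float:
    """Made-up local measurement that peaks when the knob is set to 1024."""
    return 5000.0 - abs(value - 1024)


def simulate(steps: int) -> List[int]:
    """Run one client for `steps` intervals and record the knob values it chose."""
    tuner = LocalTuner(candidates=[256, 512, 1024, 2048, 4096])
    return [tuner.step(fake_throughput(tuner.current)) for _ in range(steps)]


if __name__ == "__main__":
    print(simulate(8))
```

In this toy run the client climbs toward the value with the best local throughput and then keeps probing its immediate neighbors. No global state is consulted at any point, so every client could run its own copy of the loop independently.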
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
