IOPathTune: Adaptive Online Parameter Tuning for Parallel File System I/O Path
Md. Hasanur Rashid, Youbiao He, Forrest Sheng Bao, Dong Dai

TL;DR
IOPathTune is an adaptive online tuning system for parallel file system (PFS) I/O paths. It improves performance without workload profiling or inter-machine communication, with reported gains of up to 231% over default settings and 89.57% over CAPES.
Contribution
The paper introduces an online, client-side tuning approach for PFS I/O paths that is adaptive and workload-agnostic, and that requires neither profiling nor communication with other machines, targeting diverse workloads in HPC environments.
Findings
Achieves up to 231% performance improvement over default settings in standalone executions.
Delivers 89.57% better performance than CAPES in multi-client scenarios.
Performs on par with or better than the default configuration across 20 Filebench workloads in three scenarios on Lustre.
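To make the percentages above concrete, here is a minimal sketch of how such figures are usually read. It assumes the common definition of relative improvement, (tuned - baseline) / baseline; the summary does not state which definition the paper uses, and the function names are hypothetical.

```python
def improvement_pct(baseline: float, tuned: float) -> float:
    """Relative improvement of `tuned` over `baseline`, in percent.

    Assumed definition: (tuned - baseline) / baseline * 100.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (tuned - baseline) / baseline * 100.0


def speedup_from_pct(pct: float) -> float:
    """Convert a relative improvement in percent to a speedup factor."""
    return 1.0 + pct / 100.0
```

Under this assumed definition, a 231% improvement corresponds to roughly 3.31 times the baseline throughput, and 89.57% to roughly 1.90 times.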
Abstract
Parallel file systems (PFS) contain complicated I/O paths from clients to storage servers. An efficient I/O path requires proper settings of multiple parameters, as the default settings often fail to deliver optimal performance, especially for diverse workloads in the HPC environment. Existing tuning strategies have shortcomings in being adaptive, timely, and flexible. We propose IOPathTune, which adaptively tunes the PFS I/O path online from the client side without characterizing the workloads, doing expensive profiling, or communicating with other machines. We implemented IOPathTune on Lustre and leveraged CloudLab to conduct the evaluations on 20 different Filebench workloads in three different scenarios. We observed either on-par or better performance than the default configuration, with improvements as high as 231% on standalone executions. IOPathTune also delivers 89.57% better overall performance than…
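The idea of tuning online from the client side, using only locally observed performance, can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not describe. It is a generic hill-climbing loop over a single hypothetical integer knob, and `simulated_throughput` is a synthetic stand-in for a real client-side measurement.

```python
import math
from typing import Callable


def tune_online(measure: Callable[[int], float], start: int,
                lo: int, hi: int, rounds: int = 20) -> int:
    """Hill-climb one integer knob using only locally measured throughput.

    Each round re-measures the current setting, so the loop can follow a
    workload that changes over time. No profiling data and no messages
    from other machines are used.
    """
    value = start
    direction = 1  # +1: try doubling the knob, -1: try halving it
    for _ in range(rounds):
        current = measure(value)
        candidate = value * 2 if direction > 0 else value // 2
        candidate = max(lo, min(hi, candidate))
        if candidate == value:
            # Hit a bound; probe the other direction next round.
            direction = -direction
            continue
        if measure(candidate) > current:
            value = candidate
        else:
            direction = -direction
    return value


def simulated_throughput(knob: int) -> float:
    """Synthetic throughput curve (an assumption) that peaks at knob == 64."""
    return -abs(math.log2(knob) - 6.0)


best = tune_online(simulated_throughput, start=8, lo=1, hi=256)
```

In a real deployment `measure` would apply the setting and sample client-side I/O statistics over a short interval. The doubling and halving steps are a design choice for this sketch only.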
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
