CannyFS: Opportunistically Maximizing I/O Throughput Exploiting the Transactional Nature of Batch-Mode Data Processing
Jessica Nettelblad, Carl Nettelblad

TL;DR
CannyFS is a user mode file system that improves I/O throughput in HPC environments by treating a job's I/O operations as a transaction: it assumes every operation will succeed, which hides latency and typically cuts time use on the model tasks by over 80%.
Contribution
The paper introduces CannyFS, a novel user mode file system that exploits transactional assumptions to maximize I/O throughput in high-performance computing.
Findings
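Typical reductions in time use of over 80% in a real-life HPC environment.
Benefits demonstrated for the model tasks of extracting archives and removing directory trees.
Hides latency by assuming all I/O operations will succeed; errors are reported afterward so that cleanup and a repeated attempt can take place.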
Abstract
We introduce a user mode file system, CannyFS, that hides latency by assuming all I/O operations will succeed. The user mode process will in turn report errors, allowing proper cleanup and a repeated attempt to take place. We demonstrate benefits for the model tasks of extracting archives and removing directory trees in a real-life HPC environment, giving typical reductions in time use of over 80%. This approach can be considered a view of HPC jobs and their I/O activity as transactions. In general, file systems lack clearly defined transaction semantics. Over time, the competing trends to add cache and maintain data integrity have resulted in different practical tradeoffs. High-performance computing is a special case where overall throughput demands are high. Latency can also be high, with non-local storage. In addition, a theoretically possible I/O error (like permission denied,…
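The optimistic model described in the abstract can be illustrated with a minimal Python sketch. This is not CannyFS code and not its API: the class `OptimisticWriter`, its methods `write_file` and `commit`, and the helper `run_demo` are hypothetical names invented for this illustration. The sketch only shows the general idea of reporting success immediately, performing the I/O in the background, and surfacing errors once at the end so the caller can clean up and retry. It ignores ordering dependencies between operations on the same path, which a real file system would have to respect.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


class OptimisticWriter:
    """Hypothetical illustration: return at once, do the I/O later."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = []  # (description, future) pairs, in submission order

    def write_file(self, path, data):
        # Return immediately; the caller proceeds as if the write succeeded.
        future = self._pool.submit(self._do_write, path, data)
        self._pending.append((f"write {path}", future))

    @staticmethod
    def _do_write(path, data):
        with open(path, "wb") as handle:
            handle.write(data)

    def commit(self):
        """Wait for all deferred operations and return the failures."""
        errors = []
        for description, future in self._pending:
            exc = future.exception()  # blocks until this operation finishes
            if exc is not None:
                errors.append((description, exc))
        self._pending.clear()
        self._pool.shutdown(wait=True)  # one "transaction" per writer
        return errors


def run_demo():
    """Issue one write that succeeds and one that fails, then commit."""
    with tempfile.TemporaryDirectory() as root:
        fs = OptimisticWriter()
        good = os.path.join(root, "ok.txt")
        # The parent directory does not exist, so this write fails later.
        bad = os.path.join(root, "missing_dir", "fail.txt")
        fs.write_file(good, b"hello")
        fs.write_file(bad, b"never written")
        errors = fs.commit()
        with open(good, "rb") as handle:
            content = handle.read()
        return content, [description for description, _ in errors]
```

In this sketch neither `write_file` call blocks on the storage backend; the failure of the second write only becomes visible in the list returned by `commit`, at which point a batch job could discard its partial output and rerun.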
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed systems and fault tolerance · Parallel Computing and Optimization Techniques
