Dynamic Adaptation in Data Storage: Real-Time Machine Learning for Enhanced Prefetching
Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao

TL;DR
This paper presents a real-time (streaming) machine learning framework for data prefetching in multi-tiered storage systems, reporting better prediction accuracy and system adaptability than traditional batch-trained methods.
Contribution
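The paper introduces a streaming machine learning approach to dynamic data prefetching: streaming classification models predict file access patterns (specifically the next file offset), making storage management more efficient and more responsive to workload changes.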
Findings
Improved prediction accuracy for file access patterns
Enhanced memory efficiency in storage management
Demonstrated real-time adaptability in evaluation over extensive production traces
Abstract
The exponential growth of data storage demands has necessitated the evolution of hierarchical storage management strategies [1]. This study explores the application of streaming machine learning [3] to revolutionize data prefetching within multi-tiered storage systems. Unlike traditional batch-trained models, streaming machine learning [5] offers adaptability, real-time insights, and computational efficiency, responding dynamically to workload variations. This work designs and validates an innovative framework that integrates streaming classification models for predicting file access patterns, specifically the next file offset. Leveraging comprehensive feature engineering and real-time evaluation over extensive production traces, the proposed methodology achieves substantial improvements in prediction accuracy, memory efficiency, and system adaptability. The results underscore the…
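To make the idea of streaming prediction of the next file offset concrete, here is a minimal sketch. It is not the authors' framework: the class `StreamingOffsetPredictor`, the function `prequential_accuracy`, and the choice of a simple transition-count model over offset deltas are all illustrative assumptions. The sketch only shows the general test-then-train (prequential) pattern, in which each access is predicted before the model learns from it, so the model adapts as the workload changes.

```python
from collections import Counter, defaultdict


class StreamingOffsetPredictor:
    """Hypothetical online classifier (not the paper's model): predicts the
    next offset delta from the previous delta using running counts."""

    def __init__(self):
        # Transition counts: previous delta -> Counter of observed next deltas.
        self.counts = defaultdict(Counter)

    def predict(self, prev_delta):
        seen = self.counts.get(prev_delta)
        if not seen:
            # Nothing learned yet for this delta: assume the stride repeats.
            return prev_delta
        return seen.most_common(1)[0][0]

    def learn(self, prev_delta, next_delta):
        # Incremental update: one observation at a time, no batch retraining.
        self.counts[prev_delta][next_delta] += 1


def prequential_accuracy(offsets):
    """Test-then-train evaluation: predict each access, then learn from it."""
    model = StreamingOffsetPredictor()
    correct = total = 0
    for i in range(2, len(offsets)):
        prev_delta = offsets[i - 1] - offsets[i - 2]
        true_delta = offsets[i] - offsets[i - 1]
        predicted_offset = offsets[i - 1] + model.predict(prev_delta)
        correct += predicted_offset == offsets[i]
        total += 1
        model.learn(prev_delta, true_delta)
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Synthetic sequential trace with a fixed 4096-byte stride (made-up data).
    print(prequential_accuracy([0, 4096, 8192, 12288, 16384]))
```

A real system would presumably use richer engineered features and a stronger streaming classifier; the point here is only that prediction and learning are interleaved per access rather than separated into offline training and deployment.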
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
