Towards Performance-Aware Allocation for Accelerated Machine Learning on GPU-SSD Systems
Ayush Gundawar, Euijun Chung, Hyesoon Kim

TL;DR
This paper presents MQMS, an in-storage GPU architecture and simulator that uses knowledge of internal SSD states and operations to schedule I/O and allocate addresses, improving performance for data-intensive machine learning workloads whose datasets exceed GPU DRAM capacity.
Contribution
MQMS introduces a performance-aware, in-storage GPU architecture that combines dynamic address allocation, which maximizes the SSD's internal parallelism, with fine-grained address mapping, which serves small I/O requests without read-modify-write overhead.
Findings
Orders-of-magnitude improvements in I/O throughput.
Significant reductions in device response time.
Faster simulation end times for large workloads.
Abstract
The exponential growth of data-intensive machine learning workloads has exposed significant limitations in conventional GPU-accelerated systems, especially when processing datasets exceeding GPU DRAM capacity. We propose MQMS, an augmented in-storage GPU architecture and simulator that is aware of internal SSD states and operations, enabling intelligent scheduling and address allocation to overcome performance bottlenecks caused by CPU-mediated data access patterns. MQMS introduces dynamic address allocation to maximize internal parallelism and fine-grained address mapping to efficiently handle small I/O requests without incurring read-modify-write overheads. Through extensive evaluations on workloads ranging from large language model inference to classical machine learning algorithms, MQMS demonstrates orders-of-magnitude improvements in I/O request throughput, device response time,…
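The two mechanisms named in the abstract can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the MQMS implementation: the page and sector sizes, the "least-loaded channel" policy, and every function name below are hypothetical choices made for illustration. It contrasts static allocation (channel fixed by logical page number) with a dynamic policy, and coarse with fine-grained mapping for a small write.

```python
# Toy model only: sizes, policies, and names are illustrative assumptions,
# not taken from the MQMS paper or simulator.

PAGE_SIZE = 16 * 1024    # assumed coarse mapping unit (one flash page), in bytes
SECTOR_SIZE = 4 * 1024   # assumed fine-grained mapping unit, in bytes


def static_channel(lpn: int, num_channels: int) -> int:
    """Static allocation: the channel is fixed by the logical page number."""
    return lpn % num_channels


def dynamic_channel(queue_depths: list[int]) -> int:
    """Dynamic allocation (assumed policy): pick the least-loaded channel."""
    return min(range(len(queue_depths)), key=lambda ch: queue_depths[ch])


def max_queue_depth(lpns: list[int], num_channels: int, dynamic: bool) -> int:
    """Place each write on a channel and return the deepest resulting queue."""
    depths = [0] * num_channels
    for lpn in lpns:
        if dynamic:
            ch = dynamic_channel(depths)
        else:
            ch = static_channel(lpn, num_channels)
        depths[ch] += 1
    return max(depths)


def flash_bytes_written(request_bytes: int, mapping_unit: int) -> int:
    """Bytes programmed for one aligned write, rounded up to the mapping unit.

    With a coarse unit, a small write forces the rest of the unit to be
    read, merged, and rewritten (read-modify-write).
    """
    units = -(-request_bytes // mapping_unit)  # ceiling division
    return units * mapping_unit


if __name__ == "__main__":
    # Strided logical page numbers that all collide on channel 0 when static.
    strided = list(range(0, 32, 4))
    print("static  max depth:", max_queue_depth(strided, 4, dynamic=False))
    print("dynamic max depth:", max_queue_depth(strided, 4, dynamic=True))
    print("4 KiB write, coarse:", flash_bytes_written(4096, PAGE_SIZE))
    print("4 KiB write, fine:  ", flash_bytes_written(4096, SECTOR_SIZE))
```

In this model, a strided access pattern piles all eight writes onto one channel under static allocation but spreads them two per channel under the dynamic policy, and a 4 KiB write programs 16 KiB with the coarse unit versus 4 KiB with the fine-grained one. Real devices add further levels (chips, dies, planes) and timing effects that this sketch ignores.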
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
