A Moveable Beast: Partitioning Data and Compute for Computational Storage
Aldrin Montana, Yuanqing Xue, Jeff LeFevre, Carlos Maltzahn, Josh Stuart, Philip Kufeldt, and Peter Alvaro

TL;DR
This paper introduces Skytether, a prototype computational storage system that dynamically partitions data and computation to improve performance and utilization in data-intensive applications, especially for scientific data management.
Contribution
It presents a novel approach to computational storage with decomposable queries and dynamic partitioning, addressing limitations of static, design-time solutions.
Findings
Observed a 15x slowdown on CSDs compared to CPU, informing cost models.
Evaluated partition strategies and function execution overhead.
Demonstrated performance of selection and projection operations.
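As a rough illustration of how the observed 15x slowdown might feed a cost model, the sketch below compares running a selection on the host against pushing it down to a CSD. It is a minimal sketch under stated assumptions, not the paper's cost model: the `Workload` fields, the two time formulas, and the function names are all hypothetical, and only the 15x factor comes from the findings above.

```python
from dataclasses import dataclass

# From the findings above: CSDs were observed to be about 15x slower than a CPU.
CSD_SLOWDOWN = 15.0


@dataclass
class Workload:
    """Hypothetical workload description; none of these fields come from the paper."""
    input_bytes: float       # bytes scanned on the drive
    selectivity: float       # fraction of input bytes that survive selection/projection
    cpu_bytes_per_s: float   # assumed host CPU processing rate
    link_bytes_per_s: float  # assumed drive-to-host transfer rate


def host_time(w: Workload) -> float:
    """Ship all input bytes to the host, then process them on the CPU."""
    return w.input_bytes / w.link_bytes_per_s + w.input_bytes / w.cpu_bytes_per_s


def pushdown_time(w: Workload) -> float:
    """Process on the CSD at 1/15 of the CPU rate, then ship only the surviving bytes."""
    csd_bytes_per_s = w.cpu_bytes_per_s / CSD_SLOWDOWN
    return (w.input_bytes / csd_bytes_per_s
            + w.selectivity * w.input_bytes / w.link_bytes_per_s)


def choose_placement(w: Workload) -> str:
    """Return "csd" when pushdown is estimated to be faster, otherwise "host"."""
    return "csd" if pushdown_time(w) < host_time(w) else "host"


if __name__ == "__main__":
    slow_link = Workload(1e9, 0.01, 15e9, 0.5e9)
    fast_link = Workload(1e9, 0.01, 15e9, 10e9)
    print(choose_placement(slow_link), choose_placement(fast_link))
```

Under these assumed numbers, pushdown only pays off when the link is slow enough that moving less data outweighs the slower on-drive compute. That trade-off is the kind of decision a dynamic partitioner would need to make at run time.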
Abstract
Over the years, hardware trends have introduced various heterogeneous compute units while also bringing network and storage bandwidths within an order of magnitude of memory subsystems. In response, developers have used increasingly exotic solutions to extract more performance from hardware; typically relying on static, design-time partitioning of their programs which cannot keep pace with storage systems that are layering compute units throughout deepening hierarchies of storage devices. We argue that dynamic, just-in-time partitioning of computation offers a solution for emerging data-intensive systems to overcome ever-growing data sizes in the face of stalled CPU performance and memory bandwidth. In this paper, we describe our prototype computational storage system (CSS), Skytether, that adopts a database perspective to utilize computational storage drives (CSDs). We also present…
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques
