DDS: DPU-optimized Disaggregated Storage [Extended Report]
Qizhen Zhang, Philip Bernstein, Badrish Chandramouli, Jiasheng Hu, Yiming Zheng

TL;DR
DDS is a novel disaggregated storage architecture leveraging DPUs to significantly improve throughput and latency while reducing CPU usage, with minimal modifications needed for existing DBMSs.
Contribution
This paper introduces DDS, a DPU-optimized disaggregated storage system that relies on DMA, zero-copy, and userspace I/O to minimize overhead, and adds an offload engine that executes client requests directly on the DPU.
Findings
Higher disaggregated storage throughput
Latency reduced by an order of magnitude
Up to tens of CPU cores saved per storage server
Abstract
This extended report presents DDS, a novel disaggregated storage architecture enabled by emerging networking hardware, namely DPUs (Data Processing Units). DPUs can optimize the latency and CPU consumption of disaggregated storage servers. However, utilizing DPUs for DBMSs requires careful design of the network and storage paths and the interface exposed to the DBMS. To fully benefit from DPUs, DDS heavily uses DMA, zero-copy, and userspace I/O to minimize overhead when improving throughput. It also introduces an offload engine that eliminates host CPUs by executing client requests directly on the DPU. Adopting DDS' API requires minimal DBMS modification. Our experimental study and production system integration show promising results -- DDS achieves higher disaggregated storage throughput with an order of magnitude lower latency, and saves up to tens of CPU cores per storage server.
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
