DPDPU: Data Processing with DPUs
Jiasheng Hu, Philip A. Bernstein, Jialin Li, Qizhen Zhang

TL;DR
DPDPU is a platform designed to leverage Data Processing Units (DPUs) effectively, addressing hardware heterogeneity and optimizing data processing tasks to improve performance and reduce costs in cloud systems.
Contribution
It introduces a holistic platform that bridges the semantic gap between DPUs and data systems, with dedicated engines for compute, networking, and storage to optimize data processing.
Findings
Initial design and implementation of DPDPU components
Identification of utilization challenges for DPU engines
Progress towards integrating DPU capabilities into data processing workflows
Abstract
Improving the performance and reducing the cost of cloud data systems is increasingly challenging. Data processing units (DPUs) are a promising solution, but using them for data processing requires characterizing the new hardware and recognizing its capabilities and constraints. We hence propose DPDPU, a platform for holistically exploiting DPUs to optimize data processing tasks that are critical to performance and cost. It seeks to fill the semantic gap between DPUs and data processing systems and handle DPU heterogeneity with three engines dedicated to compute, networking, and storage. This paper describes our vision, DPDPU's key components, their associated utilization challenges, as well as the current progress and future plans.
Taxonomy
Topics: Machine Learning in Healthcare · Machine Learning and Data Classification
