DuckDB on xNVMe
Marius Ottosen, Magnus Keinicke Parlo, Philippe Bonnet

TL;DR
This paper explores how DuckDB can access NVMe SSDs directly through xNVMe, bypassing the file system and issuing asynchronous I/O instead. It reports significant speed-ups in query execution.
Contribution
The paper introduces an approach that lets DuckDB access NVMe SSDs directly through xNVMe, bypassing the POSIX file interface to improve performance.
Findings
Significant speed-up over baseline in query performance
Performance improves with larger scale factors
Linux NVMe passthru further enhances speed
Abstract
DuckDB is designed for portability. It is also designed to run anywhere, and possibly in contexts where it can be specialized for performance, e.g., as a cloud service or on a smart device. In this paper, we consider the way DuckDB interacts with local storage. Our long-term research question is whether and how SSDs could be co-designed with DuckDB. As a first step towards vertical integration of DuckDB and programmable SSDs, we consider whether and how DuckDB can access NVMe SSDs directly. By default, DuckDB relies on the POSIX file interface. In contrast, we rely on the xNVMe library and explore how it can be leveraged in DuckDB. We leverage the block-based nature of the DuckDB buffer manager to bypass the synchronous POSIX I/O interface, the file system and the block manager. Instead, we directly issue asynchronous I/Os against the SSD logical block address space. Our preliminary…
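To make the idea of addressing the SSD logical block address (LBA) space directly more concrete, here is a minimal sketch of how a fixed-size database block could be translated into a contiguous LBA range. It is an illustration under stated assumptions, not the paper's implementation: the block size, the LBA size, the base LBA, and the names `LbaRange` and `block_to_lba_range` are all hypothetical. The sketch makes no xNVMe calls and only shows the address arithmetic that would precede an I/O submission.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the paper.
BLOCK_SIZE = 256 * 1024  # assumed buffer-manager block size, in bytes
LBA_SIZE = 4096          # assumed logical block size of the SSD namespace
BASE_LBA = 0             # hypothetical first LBA reserved for database blocks


@dataclass(frozen=True)
class LbaRange:
    start_lba: int  # first logical block of the range
    num_lbas: int   # number of logical blocks in the range


def block_to_lba_range(block_id: int) -> LbaRange:
    """Map a fixed-size database block to a contiguous LBA range."""
    if block_id < 0:
        raise ValueError("block_id must be non-negative")
    if BLOCK_SIZE % LBA_SIZE != 0:
        raise ValueError("block size must be a multiple of the LBA size")
    lbas_per_block = BLOCK_SIZE // LBA_SIZE
    return LbaRange(BASE_LBA + block_id * lbas_per_block, lbas_per_block)
```

Because every block has the same size, the translation is a single multiplication, so no file-system metadata lookup is needed to locate a block on the device.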
Taxonomy
Topics: Advanced Data Storage Technologies · Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques
