TL;DR
This paper introduces IDEBench, a benchmark designed specifically for evaluating database systems in interactive data exploration (IDE) scenarios, focusing on ad-hoc, incrementally built queries and the trade-off between query performance and result quality.
Contribution
The paper presents IDEBench, a novel benchmark tailored to interactive data exploration (IDE) workloads, and uses it to evaluate several database systems.
Findings
Benchmarked commercial and research IDE query engines.
Identified trade-offs between query performance and result quality.
Provided insights into system suitability for IDE tasks.
Abstract
Existing benchmarks for analytical database systems such as TPC-DS and TPC-H are designed for static reporting scenarios. The main metric of these benchmarks is the performance of running individual SQL queries over a synthetic database. In this paper, we argue that such benchmarks are not suitable for evaluating database workloads originating from interactive data exploration (IDE) systems where most queries are ad-hoc, not based on predefined reports, and built incrementally. As a main contribution, we present a novel benchmark called IDEBench that can be used to evaluate the performance of database systems for IDE workloads. As opposed to traditional benchmarks for analytical database systems, our goal is to provide more meaningful workloads and datasets that can be used to benchmark IDE query engines, with a particular focus on metrics that capture the trade-off between query…
