Data Warehouse Benchmarking with DWEB
Jérôme Darmont (ERIC)

TL;DR
This paper introduces DWEB, a flexible and fully parameterized benchmark for generating synthetic data warehouses and workloads, aiding performance evaluation and system tuning in data warehousing and OLAP contexts.
Contribution
It presents DWEB, a novel, easily configurable benchmark tool with new ETL features and execution protocols, addressing the shortage of available data warehouse benchmarks.
Findings
DWEB can generate diverse synthetic data warehouses.
DWEB's new ETL feature improves workload simulation.
A Java implementation of DWEB is freely available online.
Abstract
Performance evaluation is a key issue for designers and users of Database Management Systems (DBMSs). Performance is generally assessed with software benchmarks that help, e.g., test architectural choices, compare different technologies, or tune a system. In the particular context of data warehousing and On-Line Analytical Processing (OLAP), although the Transaction Processing Performance Council (TPC) aims at issuing standard decision-support benchmarks, few benchmarks actually exist. In this chapter, we present the Data Warehouse Engineering Benchmark (DWEB), which generates various ad-hoc synthetic data warehouses and workloads. DWEB is fully parameterized to fulfill various data warehouse design needs. However, two levels of parameterization keep it relatively easy to tune. We also expand on our previous work on DWEB by presenting its new Extract, Transform, and Load (ETL)…
