TL;DR
This paper introduces a novel workload compression method that constructs representative and comprehensive workloads with formal guarantees, addressing load imbalance and performance issues in database systems.
Contribution
It formalizes workload representativity and coverage, proves NP-hardness, and proposes a greedy algorithm with approximation guarantees for workload compression.
Findings
The proposed algorithm outperforms sampling and clustering methods.
It effectively balances workload representativity and coverage.
It demonstrates advantages in workload analysis and system monitoring.
Abstract
This work studies the problem of constructing a representative workload from a given input analytical query workload where the former serves as an approximation with guarantees of the latter. We discuss our work in the context of workload analysis and monitoring. As an example, evolving system usage patterns in a database system can cause load imbalance and performance regressions which can be controlled by monitoring system usage patterns, i.e., a representative workload, over time. To construct such a workload in a principled manner, we formalize the notions of workload representativity and coverage. These metrics capture the intuition that the distribution of features in a compressed workload should match a target distribution, increasing representativity, and include common queries as well as outliers, increasing coverage. We show that solving this problem optimally is…
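To make the intuition behind the two metrics concrete, the following Python sketch may help. It is not the paper's algorithm, and the paper's formal definitions are not reproduced in this summary. Everything here is an illustrative assumption: queries are modeled as sets of feature names, the target distribution is taken to be the feature distribution of the full workload, representativity is scored as one minus the total variation distance to that target, coverage is the fraction of distinct features included, and the two are combined with a hypothetical weight `alpha`. The function names are likewise hypothetical.

```python
from collections import Counter

def feature_distribution(queries):
    """Normalized frequency of each feature over a list of feature sets."""
    counts = Counter(f for q in queries for f in q)
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}

def representativity(selected, target):
    """Assumed metric: 1 minus total variation distance to the target."""
    if not selected:
        return 0.0
    dist = feature_distribution(selected)
    keys = set(dist) | set(target)
    tvd = 0.5 * sum(abs(dist.get(f, 0.0) - target.get(f, 0.0)) for f in keys)
    return 1.0 - tvd

def coverage(selected, all_features):
    """Assumed metric: fraction of distinct workload features included."""
    if not all_features:
        return 1.0
    covered = set().union(*selected) if selected else set()
    return len(covered & all_features) / len(all_features)

def greedy_compress(workload, k, alpha=0.5):
    """Pick up to k query indices, each step adding the query that
    maximizes the weighted sum of the two assumed metrics."""
    target = feature_distribution(workload)
    all_features = set(target)
    selected, remaining = [], list(range(len(workload)))
    for _ in range(min(k, len(workload))):
        def score(i):
            cand = [workload[j] for j in selected] + [workload[i]]
            return (alpha * representativity(cand, target)
                    + (1 - alpha) * coverage(cand, all_features))
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy workload: three identical common queries, one variant, one outlier.
workload = [
    frozenset({"orders", "customers"}),
    frozenset({"orders", "customers"}),
    frozenset({"orders", "customers"}),
    frozenset({"orders"}),
    frozenset({"audit_log"}),
]
picked = greedy_compress(workload, k=2)
print(picked)
```

On this toy input, the greedy pass selects one of the common queries and the outlier: the common query tracks the target distribution, and the outlier adds the one feature that would otherwise be missing. This simple weighted objective carries no approximation guarantee of its own; any such guarantee depends on the properties of the objective the paper actually defines.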
