Q-error Bounds of Random Uniform Sampling for Cardinality Estimation

Beibin Li; Yao Lu; Chi Wang; Srikanth Kandula

arXiv:2108.02715·math.ST·October 1, 2021·1 cites

Open Access

TL;DR

This paper analyzes the Q-error bounds of random uniform sampling for cardinality estimation, providing guidelines on sample size based on confidence intervals and true cardinality.

Contribution

It analyzes Q-error bounds for random uniform sampling in single-table cardinality estimation, a metric that few earlier sampling studies covered, and derives a rule of thumb for choosing the sample size.

Findings

01. The upper Q-error bound depends on the sample size and the true cardinality.

02. Confidence intervals are analyzed for sampling both with and without replacement.

03. The bound yields a practical guideline for choosing the sample size in single-table cardinality estimation.

Abstract

Random uniform sampling has been studied for various statistical tasks, but few studies have covered the Q-error metric for cardinality estimation (CE). In this paper, we analyze the confidence intervals of random uniform sampling with and without replacement for single-table CE. The results indicate that the upper Q-error bound depends on the sample size and the true cardinality. Our bound gives a rule of thumb for how large a sample should be kept for single-table CE.
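To make the dependence on sample size and true cardinality concrete, the following is a minimal simulation sketch, not the paper's derivation. It assumes the usual Q-error definition, max(estimate/truth, truth/estimate), with both values clamped to at least 1 (a common convention, assumed here), and it scales the number of matching sampled rows by table size over sample size. The function names and parameters (`q_error`, `sample_estimate`, `empirical_q_error`, `trials`, `quantile`) are hypothetical and chosen only for illustration.

```python
import random


def q_error(estimate, truth):
    """Q-error as max(est/true, true/est); both values are clamped to >= 1
    (an assumed convention that avoids division by zero)."""
    est = max(float(estimate), 1.0)
    tru = max(float(truth), 1.0)
    return max(est / tru, tru / est)


def sample_estimate(table_size, true_card, sample_size, rng, replace=False):
    """Estimate a predicate's cardinality from a uniform random sample.

    Rows are modeled as integers 0..table_size-1, and rows with an index
    below true_card are treated as matching the predicate.
    """
    if replace:
        rows = (rng.randrange(table_size) for _ in range(sample_size))
    else:
        rows = rng.sample(range(table_size), sample_size)
    hits = sum(1 for r in rows if r < true_card)
    # Scale the sample hit count up to the full table.
    return hits * table_size / sample_size


def empirical_q_error(table_size, true_card, sample_size,
                      trials=200, quantile=0.95, seed=0, replace=False):
    """Empirical quantile of the Q-error over repeated independent samples."""
    rng = random.Random(seed)
    errors = sorted(
        q_error(sample_estimate(table_size, true_card, sample_size, rng, replace),
                true_card)
        for _ in range(trials)
    )
    return errors[min(int(quantile * trials), trials - 1)]


if __name__ == "__main__":
    # Compare a small and a large true cardinality at several sample sizes.
    for true_card in (100, 10_000):
        for sample_size in (100, 1_000, 10_000):
            q = empirical_q_error(100_000, true_card, sample_size)
            print(f"true={true_card} sample={sample_size} q95={q:.2f}")
```

Running the script prints an empirical 95th-percentile Q-error for each combination, which can be compared against whatever bound is being evaluated. The simulation only observes errors empirically; it does not compute the analytical bound from the paper.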

Taxonomy

Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Bayesian Methods and Mixture Models