Novel Selectivity Estimation Strategy for Modern DBMS
Jun Hyung Shin

TL;DR
This paper introduces a novel selectivity estimation method for modern DBMS that computes exact selectivity during query optimization with little overhead (under 30 ms on GPU), improving query performance by up to 4.8x on TPC-H.
Contribution
The paper presents a new approach to exact selectivity estimation using aggregate queries during optimization, suitable for in-memory and GPU-accelerated databases.
Findings
Achieves less than 30 ms overhead on GPU
Improves query performance up to 4.8x on TPC-H
Enhances query plans with better selectivity estimates
Abstract
Selectivity estimation is important in query optimization; however, accurate estimation is difficult when predicates are complex. Because existing database synopses and statistics are not helpful in such cases, we introduce a new approach that computes the exact selectivity by running an aggregate query during the optimization phase. Exact selectivity can be obtained without significant overhead in in-memory and GPU-accelerated databases by adding extra query execution calls. We implement a selection push-down extension based on the novel selectivity estimation strategy in the MapD database system. Our approach incurs a constant overhead of less than 30 milliseconds in all circumstances when running on GPU. The novel strategy successfully generates better query execution plans, which improve performance by up to 4.8 times on TPC-H benchmark SF-50 queries and 7.3 times from…
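The core idea in the abstract, computing exact selectivity with an aggregate query rather than reading it from synopses, can be illustrated with a minimal sketch. This is not the paper's MapD implementation: it uses Python's built-in sqlite3 as a stand-in engine, and the function name `exact_selectivity`, the `lineitem` demo table, and its columns are hypothetical names chosen for illustration. A real optimizer would issue the equivalent aggregate call internally and feed the result into plan selection.

```python
import sqlite3


def exact_selectivity(conn, table, predicate, params=()):
    """Return the exact fraction of rows in `table` that satisfy `predicate`.

    Hypothetical helper: `table` and `predicate` are trusted SQL fragments,
    as they would be if produced by the optimizer itself.
    """
    # A single aggregate query returns both the matching and the total row count.
    sql = (
        f"SELECT COALESCE(SUM(CASE WHEN {predicate} THEN 1 ELSE 0 END), 0), "
        f"COUNT(*) FROM {table}"
    )
    matched, total = conn.execute(sql, params).fetchone()
    return matched / total if total else 0.0


def build_demo():
    """Create an in-memory table with 100 rows (assumed demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lineitem (l_quantity INTEGER, l_discount_pct INTEGER)"
    )
    rows = [(q, q % 10) for q in range(1, 101)]
    conn.executemany("INSERT INTO lineitem VALUES (?, ?)", rows)
    return conn


conn = build_demo()
# A conjunctive predicate of the kind that per-column statistics estimate poorly.
sel = exact_selectivity(
    conn, "lineitem", "l_quantity < ? AND l_discount_pct >= ?", (25, 5)
)
print(f"exact selectivity: {sel:.2f}")
```

The sketch scans the table once per estimate, which is only practical when scans are cheap; the abstract makes that argument for in-memory and GPU-accelerated systems.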
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Advanced Data Storage Technologies
