Novel Selectivity Estimation Strategy for Modern DBMS

Jun Hyung Shin

arXiv:1806.08384·cs.DB·June 25, 2018

TL;DR

This paper introduces a novel selectivity estimation method for modern DBMS that computes exact selectivity during query optimization with minimal overhead, significantly improving query performance.

Contribution

The paper presents a new approach to exact selectivity estimation using aggregate queries during optimization, suitable for in-memory and GPU-accelerated databases.

Findings

01 Achieves less than 30 ms overhead on GPU

02 Improves query performance up to 4.8x on TPC-H

03 Enhances query plans with better selectivity estimates

Abstract

Selectivity estimation is important in query optimization; however, accurate estimation is difficult when predicates are complex. Instead of relying on existing database synopses and statistics, which are not helpful in such cases, we introduce a new approach that computes the exact selectivity by running an aggregate query during the optimization phase. For in-memory and GPU-accelerated databases, exact selectivity can be obtained without significant overhead by adding extra query execution calls. We implement a selection push-down extension based on this selectivity estimation strategy in the MapD database system. When running on GPU, our approach incurs a constant overhead of less than 30 milliseconds in any circumstances. The strategy generates better query execution plans, which result in performance improvements of up to 4.8 times on TPC-H benchmark SF-50 queries and 7.3 times from…
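The core idea in the abstract, obtaining exact selectivity from an aggregate query rather than from synopses, can be illustrated with a minimal sketch. This is not the MapD implementation: it uses Python's built-in sqlite3 module as a stand-in engine, and the function name `exact_selectivity`, the `orders` table, and the sample predicate are all hypothetical names chosen for illustration.

```python
import sqlite3


def exact_selectivity(conn, table, predicate):
    """Return the exact fraction of rows in `table` that satisfy `predicate`.

    Hypothetical helper: one aggregate query counts all rows and the
    matching rows in a single scan. `table` and `predicate` are trusted
    strings here; a real optimizer would work on a parsed plan instead.
    """
    query = (
        "SELECT COUNT(*), "
        f"COALESCE(SUM(CASE WHEN {predicate} THEN 1 ELSE 0 END), 0) "
        f"FROM {table}"
    )
    total, matching = conn.execute(query).fetchone()
    # An empty table has no rows to select, so report zero selectivity.
    return matching / total if total else 0.0


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, price INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(i, i * 10) for i in range(1, 11)],  # prices 10, 20, ..., 100
    )
    # A compound predicate of the kind that histograms estimate poorly.
    sel = exact_selectivity(conn, "orders", "price > 70 AND price % 20 = 0")
    print(f"exact selectivity: {sel:.2f}")
```

An optimizer could use the returned fraction in place of a statistics-based estimate when deciding, for example, whether to push a selection below a join. Whether the extra query pays off depends on how cheaply the engine can scan the table, which is why the abstract targets in-memory and GPU-accelerated systems.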

Taxonomy

Topics: Distributed and Parallel Computing Systems · Cloud Computing and Resource Management · Advanced Data Storage Technologies