QCFE: An efficient Feature engineering for query cost estimation
Yu Yan, Hongzhi Wang, Junfang Huang, Dake Zhong, Man Yang, Kaixin Zhang, Tao Yu, Tianqing Wan

TL;DR
This paper introduces QCFE, a novel feature engineering approach for query cost estimation that incorporates additional influential variables and reduces feature redundancy, significantly improving accuracy and efficiency.
Contribution
The paper proposes a new feature snapshot and a difference-propagation feature reduction method to enhance query cost estimation accuracy and efficiency.
Findings
Significant improvement in time-accuracy efficiency on benchmarks.
Effective integration of storage, hardware, and database variables.
Reduced computational burden through feature reduction.
Abstract
Query cost estimation is a classical task in database management. Recently, researchers have applied AI-driven models to query cost estimation to achieve high accuracy. However, two defects in feature design lead to a poor trade-off between estimation accuracy and time efficiency. On the one hand, existing works encode only the query plan and data statistics while ignoring other important variables, such as storage structure, hardware, and database knobs, which also have a significant impact on query cost. On the other hand, because of their straightforward encoding design, existing works carry a heavy representation-learning burden on ineffective input dimensions. To address these two problems, we first propose an efficient feature engineering approach for query cost estimation, called QCFE. Specifically, we design a novel feature called feature snapshot to efficiently integrate the…
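The two issues raised in the abstract (missing environment variables and ineffective input dimensions) can be illustrated with a small sketch. It is not QCFE's method: the plain concatenation stands in for the feature snapshot, and dropping constant columns is a much cruder substitute for difference-propagation feature reduction. All function names, the knob names (`shared_buffers_mb`, `ssd`), and the sample values are hypothetical.

```python
from statistics import pvariance


def build_features(plan_vec, env):
    """Append environment variables (in sorted key order) to a plan encoding.

    Assumption: simple concatenation, not the paper's feature snapshot.
    """
    return list(plan_vec) + [float(env[k]) for k in sorted(env)]


def drop_constant_columns(rows, eps=1e-12):
    """Keep only columns whose variance across rows exceeds eps.

    Assumption: a crude stand-in for the paper's feature reduction.
    Returns the reduced rows and the indices of the kept columns.
    """
    keep = [j for j in range(len(rows[0]))
            if pvariance([r[j] for r in rows]) > eps]
    return [[r[j] for j in keep] for r in rows], keep


# Hypothetical plan encodings and environment settings for two queries.
plans = [[1.0, 0.0, 120.0], [0.0, 1.0, 120.0]]
envs = [{"shared_buffers_mb": 128, "ssd": 1},
        {"shared_buffers_mb": 512, "ssd": 1}]

rows = [build_features(p, e) for p, e in zip(plans, envs)]
reduced, kept = drop_constant_columns(rows)
print(kept)     # indices of columns that vary across the two queries
print(reduced)  # feature rows with the constant columns removed
```

In this toy input, the third plan dimension and the `ssd` flag are identical for both queries, so they carry no signal for a learned cost model and are removed before training.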
Taxonomy
Topics: Advanced Database Systems and Queries · Data Stream Mining Techniques · Web Data Mining and Analysis
