# Guided Automated Learning for query workload re-Optimization

**Authors:** Guilherme Damasio, Vincent Corvinelli, Parke Godfrey, Piotr Mierzejewski, Alexandar Mihaylov, Jaroslaw Szlichta, Calisto Zuzarte

arXiv: 1901.02049 · 2019-05-23

## TL;DR

GALO automates query workload re-optimization by learning recurring problematic query plan patterns offline and applying knowledge-based rewrites online, significantly improving database query performance.

## Contribution

The paper introduces GALO, a novel system that automates query plan problem detection and rewriting using a knowledge base built on RDF and SPARQL standards.

## Key findings

- Improves query performance on synthetic TPC-DS and real IBM client query workloads
- Automates detection and correction of recurring query plan issues
- Facilitates debugging and optimization refinement for database experts

## Abstract

Query optimization is a hallmark of database systems, enabling the complex SQL queries of today's applications to be run efficiently. The query optimizer often fails to find the best plan when logical subtleties in business queries and schemas circumvent it. When a query runs more expensively than is viable or warranted, the performance issues are usually diagnosed manually, in consultation with experts, through analysis of the query's execution plan (QEP). However, this is an excessively time-consuming, error-prone, and costly process. GALO is a novel system that automates this process. The tool automatically learns recurring problem patterns in query plans over workloads in an offline learning phase, to build a knowledge base of plan-rewrite remedies. It then uses the knowledge base online to re-optimize queries queued for execution to improve performance, often quite drastically. GALO's knowledge base is built on RDF and SPARQL, W3C graph database standards, which are well suited for manipulating and querying over SQL query plans, which are graphs themselves. GALO acts as a third tier of re-optimization, after query rewrite and cost-based optimization, as a query plan rewrite. Since the knowledge base is not tied to the context of supplied QEPs, table and column names are matched automatically during the re-optimization phase. Thus, problem patterns learned over a particular query workload can be applied in other query workloads. GALO's knowledge base is also an invaluable tool for database experts to debug query performance issues by tracking to known issues and solutions, and for the development team to refine the optimizer with newly tuned techniques. We demonstrate an experimental study of the effectiveness of our techniques over synthetic TPC-DS and real IBM client query workloads.
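To make the idea of matching problem patterns against plan graphs concrete, here is a minimal sketch in plain Python. It is not GALO's implementation or schema: the triple vocabulary (`type`, `inner`, `outer`, `table`), the operator identifiers, and the `match` helper are all hypothetical, and a small hand-written matcher stands in for a real SPARQL engine. It only illustrates how a pattern with variables (names starting with `?`) can match a plan regardless of which concrete table appears in it.

```python
# Hypothetical query plan encoded as (subject, predicate, object) triples,
# in the spirit of an RDF graph. The vocabulary is an assumption for
# illustration only.
PLAN = [
    ("op1", "type", "NLJOIN"),
    ("op1", "outer", "op2"),
    ("op1", "inner", "op3"),
    ("op2", "type", "TBSCAN"),
    ("op2", "table", "STORE_SALES"),
    ("op3", "type", "TBSCAN"),
    ("op3", "table", "ITEM"),
]

# Hypothetical problem pattern: a nested-loop join whose inner input is a
# table scan. "?t" is a variable, so the pattern is not tied to a table name.
PATTERN = [
    ("?join", "type", "NLJOIN"),
    ("?join", "inner", "?scan"),
    ("?scan", "type", "TBSCAN"),
    ("?scan", "table", "?t"),
]


def match(pattern, triples, binding=None):
    """Yield every variable binding under which all pattern triples occur."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    head, rest = pattern[0], pattern[1:]
    for triple in triples:
        candidate = dict(binding)
        consistent = True
        for term, value in zip(head, triple):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                # Constant terms must match exactly.
                consistent = False
                break
        if consistent:
            yield from match(rest, triples, candidate)


if __name__ == "__main__":
    for found in match(PATTERN, PLAN):
        print(found)
```

Because the table name is a variable rather than a constant, the same pattern would match a plan over a different schema, which mirrors the abstract's point that learned patterns can carry over to other workloads.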

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.02049/full.md

## Figures

27 figures with captions in the complete paper: https://tomesphere.com/paper/1901.02049/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1901.02049/full.md

---
Source: https://tomesphere.com/paper/1901.02049