Outlier Detection Techniques for SQL and ETL Tuning
Saptarsi Goswami, Samiran Ghosh, Amlan Chakrabarti

TL;DR
This paper investigates outlier detection methods for SQL and ETL query tuning to identify inefficiencies and improve system performance, using multiple techniques on real query data and exploring ensemble approaches.
Contribution
It introduces a systematic approach to detect query outliers in RDBMS, applying four techniques and analyzing their effectiveness for tuning and optimization.
Findings
Outlier detection reveals query inefficiencies.
Ensemble methods improve detection accuracy.
The approach has potential for optimizing database performance.
Abstract
The RDBMS is at the heart of both OLTP and OLAP applications. For both types of application, thousands of queries expressed in SQL are executed on a daily basis. All commercial DBMS engines capture various attributes of these executed queries in system tables. These queries need to conform to best practices and need to be tuned to ensure optimal performance. While checklists, and often tools, are used to enforce this, black-box techniques for profiling the queries, such as outlier detection, are not employed to gain a summary-level understanding. This is the motivation for the paper: such analysis not only points to inefficiencies built into the system, but also has the potential to reveal evolving best practices and inappropriate usage. This can reduce latency in information flow and promote optimal utilization of hardware and software capacity. In this paper we start with formulating…
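The idea of running several detectors over captured query attributes and combining their verdicts can be illustrated with a minimal sketch. This page does not name the four techniques the paper applies, so the three detectors below (z-score, IQR fence, and median absolute deviation) are stand-ins chosen for illustration, and the `elapsed_ms` values are hypothetical, not data from the paper. The ensemble here is a simple majority vote, which is one possible combination rule among many.

```python
from statistics import mean, median, pstdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Flag values whose distance from the mean exceeds `threshold` std devs."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def mad_flags(values, threshold=3.5):
    """Flag values whose modified z-score (based on the MAD) exceeds `threshold`."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def ensemble_outliers(values, min_votes=2):
    """Return indices flagged by at least `min_votes` of the three detectors."""
    votes = zip(zscore_flags(values), iqr_flags(values), mad_flags(values))
    return [i for i, v in enumerate(votes) if sum(v) >= min_votes]


# Hypothetical elapsed times (ms) for ten executions, as might be read
# from a DBMS system table; the values are made up for this sketch.
elapsed_ms = [12, 15, 14, 13, 16, 15, 14, 400, 13, 15]
print(ensemble_outliers(elapsed_ms))  # -> [7]
```

Requiring agreement between detectors trades some sensitivity for fewer false alarms. In this small sample the single extreme value inflates the standard deviation enough that the z-score detector alone does not cross its threshold, while the two robust detectors still flag it, so the vote succeeds.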
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Data Stream Mining Techniques
