Compiling Database Application Programs
Mohammad Dashti, Sachin Basil John, Thierry Coppey, Amir Shaikhha, Vojin Jovanovic, Christoph Koch

TL;DR
This paper presents an optimizing compiler that automatically specializes database application programs using generative programming and state-of-the-art compiler technology. Applied to TPC-C, it outperforms a hand-optimized low-level implementation by a factor of two.
Contribution
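It introduces a compiler for database application programs (sequences of queries and updates with control flow, as in scripts, UDFs, and triggers) that fully automates the key optimizations found in a hand-optimized TPC-C implementation and outperforms that baseline by a factor of two.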
Findings
The compiler, applied to TPC-C, outperforms the hand-optimized baseline by a factor of two.
The key optimization techniques behind the hand-optimized implementation's performance are identified, and their individual impacts are analyzed.
The compiler fully automates these optimizations for database application programs.
Abstract
There is a trend towards increased specialization of data management software for performance reasons. In this paper, we study the automatic specialization and optimization of database application programs -- sequences of queries and updates, augmented with control flow constructs as they appear in database scripts, UDFs, transactional workloads and triggers in languages such as PL/SQL. We show how to build an optimizing compiler for database application programs using generative programming and state-of-the-art compiler technology. We evaluate a hand-optimized low-level implementation of TPC-C, and identify the key optimization techniques that account for its good performance. Our compiler fully automates these optimizations and, applied to this benchmark, outperforms the manually optimized baseline by a factor of two. By selectively disabling some of the optimizations in the…
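To make the idea of generative programming in the abstract more concrete, here is a minimal sketch in Python. It is not the paper's compiler: the function names, the toy schema, and the choice of optimizations (resolving a column name to a positional offset and introducing a hash index) are assumptions made purely for illustration, since the excerpt above does not list which optimizations the compiler applies. The sketch contrasts a generic lookup that interprets the column name on every call with a generator that emits source code specialized to one schema and one column.

```python
# Illustrative sketch only; all names here are hypothetical, not from the paper.

def interpret_lookup(table, column, value):
    """Generic path: scan every row and resolve the column name at run time."""
    return [row for row in table if row[column] == value]


def compile_lookup(schema, column):
    """Generative path: emit source specialized to a fixed schema and column.

    Rows are plain tuples, so the column name is resolved to a positional
    offset once, at generation time, and baked into the emitted code.
    """
    offset = schema.index(column)
    source = (
        "def build_index(rows):\n"
        "    index = {}\n"
        "    for row in rows:\n"
        f"        index.setdefault(row[{offset}], []).append(row)\n"
        "    return index\n"
        "\n"
        "def lookup(index, value):\n"
        "    return index.get(value, [])\n"
    )
    namespace = {}
    exec(source, namespace)  # turn the generated source into callable functions
    return namespace["build_index"], namespace["lookup"], source


# Toy data with an assumed three-column schema.
schema = ("w_id", "d_id", "name")
rows = [(1, 1, "a"), (1, 2, "b"), (2, 1, "c")]

build_index, lookup, generated_source = compile_lookup(schema, "w_id")
index = build_index(rows)
specialized_result = lookup(index, 1)

# The generic path works on dict rows and scans the whole table per call.
dict_rows = [dict(zip(schema, r)) for r in rows]
generic_result = interpret_lookup(dict_rows, "w_id", 1)
```

Both paths return the same rows for `w_id == 1`. The difference is that the generated code contains the literal offset `row[0]` and answers lookups from a prebuilt dictionary instead of scanning, which is the kind of work a specializing compiler can do once, ahead of time, rather than on every call.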
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Scientific Computing and Data Management · Distributed systems and fault tolerance
