Compilation Forking: A Fast and Flexible Way of Generating Data for Compiler-Internal Machine Learning Tasks
Raphael Mosaner (JKU Linz, Austria), David Leopoldseder (Oracle Labs, Vienna, Austria), Wolfgang Kisling (JKU Linz, Austria), Lukas Stadler (Oracle Labs, Linz, Austria), Hanspeter Mössenböck (JKU Linz, Austria)

TL;DR
Compilation forking enables efficient, consistent data generation for machine learning in dynamic compilers, facilitating large-scale performance evaluation and optimization model training with minimal noise.
Contribution
The paper introduces compilation forking, a novel technique for generating consistent feature and performance data in dynamic compilers, improving data quality for machine learning-based optimizations.
Findings
ML models trained on forking data match heuristic performance.
Speedups of 20% achieved on some benchmarks.
Approach is applicable to any dynamic compiler.
Abstract
Compiler optimization decisions are often based on hand-crafted heuristics centered around a few established benchmark suites. Alternatively, they can be learned from feature and performance data produced during compilation. However, data-driven compiler optimizations based on machine learning models require large sets of quality data for training in order to match or even outperform existing human-crafted heuristics. In static compilation setups, related work has addressed this problem with iterative compilation. However, a dynamic compiler may produce different data depending on dynamically chosen compilation strategies, which complicates the generation of comparable data. We propose compilation forking, a technique for generating consistent feature and performance data from arbitrary, dynamically compiled programs. Different versions of program parts with the same profiling and…
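The general idea of comparing variants from a shared starting state can be illustrated with a small sketch. This is not the paper's implementation: it assumes a POSIX system with `os.fork`, and every name in it (`run_in_fork`, `variant_loop`, `variant_builtin`, `fork_and_compare`, the `state` dictionary) is hypothetical. Two stand-in "optimization variants" are each run in a forked child process, so both start from an identical copy of the parent's state, and each child reports its result and elapsed time back through a pipe.

```python
import os
import pickle
import time


def run_in_fork(variant, state):
    """Run variant(state) in a forked child; return (result, elapsed_seconds)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the parent's state.
        status = 1
        try:
            os.close(read_fd)
            start = time.perf_counter()
            result = variant(state)
            elapsed = time.perf_counter() - start
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump((result, elapsed), out)
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    # Parent: read the child's measurement, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        payload = pickle.load(inp)
    os.waitpid(pid, 0)
    return payload


def variant_loop(state):
    # Hypothetical variant A: explicit loop. Mutates its private copy of state.
    state["calls"] += 1
    total = 0
    for x in state["data"]:
        total += x
    return total


def variant_builtin(state):
    # Hypothetical variant B: built-in sum. Mutates its private copy of state.
    state["calls"] += 1
    return sum(state["data"])


def fork_and_compare(variants, state):
    """Run every variant from the same starting state, one fork per variant."""
    return {name: run_in_fork(fn, state) for name, fn in variants.items()}


# "calls" stands in for accumulated history that every variant should share.
state = {"calls": 3, "data": list(range(100_000))}
results = fork_and_compare(
    {"loop": variant_loop, "builtin": variant_builtin}, state
)
```

Because each variant runs in its own child, the mutation of `state["calls"]` never reaches the parent, so the second variant sees exactly the same history as the first. The per-variant timings in `results` are therefore taken from the same starting point within a single program run.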
