REASONING COMPILER: LLM-Guided Optimizations for Efficient Model Serving
Annabelle Sujun Tang, Christopher Priebe, Rohan Mahapatra, Lianhui Qin, Hadi Esmaeilzadeh

TL;DR
This paper introduces REASONING COMPILER, a framework that combines large language model (LLM) reasoning with structured Monte Carlo tree search to guide compiler optimizations for neural model serving, achieving speedups with fewer search samples.
Contribution
It presents a new LLM-guided, context-aware compiler optimization method that enhances sample efficiency without retraining, outperforming existing neural compiler techniques.
Findings
Achieves significant speedups with fewer samples.
Leverages LLM reasoning for context-aware optimization.
Outperforms existing neural compiler methods.
Abstract
While model serving has unlocked unprecedented capabilities, the high cost of serving large-scale models continues to be a significant barrier to widespread accessibility and rapid innovation. Compiler optimizations have long driven substantial performance improvements, but existing compilers struggle with neural workloads due to the exponentially large and highly interdependent space of possible transformations. Although existing stochastic search techniques can be effective, they are often sample-inefficient and fail to leverage the structural context underlying compilation decisions. We set out to investigate the research question of whether reasoning with large language models (LLMs), without any retraining, can leverage the context-aware decision space of compiler optimizations to significantly improve sample efficiency. To that end, we introduce a novel compilation framework…
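As a rough illustration of the idea summarized above, the sketch below runs a small Monte Carlo tree search over sequences of compiler transformations. It is not the paper's implementation. The transformation names, the `toy_cost` function, and the `propose` function are all hypothetical: `propose` is a plain stub marking where an LLM could suggest context-aware next steps, and `toy_cost` stands in for a real hardware cost model or measurement.

```python
import math
import random

# Hypothetical transformation names, chosen for illustration only.
TRANSFORMS = ["tile", "unroll", "vectorize", "parallelize"]

def toy_cost(schedule):
    """Toy stand-in for a cost model (assumption): lower is better."""
    cost = 100.0
    for i, t in enumerate(schedule):
        if t == "tile" and i == 0:
            cost *= 0.6   # tiling first helps
        elif t == "vectorize" and "tile" in schedule[:i]:
            cost *= 0.5   # vectorizing pays off only after tiling
        elif t == "parallelize":
            cost *= 0.7
        else:
            cost *= 1.05  # a poorly placed transform adds overhead
    return cost

def propose(schedule):
    """Stub for the proposal step (assumption): an LLM could rank or
    filter candidates here using the schedule so far as context."""
    return [t for t in TRANSFORMS if t not in schedule]

class Node:
    def __init__(self, schedule, parent=None):
        self.schedule = schedule
        self.parent = parent
        self.children = []
        self.untried = propose(schedule)
        self.visits = 0
        self.total_reward = 0.0

def ucb(child, parent_visits, c=1.4):
    """Upper confidence bound: mean reward plus an exploration bonus."""
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)

def search(max_depth=3, budget=200, seed=0):
    rng = random.Random(seed)
    baseline = toy_cost([])
    root = Node([])
    best_cost, best_schedule = baseline, []
    for _ in range(budget):
        node = root
        # Selection: descend through fully expanded nodes by UCB.
        while not node.untried and node.children:
            parent_visits = node.visits
            node = max(node.children, key=lambda ch: ucb(ch, parent_visits))
        # Expansion: add one untried transformation as a new child.
        if node.untried and len(node.schedule) < max_depth:
            t = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.schedule + [t], parent=node)
            node.children.append(child)
            node = child
        # Rollout: complete the schedule with random proposals.
        schedule = list(node.schedule)
        while len(schedule) < max_depth:
            options = propose(schedule)
            if not options:
                break
            schedule.append(rng.choice(options))
        cost = toy_cost(schedule)
        if cost < best_cost:
            best_cost, best_schedule = cost, schedule
        reward = baseline / cost  # speedup over the untransformed baseline
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best_cost, best_schedule

if __name__ == "__main__":
    cost, schedule = search()
    print(f"best toy cost {cost:.2f} with schedule {schedule}")
```

In this sketch, each iteration of the loop counts as one sample, so `budget` is the knob that a more informed `propose` step would aim to reduce.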
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Software Testing and Debugging Techniques · Distributed and Parallel Computing Systems
Methods: Sparse Evolutionary Training
