Small Language Models as Compiler Experts: Auto-Parallelization for Heterogeneous Systems

Prathamesh Devadiga

arXiv:2512.19250 · cs.LG · December 23, 2025

TL;DR

Small (approximately 1B-parameter) language models can act as compiler experts for auto-parallelization on heterogeneous systems, reaching an average speedup of 6.81x over 376 evaluations spanning 11 real-world kernels.

Contribution

The paper presents a comprehensive evaluation of small language models (gemma3, llama3.2, and qwen2.5) for compiler auto-parallelization, using six reasoning strategies and benchmarking against LLVM Polly, TVM, and Triton.

Findings

01 Average speedup of 6.81x across 376 total evaluations

02 Peak speedup of 43.25x on convolution operations

03 Robustness confirmed across diverse compilers and hardware platforms

Abstract

Traditional auto-parallelizing compilers, reliant on rigid heuristics, struggle with the complexity of modern heterogeneous systems. This paper presents a comprehensive evaluation of small (approximately 1B parameter) language-model-driven compiler auto-parallelization. We evaluate three models: gemma3, llama3.2, and qwen2.5, using six reasoning strategies across 11 real-world kernels drawn from scientific computing, graph algorithms, and machine learning. Our system is benchmarked against strong compiler baselines, including LLVM Polly, TVM, and Triton. Across 376 total evaluations, the proposed approach achieves an average speedup of 6.81x and a peak performance of 43.25x on convolution operations. We analyze scalability, verify correctness using multiple sanitizers, and confirm robustness across diverse compilers and hardware platforms. Our results demonstrate that small, efficient…

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Big Data and Digital Economy · Natural Language Processing Techniques