Finding Missed Code Size Optimizations in Compilers using LLMs

Davide Italiano; Chris Cummins

arXiv:2501.00655·cs.SE·January 3, 2025

TL;DR

This paper presents a simple, extensible method that combines large language models with differential testing to find missed code size optimizations in C/C++ compilers. The authors report 24 confirmed bugs in production compilers and extend the approach to Rust and Swift.

Contribution

The paper introduces a minimal approach that combines LLMs with differential testing to detect missed compiler optimizations, and that can be adapted to multiple programming languages.

Findings

01 Reported 24 confirmed bugs in production compilers

02 Successfully extended approach from C/C++ to Rust and Swift

03 Approach requires fewer than 150 lines of code

Abstract

Compilers are complex, and significant effort has been expended on testing them. Techniques such as random program generation and differential testing have proved highly effective and have uncovered thousands of bugs in production compilers. The majority of effort has been expended on validating that a compiler produces correct code for a given input, while less attention has been paid to ensuring that the compiler produces performant code. In this work we adapt differential testing to the task of identifying missed optimization opportunities in compilers. We develop a novel testing approach which combines large language models (LLMs) with a series of differential testing strategies and use them to find missing code size optimizations in C / C++ compilers. The advantage of our approach is its simplicity. We offload the complex task of generating random code to an off-the-shelf LLM,…
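As a rough illustration of how differential testing can be applied to code size, the sketch below compares the size of the output that several compiler configurations produce for the same program and flags any configuration that is much larger than the best one. This is a minimal sketch, not the authors' implementation: the abstract does not list the specific strategies used, and the names `Measurement`, `find_size_outliers`, and `tolerance` are hypothetical. Obtaining the sizes (by compiling an LLM-generated program and measuring the result) is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """Code size of one program under one compiler configuration.

    Both fields are hypothetical; 'config' is just a label such as
    "compiler_a" or "compiler_b at a size-optimizing level".
    """
    config: str
    size_bytes: int


def find_size_outliers(measurements, tolerance=1.5):
    """Flag configurations whose output is suspiciously large.

    A configuration is flagged when its code size exceeds the smallest
    observed size by more than the 'tolerance' factor. The threshold is
    an assumption chosen for illustration, not a value from the paper.
    """
    if not measurements:
        return []
    best = min(m.size_bytes for m in measurements)
    return [m for m in measurements if m.size_bytes > tolerance * best]


if __name__ == "__main__":
    # Made-up sizes for a single test program, for demonstration only.
    sizes = [
        Measurement("config_a", 100),
        Measurement("config_b", 120),
        Measurement("config_c", 400),
    ]
    for m in find_size_outliers(sizes):
        print(f"possible missed optimization: {m.config} ({m.size_bytes} bytes)")
```

A flagged configuration is only a candidate: the program still has to be inspected to confirm that the larger output reflects a missed optimization rather than a legitimate difference between the configurations.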

Taxonomy

Topics: Software Testing and Debugging Techniques · Software Engineering Research · Advanced Malware Detection Techniques

Methods: Softmax · Attention Is All You Need