LLM-Guided Genetic Improvement: Envisioning Semantic Aware Automated Software Evolution

Karine Even-Mendoza; Alexander Brownlee; Alina Geiger; Carol Hanna; Justyna Petke; Federica Sarro; Dominik Sobania

arXiv:2508.18089·cs.SE·August 26, 2025

TL;DR

This paper introduces PatchCat, a method that combines LLMs with genetic improvement to categorize software patches semantically, improving efficiency and interpretability in automated software evolution.

Contribution

The paper proposes integrating semantic-aware search into genetic improvement by automatically clustering LLM-suggested edits, as a first step toward AI-powered genetic improvement.

Findings

01 PatchCat identified 18 patch types with high accuracy.

02 It can detect NoOp edits to save resources.

03 Works effectively with small, local LLMs.

Abstract

Genetic Improvement (GI) of software automatically creates alternative software versions that are improved according to certain properties of interest (e.g., running time). Search-based GI excels at navigating large program spaces, but operates primarily at the syntactic level. In contrast, Large Language Models (LLMs) offer semantic-aware edits, yet lack goal-directed feedback and control (which is instead a strength of GI). As such, we propose the investigation of a new research line on AI-powered GI aimed at incorporating semantic-aware search. We take a first step toward it by augmenting GI with automated clustering of LLM edits. We provide initial empirical evidence that our proposal, dubbed PatchCat, allows us to automatically and effectively categorize LLM-suggested patches. PatchCat identified 18 different types of software patches and categorized newly suggested patches…
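To make the resource-saving idea concrete, here is a minimal sketch of how a GI loop might skip patches that a categorizer labels as NoOp before it runs any costly fitness evaluation. This is not PatchCat's implementation, which the abstract does not describe. The names `Patch`, `filter_noop`, and `toy_categorize`, and the label string `"NoOp"`, are all hypothetical. The toy heuristic (comment-only or whitespace-only changes) is only a stand-in for the paper's LLM-plus-clustering categorizer.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Patch:
    """Hypothetical container for one candidate edit."""
    patch_id: str
    diff: str  # unified-diff-style text: lines starting with '+' or '-'


def toy_categorize(patch: Patch) -> str:
    """Toy stand-in for a patch categorizer (NOT the paper's method).

    Labels a patch "NoOp" when every changed line is blank or a '#' comment.
    """
    changed = [
        line[1:].strip()
        for line in patch.diff.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
    meaningful = [c for c in changed if c and not c.startswith("#")]
    return "Other" if meaningful else "NoOp"


def filter_noop(
    patches: Iterable[Patch],
    categorize: Callable[[Patch], str],
    noop_label: str = "NoOp",
) -> Tuple[List[Patch], List[Patch]]:
    """Split patches into (to_evaluate, skipped) based on predicted category."""
    to_evaluate: List[Patch] = []
    skipped: List[Patch] = []
    for patch in patches:
        if categorize(patch) == noop_label:
            skipped.append(patch)  # no need to compile or run tests
        else:
            to_evaluate.append(patch)
    return to_evaluate, skipped


patches = [
    Patch("a", "+# just a comment"),
    Patch("b", "-x = slow()\n+x = fast()"),
    Patch("c", "+   "),
]
to_evaluate, skipped = filter_noop(patches, toy_categorize)
print([p.patch_id for p in to_evaluate])  # ['b']
print([p.patch_id for p in skipped])      # ['a', 'c']
```

Because `filter_noop` takes the categorizer as a parameter, the toy heuristic could be replaced by any classifier, including one backed by a small local LLM, without changing the surrounding search loop.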
