Simple Baselines are Competitive with Code Evolution
Yonatan Gideoni, Sebastian Risi, Yarin Gal

TL;DR
This paper demonstrates that simple baseline methods can match or outperform complex code evolution techniques across three domains (mathematical bounds, agentic scaffolds, and machine learning competitions), highlighting the importance of search space design and evaluation practices.
Contribution
The paper shows that simple baselines are competitive with sophisticated code evolution methods, identifies the key factors that affect search performance, and proposes improved evaluation strategies.
Findings
Simple baselines match or outperform complex methods in all tested domains.
Search space design and domain knowledge are primary factors influencing performance.
When designing agentic scaffolds, high variance combined with small datasets leads to suboptimal scaffolds being selected, which favors hand-designed solutions.
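The last finding can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names, the candidate count, the accuracy of 0.60, and the dataset sizes are all hypothetical choices made for illustration. It assumes every candidate scaffold has the same true accuracy and picks the one with the best observed score, so any apparent improvement is purely selection noise.

```python
import random


def evaluate(true_acc, n_examples, rng):
    """Observed accuracy of one scaffold over n_examples independent pass/fail trials."""
    correct = sum(rng.random() < true_acc for _ in range(n_examples))
    return correct / n_examples


def selection_gap(true_accs, n_examples, n_trials, seed=0):
    """Mean (observed - true) accuracy of the scaffold chosen by best observed score."""
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(n_trials):
        observed = [evaluate(p, n_examples, rng) for p in true_accs]
        best = max(range(len(true_accs)), key=lambda i: observed[i])
        total_gap += observed[best] - true_accs[best]
    return total_gap / n_trials


# Hypothetical setup: 20 scaffolds that are all equally good (true accuracy 0.60).
candidates = [0.60] * 20
small_gap = selection_gap(candidates, n_examples=30, n_trials=200)
large_gap = selection_gap(candidates, n_examples=1000, n_trials=200)
print(f"overestimate with 30 examples:   {small_gap:.3f}")
print(f"overestimate with 1000 examples: {large_gap:.3f}")
```

Under these assumptions the selected scaffold's observed score overstates its true accuracy, and the overstatement is much larger on the small dataset. A search that ranks candidates on such noisy scores can therefore report gains that do not hold up on fresh data.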
Abstract
Code evolution is a family of techniques that rely on large language models to search through possible computer programs by evolving or mutating existing code. Many proposed code evolution pipelines show impressive performance but are often not compared to simpler baselines. We test how well two simple baselines do over three domains: finding better mathematical bounds, designing agentic scaffolds, and machine learning competitions. We find that simple baselines match or exceed much more sophisticated methods in all three. By analyzing these results we find various shortcomings in how code evolution is both developed and used. For the mathematical bounds, a problem's search space and domain knowledge in the prompt are chiefly what dictate a search's performance ceiling and efficiency, with the code evolution pipeline being secondary. Thus, the primary challenge in finding improved…
Taxonomy
Topics: Evolutionary Algorithms and Applications · Software Engineering Research · Software Testing and Debugging Techniques
