The Inefficiency of Genetic Programming for Symbolic Regression

Gabriel Kronberger; Fabricio Olivetti de Franca; Harry Desmond; Deaglan J. Bartlett; Lukas Kammerer

arXiv:2404.17292·cs.NE·March 30, 2026·1 cites

TL;DR

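This paper examines how efficiently genetic programming searches for symbolic regression models. It works in limited settings where all solutions can be enumerated exhaustively, compares genetic programming with random search over semantically unique expressions, and finds that genetic programming explores the solution space inefficiently.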

Contribution

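The paper introduces improved algorithms for equality saturation and uses them to improve the Exhaustive Symbolic Regression algorithm. The result is the set of semantically unique expression structures, which is orders of magnitude smaller than the full search space and makes a detailed analysis of genetic programming's search behavior possible.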
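To make the idea of "semantically unique expressions" concrete, here is a toy sketch. It is not the paper's equality saturation method. It swaps in a much simpler technique: expressions built from `x`, `1`, `+` and `*` are polynomials, so evaluating them exactly at enough points is sufficient to decide whether two of them are the same function. The grammar, the function names, and the choice of evaluation points are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy grammar: two leaves and two binary operators.
LEAVES = ("x", "1")
OPS = ("+", "*")


def enumerate_exprs(max_leaves):
    """Return every expression tree with 1..max_leaves leaves as nested tuples."""
    by_size = {1: list(LEAVES)}
    for n in range(2, max_leaves + 1):
        by_size[n] = [
            (op, left, right)
            for k in range(1, n)
            for left, right in product(by_size[k], by_size[n - k])
            for op in OPS
        ]
    return [e for n in range(1, max_leaves + 1) for e in by_size[n]]


def evaluate(expr, x):
    """Evaluate an expression tree exactly at the rational point x."""
    if expr == "x":
        return x
    if expr == "1":
        return Fraction(1)
    op, left, right = expr
    a, b = evaluate(left, x), evaluate(right, x)
    return a + b if op == "+" else a * b


def count_unique(max_leaves):
    """Return (number of syntactic expressions, number of distinct functions).

    A tree with n leaves is a polynomial of degree at most n, so agreement
    on n + 1 distinct points implies the two polynomials are identical.
    """
    points = [Fraction(i) for i in range(2, 2 + max_leaves + 1)]
    exprs = enumerate_exprs(max_leaves)
    fingerprints = {tuple(evaluate(e, p) for p in points) for e in exprs}
    return len(exprs), len(fingerprints)


if __name__ == "__main__":
    for size in range(1, 6):
        total, unique = count_unique(size)
        print(f"up to {size} leaves: {total} expressions, {unique} unique")
```

Even in this tiny grammar the number of distinct functions grows far more slowly than the number of syntactic trees, because forms such as `x + 1` and `1 + x`, or `x * 1` and `x`, collapse into one entry. The fingerprinting trick only works for polynomial grammars like this one; a general method has to handle richer operator sets through rewriting.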

Findings

01. Genetic programming explores only a small fraction of the unique expressions.

02. It repeatedly evaluates expressions congruent to ones it has already visited.

03. The experiments use two real-world datasets: Nikuradse and galaxy dynamics.

Abstract

We analyse the search behaviour of genetic programming for symbolic regression in practically relevant but limited settings, allowing exhaustive enumeration of all solutions. This enables us to quantify the success probability of finding the best possible expressions, and to compare the search efficiency of genetic programming to random search in the space of semantically unique expressions. This analysis is made possible by improved algorithms for equality saturation, which we use to improve the Exhaustive Symbolic Regression algorithm; this produces the set of semantically unique expression structures, orders of magnitude smaller than the full symbolic regression search space. We compare the efficiency of random search in the set of unique expressions and genetic programming. For our experiments we use two real-world datasets where symbolic regression has been used to produce…
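One way to picture the random search baseline is as sampling without replacement from the enumerated set of unique expressions. The sketch below computes the probability that such a search finds at least one of the best expressions within a given evaluation budget. This is a simplified model assumed here for illustration and is not taken from the paper; the names `n_unique`, `n_best`, and `budget` are hypothetical.

```python
from math import comb


def random_search_success(n_unique, n_best, budget):
    """Probability of drawing at least one 'best' expression.

    Assumes `budget` distinct expressions are drawn uniformly without
    replacement from `n_unique` unique expressions, of which `n_best`
    count as best. This is the complement of the hypergeometric
    probability of drawing none of them.
    """
    if not (0 <= n_best <= n_unique and 0 <= budget <= n_unique):
        raise ValueError("need 0 <= n_best <= n_unique and 0 <= budget <= n_unique")
    miss = comb(n_unique - n_best, budget) / comb(n_unique, budget)
    return 1.0 - miss


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the shape of the curve.
    for budget in (10, 100, 1000):
        p = random_search_success(n_unique=10_000, n_best=5, budget=budget)
        print(f"budget={budget}: success probability {p:.4f}")
```

Under this model, a success rate measured for genetic programming at a given number of evaluations can be set against the value the function returns for the same budget.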
