A Greedy Search Tree Heuristic for Symbolic Regression
Fabricio Olivetti de Franca

TL;DR
This paper introduces a new data structure and heuristic for symbolic regression that constrains the search space, enabling more efficient and interpretable model discovery compared to traditional genetic programming methods.
Contribution
The paper proposes the Interaction-Transformation data structure and the SymTree heuristic, which together improve search efficiency and model interpretability in symbolic regression.
Findings
SymTree can find the optimal solution within the IT search space.
SymTree achieves competitive results even outside the IT search space.
The method balances accuracy and simplicity effectively.
Abstract
Symbolic Regression tries to find a mathematical expression that describes the relationship of a set of explanatory variables to a measured variable. The main objective is to find a model that minimizes the error and, optionally, that also minimizes the expression size. A smaller expression can be seen as an interpretable model, which is considered a reliable decision model. This is often performed with Genetic Programming, which represents its solutions as expression trees. The shortcoming of this algorithm lies in this representation, which defines a rugged search space and contains expressions of any size and difficulty. These pose a challenge to finding the optimal solution under computational constraints. This paper introduces a new data structure, called Interaction-Transformation (IT), that constrains the search space in order to exclude a region of larger and more complicated expressions.…
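The abstract above is cut off before it defines the IT structure. As a rough illustration only, the sketch below assumes that an IT expression is a weighted sum of terms, where each term applies a one-argument transformation function to an interaction (a product of the input variables raised to integer exponents). The names `it_term` and `it_eval`, the tuple layout, and the example weights are hypothetical and are not taken from the paper's code.

```python
import math

def it_term(x, transform, exponents):
    """Evaluate one assumed IT term: transform(prod_j x_j ** k_j)."""
    # Interaction: product of each variable raised to its integer exponent.
    interaction = math.prod(xi ** k for xi, k in zip(x, exponents))
    # Transformation: a single-argument function applied to the interaction.
    return transform(interaction)

def it_eval(x, weights, terms):
    """Evaluate an assumed IT expression: sum_i w_i * term_i(x)."""
    return sum(w * it_term(x, t, k) for w, (t, k) in zip(weights, terms))

def identity(z):
    return z

# Hypothetical expression: 3 * (x1 * x2) + 2 * sin(x1 ** 2)
terms = [(identity, (1, 1)), (math.sin, (2, 0))]
weights = [3.0, 2.0]

value = it_eval((2.0, 0.5), weights, terms)
```

Under this assumption the expression is linear in its weights, so for a fixed list of terms the weights could be fitted by ordinary linear regression, and a search procedure would only need to decide which (transformation, exponents) pairs to include.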
