Why Architecture Choice Matters in Symbolic Regression

Chakshu Gupta

arXiv:2604.23256 · cs.NE · April 28, 2026

TL;DR

This paper investigates how the choice of tree structure in gradient-trained symbolic regression affects which target formulas are recovered. It shows that the optimization landscape, not expressiveness alone, determines whether recovery succeeds.

Contribution

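The paper demonstrates that, among structures sharing the same operator and target language, the choice of structure strongly influences which targets are recovered. Expressiveness guarantees only that a solution exists in the search space; the optimization landscape determines whether gradient descent reaches it.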

Findings

01. Different tree structures lead to vastly different recovery success rates.

02. More expressive structures do not always outperform restricted ones in practice.

03. The optimization landscape, not just expressiveness, determines recovery success.

Abstract

Symbolic regression discovers mathematical formulas from data. Some methods fix a tree of operators, assign learnable weights, and train by gradient descent. The tree's structure, which determines what operators and variables appear at each position, is chosen once and applied to every target. This paper tests whether that choice affects which targets are actually recovered. Three structures are compared, all sharing the same operator and target language but differing in how variables enter the tree; one is strictly more expressive. Across over 12,700 training runs, one structure recovers a target at 100% while another scores 0%, and the ranking reverses on a different target. Expressiveness guarantees that a solution exists in the search space, but not that gradient descent finds it: the most expressive structure fails on targets that a restricted alternative solves reliably. Switching…
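
The general setup the abstract describes (a fixed tree of operators, learnable weights, training by gradient descent) can be illustrated with a minimal sketch. This is not the paper's implementation: the operator set `OPS`, the two-node tree shape, the target `x*y + y`, the sample grid, and the use of finite differences in place of automatic differentiation are all assumptions made for illustration.

```python
import math

# Hypothetical operator set; the paper's actual operator language is not given here.
OPS = [
    lambda a, b: a + b,
    lambda a, b: a * b,
    lambda a, b: a - b,
]
OP_NAMES = ["+", "*", "-"]

def softmax(ws):
    """Numerically stable softmax over a list of logits."""
    m = max(ws)
    es = [math.exp(w - m) for w in ws]
    s = sum(es)
    return [e / s for e in es]

def node(ws, a, b):
    """Soft operator node: a softmax-weighted blend of all candidate operators."""
    return sum(p * op(a, b) for p, op in zip(softmax(ws), OPS))

def tree(params, x, y):
    """Fixed tree root(left(x, y), y); the structure is chosen once, only weights are learned."""
    left = node(params[0:3], x, y)
    return node(params[3:6], left, y)

def loss(params, data):
    """Mean squared error between the soft tree and the target values."""
    return sum((tree(params, x, y) - t) ** 2 for x, y, t in data) / len(data)

def grad(params, data, eps=1e-6):
    """Central finite differences, standing in for automatic differentiation."""
    g = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + eps] + params[i + 1:]
        dn = params[:i] + [params[i] - eps] + params[i + 1:]
        g.append((loss(up, data) - loss(dn, data)) / (2 * eps))
    return g

def train(params, data, lr=0.05, steps=500):
    """Plain gradient descent on the operator logits."""
    for _ in range(steps):
        g = grad(params, data)
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params

def decode(params):
    """Read off a discrete formula: the argmax operator at [left node, root node]."""
    return [OP_NAMES[max(range(3), key=lambda k: params[n * 3 + k])] for n in range(2)]

# Assumed target (x * y) + y, sampled on a small grid.
GRID = [-1.0, -0.5, 0.5, 1.0]
DATA = [(x, y, x * y + y) for x in GRID for y in GRID]

if __name__ == "__main__":
    trained = train([0.0] * 6, DATA)
    print("decoded operators [left, root]:", decode(trained))
    print("final loss:", loss(trained, DATA))
```

In this toy tree the target is representable (choose `*` at the left node and `+` at the root), but nothing in the training loop guarantees that gradient descent from a uniform initialization decodes to that formula. That gap between "a solution exists" and "training finds it" is the distinction the abstract draws between expressiveness and the optimization landscape.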
