Data-Informed Model Complexity Metric for Optimizing Symbolic Regression Models
Nathan Haut, Zenas Huang, and Adam Alessio

TL;DR
This paper proposes a data-informed complexity metric for symbolic regression models that uses Hessian rank and intrinsic dimensionality estimators to select models with optimal complexity, improving generalization.
Contribution
It introduces a novel complexity estimation method based on Hessian rank and intrinsic dimensionality, aligning model complexity with data complexity for better model selection.
Findings
Averaging the Hessian rank of the model output over a few points (N=3) gives an efficient and accurate estimate of model complexity.
Aligning model complexity with data intrinsic dimensionality improves generalization.
The approach reduces bias compared to traditional parsimony-based methods.
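The Hessian-rank estimate described in the findings can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' StackGP implementation: the finite-difference Hessian, the step size `h`, the rank tolerance `tol`, and the toy `model` function are all hypothetical choices made for this example. Only the idea of averaging the Hessian rank over N=3 points comes from the paper.

```python
import numpy as np


def hessian(f, x, h=1e-3):
    """Central finite-difference Hessian of a scalar function f at point x.

    The step size h is an assumption chosen for this sketch.
    """
    x = np.asarray(x, dtype=float)
    d = x.size
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            e_i = np.zeros(d)
            e_j = np.zeros(d)
            e_i[i] = h
            e_j[j] = h
            H[i, j] = (
                f(x + e_i + e_j) - f(x + e_i - e_j)
                - f(x - e_i + e_j) + f(x - e_i - e_j)
            ) / (4.0 * h * h)
    return H


def mean_hessian_rank(f, points, tol=1e-4):
    """Average the numerical rank of the Hessian of f over the given points.

    Singular values below tol (an assumed threshold) are treated as zero.
    """
    ranks = [np.linalg.matrix_rank(hessian(f, p), tol=tol) for p in points]
    return float(np.mean(ranks))


def model(x):
    """Hypothetical symbolic model over four inputs: x0 * x1 + x2."""
    return x[0] * x[1] + x[2]


rng = np.random.default_rng(0)
sample_points = rng.uniform(1.0, 2.0, size=(3, 4))  # N = 3 evaluation points
complexity = mean_hessian_rank(model, sample_points)
print(complexity)
```

In this toy model only x0 and x1 interact, so the Hessian has two nonzero entries and rank 2 at every point, even though the model takes four inputs. Inputs that enter linearly or not at all contribute nothing to the rank.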
Abstract
Choosing models that generalize beyond the training data from a well-fitted evolved population is difficult. We introduce a pragmatic method to estimate model complexity using Hessian rank for post-processing selection. Complexity is approximated by averaging the rank of the model output Hessian across a few points (N=3), offering efficient and accurate rank estimates. This method aligns model selection with input data complexity, calculated using intrinsic dimensionality (ID) estimators. Using the StackGP system, we develop symbolic regression models for the Penn Machine Learning Benchmark and employ twelve scikit-dimension library methods to estimate ID, aligning model expressiveness with dataset ID. Our data-informed complexity metric finds the ideal complexity window, balancing model expressiveness and accuracy, enhancing generalizability without the bias common in methods reliant on user-defined…
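The alignment of model complexity with dataset ID might look roughly like the sketch below. Several parts are assumptions, not details from the paper: the PCA explained-variance estimator is a simple stand-in for the twelve scikit-dimension estimators, the window rule (keep models whose complexity lies within `half_width` of the ID) is a hypothetical reading of "complexity window", and the candidate names, complexities, and errors are made-up toy values.

```python
import numpy as np


def pca_intrinsic_dim(X, var_threshold=0.95):
    """Stand-in ID estimate: number of principal components needed to
    explain var_threshold of the variance (threshold is an assumption)."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    var_ratio = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(var_ratio, var_threshold) + 1)


def select_in_window(candidates, dataset_id, half_width=0.5):
    """Hypothetical window rule: keep candidates whose complexity is within
    half_width of the dataset ID, then return the one with the lowest error."""
    in_window = [c for c in candidates if abs(c[1] - dataset_id) <= half_width]
    if not in_window:
        return None
    return min(in_window, key=lambda c: c[2])


# Toy dataset: 5 observed features driven by 2 latent variables plus tiny noise.
rng = np.random.default_rng(0)
Z = rng.uniform(-1.0, 1.0, size=(200, 2))
A = np.array([[1.0, 0.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0, 0.0]])
X = Z @ A + 1e-3 * rng.normal(size=(200, 5))

dataset_id = pca_intrinsic_dim(X)

# Toy candidates: (name, mean Hessian rank, validation error), all invented.
candidates = [
    ("model_a", 1.0, 0.20),
    ("model_b", 2.0, 0.05),
    ("model_c", 4.0, 0.01),
]
chosen = select_in_window(candidates, dataset_id)
print(dataset_id, chosen)
```

Under this rule the most accurate candidate is not selected automatically: a model whose complexity sits well above the estimated ID falls outside the window despite its lower error.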
Taxonomy
Topics: Neural Networks and Applications · Evolutionary Algorithms and Applications
Methods: Lib
