Active Learning Improves Performance on Symbolic Regression Tasks in StackGP
Nathan Haut, Wolfgang Banzhaf, Bill Punch

TL;DR
This paper presents an active learning approach for symbolic regression with StackGP, which iteratively adds data points to improve model accuracy, successfully rediscovering many equations with fewer data points.
Contribution
Introduces an active learning method for symbolic regression that enhances data efficiency and model discovery using StackGP without domain knowledge.
Findings
Successfully rediscovered 72 out of 100 Feynman equations
Achieved high accuracy with fewer data points
Demonstrated effectiveness of active learning in symbolic regression
Abstract
In this paper, we introduce an active learning method for symbolic regression using StackGP. The approach begins with a small number of data points for StackGP to model. To improve the model, the system incrementally adds a data point chosen to maximize prediction uncertainty, as measured by the model ensemble. Symbolic regression is then re-run with the larger data set. This cycle continues until the system satisfies a termination criterion. We use the Feynman AI benchmark set of equations to examine the ability of our method to find appropriate models using fewer data points. The approach was found to successfully rediscover 72 of the 100 Feynman equations using as few data points as possible, and without the use of domain expertise or data translation.
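The cycle described in the abstract (fit an ensemble, add the point where the ensemble disagrees most, refit) can be illustrated with a minimal sketch. This is not StackGP and does not use its API: a small ensemble of polynomial fits stands in for the symbolic-regression models, the standard deviation across ensemble predictions stands in for the uncertainty measure, and the termination criterion (labelling budget spent or ensemble disagreement below a tolerance) is an assumption, since the abstract does not specify it. The names `target`, `fit_ensemble`, `disagreement`, and `active_learning` are hypothetical.

```python
import numpy as np

def target(x):
    # Hypothetical "unknown" equation, used only to label points in this demo.
    return 0.5 * x**3 - x

def fit_ensemble(x, y, max_degree=4):
    # Stand-in for a symbolic-regression ensemble: polynomials of several degrees.
    top = min(max_degree, len(x) - 1)  # avoid under-determined fits
    return [np.polyfit(x, y, d) for d in range(1, top + 1)]

def disagreement(models, xs):
    # Prediction uncertainty = standard deviation across ensemble predictions.
    preds = np.array([np.polyval(m, xs) for m in models])
    return preds.std(axis=0)

def active_learning(pool, n_start=3, max_points=10, tol=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    # Start from a small random set of labelled points (indices into the pool).
    chosen = [int(i) for i in rng.choice(len(pool), size=n_start, replace=False)]
    while True:
        x, y = pool[chosen], target(pool[chosen])
        models = fit_ensemble(x, y)
        score = disagreement(models, pool)
        score[chosen] = -np.inf  # never re-select an already labelled point
        # Assumed termination criterion: budget spent or the ensemble agrees.
        if len(chosen) >= max_points or score.max() < tol:
            break
        # Add the single candidate with maximal ensemble disagreement.
        chosen.append(int(score.argmax()))
    # Return the ensemble member with the lowest training RMSE.
    errors = [np.sqrt(np.mean((np.polyval(m, x) - y) ** 2)) for m in models]
    return chosen, models[int(np.argmin(errors))]

pool = np.linspace(-2.0, 2.0, 41)
chosen, best = active_learning(pool)
print(f"labelled {len(chosen)} of {len(pool)} candidate points")
```

In this toy setting the target is a cubic, so the polynomial ensemble can represent it exactly once enough points are labelled. A real symbolic-regression run would replace `fit_ensemble` with the evolved model population and would likely need a different stopping rule.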
Taxonomy
Topics: Machine Learning and Data Classification · Model Reduction and Neural Networks · Machine Learning and Algorithms
