Correlation versus RMSE Loss Functions in Symbolic Regression Tasks
Nathan Haut, Wolfgang Banzhaf, Bill Punch

TL;DR
This paper investigates the effectiveness of using correlation as a fitness function in symbolic regression, demonstrating it outperforms RMSE by requiring fewer generations and data points for accurate solutions.
Contribution
It introduces the use of correlation with an alignment step as a fitness function in symbolic regression, showing significant performance improvements over traditional RMSE.
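The idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the correlation fitness is 1 - r^2 (r being the Pearson correlation) and that the alignment step is an ordinary least-squares fit of a scale and an offset. The function names `rmse_fitness`, `correlation_fitness`, and `align` are hypothetical.

```python
import numpy as np

def rmse_fitness(y_true, y_pred):
    """Root-mean-square error; lower is better."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def correlation_fitness(y_true, y_pred):
    """Assumed form: 1 - r^2, with r the Pearson correlation; lower is better."""
    # A constant vector has no defined correlation; treat it as the worst case.
    if y_pred.max() == y_pred.min() or y_true.max() == y_true.min():
        return 1.0
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return float(1.0 - r ** 2)

def align(y_true, y_pred):
    """Assumed alignment: least-squares a, b such that a * y_pred + b fits y_true."""
    a, b = np.polyfit(y_pred, y_true, deg=1)
    return float(a), float(b)

# Toy target: y = 3 * x^2 + 2
x = np.linspace(1.0, 5.0, 20)
y_true = 3.0 * x ** 2 + 2.0

# A candidate with the right shape but the wrong scale and offset.
candidate = x ** 2

# RMSE penalizes the candidate heavily; correlation sees a near-perfect match.
rmse_before = rmse_fitness(y_true, candidate)
corr_before = correlation_fitness(y_true, candidate)

# The alignment step recovers the scale and offset after the search.
a, b = align(y_true, candidate)
rmse_after = rmse_fitness(y_true, a * candidate + b)
```

Under these assumptions the search only has to find the functional shape, since the linear scale and offset are recovered afterwards by the alignment step. This is one plausible reading of why fewer generations would be needed than with RMSE.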
Findings
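Correlation fitness finds solutions in fewer generations than RMSE.
Fewer training data points are needed to discover the correct equations when correlation is used as the fitness function.
Performance gains are validated on the Feynman Symbolic Regression Benchmark and several other GP benchmark problems.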
Abstract
The use of correlation as a fitness function is explored in symbolic regression tasks, and its performance is compared against the typical RMSE fitness function. Using correlation with an alignment step to conclude the evolution led to significant performance gains over RMSE as a fitness function. With correlation as the fitness function, solutions were found in fewer generations than with RMSE, and fewer data points were needed in the training set to discover the correct equations. The Feynman Symbolic Regression Benchmark, as well as several other old and recent GP benchmark problems, was used to evaluate performance.
Taxonomy
Topics: Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research
