Analysing Symbolic Regression Benchmarks under a Meta-Learning Approach

Luiz Otavio Vilas Boas Oliveira; Joao Francisco Barreto da Silva Martins; Luis Fernando Miranda; Gisele Lobo Pappa

arXiv:1805.10365·cs.NE·May 29, 2018

TL;DR

This paper proposes a meta-learning approach to evaluate and improve symbolic regression benchmarks by analyzing dataset meta-features and their correlation with GP performance, revealing current benchmarks' limitations.

Contribution

It introduces a quantitative method using meta-features to assess and guide the selection of diverse datasets for symbolic regression benchmarking.

Findings

1. Current benchmarks are concentrated in a small region of the meta-feature space.
2. Number of instances and output skewness are key predictors of GP error.
3. Meta-features can guide the construction of more effective testbeds.
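One rough way to see what "concentrated in a small region" could mean in practice is to place each dataset as a point in meta-feature space and count how much of a coarse grid is actually occupied. The sketch below is an illustration only: the grid-occupancy measure, the function names, and the example points are assumptions made here, not the procedure used in the paper.

```python
def min_max_normalise(points):
    """Rescale every meta-feature (column) to the [0, 1] range."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    return [
        tuple(
            (p[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        )
        for p in points
    ]


def occupied_cell_fraction(points, bins=4):
    """Fraction of grid cells that contain at least one dataset.

    Hypothetical coverage measure: values near 1 mean the datasets are
    spread over the space, values near 0 mean they cluster together.
    """
    scaled = min_max_normalise(points)
    # Map each coordinate to a bin index; the maximum value falls in the last bin.
    cells = {tuple(min(int(c * bins), bins - 1) for c in p) for p in scaled}
    return len(cells) / bins ** len(points[0])


# Invented points: (number of instances, output skewness) per dataset.
clustered = [(10, 0.0), (12, 0.1), (11, 0.05), (1000, 3.0)]
print(occupied_cell_fraction(clustered, bins=4))
```

A low fraction for a candidate testbed would suggest adding datasets whose meta-features fall in the empty cells, which is the spirit of the third finding.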

Abstract

The definition of a concise and effective testbed for Genetic Programming (GP) is a recurrent matter in the research community. This paper takes a new step in this direction, proposing a different approach to measure the quality of symbolic regression benchmarks quantitatively. The proposed approach is based on meta-learning and uses a set of dataset meta-features, such as the number of examples or output skewness, to describe the datasets. Our idea is to correlate these meta-features with the errors obtained by a GP method. These meta-features define a space of benchmarks that should, ideally, have datasets (points) covering different regions of the space. An initial analysis of 63 datasets showed that current benchmarks are concentrated in a small region of this benchmark space. We also found that the number of instances and output skewness are the most relevant meta-features to…
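As a minimal sketch of the idea in the abstract, the code below computes the two meta-features it names (number of instances and output skewness) and correlates each with a per-dataset GP error. Everything specific here is an assumption for illustration: the helper names, the use of Pearson correlation and population skewness, and the toy datasets and error values, which are invented and not taken from the paper.

```python
import math


def skewness(values):
    """Population skewness: third central moment divided by std cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def meta_features(targets):
    """Describe a dataset by two of the meta-features named in the abstract."""
    return {"n_instances": len(targets), "output_skewness": skewness(targets)}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Placeholder data, invented for illustration only (not from the paper):
# each entry maps a dataset name to (target values, GP test error).
toy_datasets = {
    "toy_a": ([1.0, 2.0, 3.0, 4.0, 5.0], 0.10),
    "toy_b": ([0.0, 0.0, 0.1, 0.2, 9.0], 0.45),
    "toy_c": ([2.0, 2.5, 3.0, 8.0], 0.30),
}

features = {name: meta_features(y) for name, (y, _) in toy_datasets.items()}
errors = [err for _, err in toy_datasets.values()]

for key in ("n_instances", "output_skewness"):
    column = [features[name][key] for name in toy_datasets]
    print(key, round(pearson(column, errors), 3))
```

With real benchmarks, `errors` would hold the error a GP method obtained on each dataset, and a rank-based correlation or a learned meta-model could replace `pearson` if the relationship is not linear.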
