(Exhaustive) Symbolic Regression and model selection by minimum description length
Harry Desmond

TL;DR
This paper introduces Exhaustive Symbolic Regression, which pairs an exhaustive search over candidate functions with model selection by minimum description length (MDL), and applies it to three open astrophysics problems, where it identifies functions superior to literature standards.
Contribution
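Proposes an exhaustive search combined with MDL-based model selection for symbolic regression, addressing two weaknesses of traditional algorithms: an unknown (and likely significant) probability of missing any given good function, and an ambiguous, poorly justified function-selection procedure.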
Findings
Identifies functions superior to literature standards in astrophysics problems
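Demonstrates the Exhaustive Symbolic Regression algorithm on three open astrophysics problems: the expansion history of the universe, the effective behaviour of gravity in galaxies, and the potential of the inflaton field
Provides a publicly available tool for scientific and general use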
Abstract
Symbolic regression is the machine learning method for learning functions from data. After a brief overview of the symbolic regression landscape, I will describe the two main challenges that traditional algorithms face: they have an unknown (and likely significant) probability of failing to find any given good function, and they suffer from ambiguity and poorly-justified assumptions in their function-selection procedure. To address these I propose an exhaustive search and model selection by the minimum description length principle, which allows accuracy and complexity to be directly traded off by measuring each in units of information. I showcase the resulting publicly available Exhaustive Symbolic Regression algorithm on three open problems in astrophysics: the expansion history of the universe, the effective behaviour of gravity in galaxies and the potential of the inflaton field. In…
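To make the accuracy-versus-complexity trade-off concrete, here is a minimal Python sketch of scoring candidate functions by a codelength measured in nats. It is a toy stand-in, not the paper's codelength formula: the structure cost (nodes times the log of the number of symbols), the BIC-style parameter cost, the synthetic data, the hand-set constants, and every name in the block are assumptions made for illustration.

```python
import math

def gaussian_nll(residuals, sigma):
    """Negative log-likelihood (nats) for independent Gaussian errors of known width sigma."""
    n = len(residuals)
    chi2 = sum((r / sigma) ** 2 for r in residuals)
    return 0.5 * chi2 + n * math.log(sigma * math.sqrt(2.0 * math.pi))

def description_length(residuals, sigma, n_nodes, n_symbols, n_params):
    """Toy codelength (nats): misfit + structure cost + parameter cost."""
    accuracy = gaussian_nll(residuals, sigma)
    # Assumed structure cost: pick one of n_symbols at each node of the expression tree.
    structure = n_nodes * math.log(n_symbols)
    # Assumed BIC-style cost for each fitted constant.
    parameters = 0.5 * n_params * math.log(len(residuals))
    return accuracy + structure + parameters

# Hypothetical data: y = 2x + 1 plus a fixed, small perturbation.
xs = list(range(10))
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]
SIGMA = 0.1
N_SYMBOLS = 4  # assumed basis set: x, constant, +, *

# name -> (function with hand-set constants, hand-counted tree nodes, number of constants)
candidates = {
    "c": (lambda x: 10.0, 1, 1),
    "a*x + b": (lambda x: 2.0 * x + 1.0, 5, 2),
    "a*x + b + c*x*x": (lambda x: 2.0 * x + 1.0 + 0.001 * x * x, 11, 3),
}

scores = {}
for name, (f, n_nodes, n_params) in candidates.items():
    residuals = [y - f(x) for x, y in zip(xs, ys)]
    scores[name] = description_length(residuals, SIGMA, n_nodes, N_SYMBOLS, n_params)

# The preferred function is the one with the shortest total codelength.
best = min(scores, key=scores.get)
print(best, {name: round(score, 2) for name, score in scores.items()})
```

In this toy setup the linear form has the shortest codelength: the constant pays heavily in misfit, while the quadratic's extra nodes and constant cost more than its slight change in fit can recover. An exhaustive search would apply the same comparison to every function up to a chosen complexity instead of to a hand-picked list.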
Taxonomy
Topics: Neural Networks and Applications
