Discovering Mathematical Formulas from Data via GPT-guided Monte Carlo Tree Search
Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Meilan Hao, Shu Wei, Yusong Deng

TL;DR
This paper introduces SR-GPT, a novel symbolic regression algorithm that combines Monte Carlo Tree Search with a Generative Pre-Trained Transformer, significantly improving search efficiency and accuracy in recovering mathematical formulas from data.
Contribution
The paper presents a new method integrating GPT with MCTS for symbolic regression, enhancing search efficiency and expression recovery over previous approaches.
Findings
SR-GPT outperforms existing algorithms in accuracy.
It effectively recovers expressions from noisy data.
The method demonstrates robustness across diverse datasets.
Abstract
Finding a concise and interpretable mathematical formula that accurately describes the relationship between each variable and the predicted value in the data is a crucial task in scientific research, as well as a significant challenge in artificial intelligence. This problem, known as symbolic regression, is NP-hard. Last year, a symbolic regression method based on Monte Carlo Tree Search (MCTS) was proposed, achieving state-of-the-art results on a diverse range of datasets. Although this algorithm recovers target expressions considerably better than previous methods, the lack of guidance during the MCTS process severely hampers its search efficiency. Recently, some algorithms have added a pre-trained policy network to guide the MCTS search, but the pre-trained policy network generalizes poorly. To optimize…
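To make the idea of a policy network guiding MCTS more concrete, the sketch below shows one common way a learned prior can steer node selection: a PUCT-style score of the kind popularized by AlphaZero. The excerpt above does not say which selection rule SR-GPT uses, so the formula, the constant `c_puct`, the function names, and the example statistics are all assumptions made for illustration, not the authors' method.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.0):
    """PUCT-style score (assumed rule): mean value plus a prior-weighted exploration bonus."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)


def select_child(children, c_puct=1.0):
    """Return the symbol whose child node has the highest score.

    children: dict mapping symbol -> (mean_value, prior, visit_count)
    """
    # Approximate the parent's visit count by the sum of its children's visits.
    parent_visits = sum(visits for _, _, visits in children.values())
    return max(
        children,
        key=lambda s: puct_score(
            children[s][0], children[s][1], parent_visits, children[s][2], c_puct
        ),
    )


# Hypothetical statistics for three candidate symbols at one tree node.
children = {
    "+": (0.20, 0.10, 5),
    "sin": (0.10, 0.70, 1),
    "x": (0.30, 0.20, 4),
}

print(select_child(children))             # prior-guided choice
print(select_child(children, c_puct=0.0)) # no guidance: highest mean value only
```

With `c_puct=0.0` the prior is ignored and selection falls back to the highest mean value, which illustrates the unguided search the abstract describes as inefficient; a nonzero `c_puct` lets a confident prior pull the search toward a rarely visited symbol.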
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Machine Learning and Data Classification · Statistics Education and Methodologies
Methods: Attention Is All You Need · Cosine Annealing · Label Smoothing · Absolute Position Encodings · Linear Warmup With Cosine Annealing · Linear Layer · Dropout · Multi-Head Attention · Residual Connection
