DrSR: LLM based Scientific Equation Discovery with Dual Reasoning from Data and Experience
Runxiang Wang, Boxiao Wang, Kai Li, Yifan Zhang, Jian Cheng

TL;DR
DrSR enhances scientific equation discovery by combining data analysis and reflective reasoning in LLMs, leading to more accurate and efficient symbolic regression across various scientific domains.
Contribution
This paper introduces DrSR, a dual reasoning framework that integrates data understanding with reflection to improve LLM-based symbolic regression.
Findings
DrSR outperforms existing methods in valid equation rate.
It improves accuracy, generalization, and search efficiency.
It is effective across multiple scientific disciplines.
Abstract
Symbolic regression is a fundamental tool for discovering interpretable mathematical expressions from data, with broad applications across scientific and engineering domains. Recently, large language models (LLMs) have demonstrated strong performance in this task, leveraging embedded scientific priors and reasoning capabilities to surpass traditional methods. However, existing LLM-based approaches, such as LLM-SR, often over-rely on internal priors, lacking explicit data understanding and systematic reflection during equation generation. To address these limitations, we propose DrSR (Dual Reasoning Symbolic Regression), a framework that combines data-driven insight with reflective learning to enhance both robustness and discovery capability. Specifically, DrSR guides LLMs to analyze structural relationships (e.g., monotonicity, nonlinearity, and correlation) within the data to generate…
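The structural relationships named in the abstract (monotonicity, nonlinearity, and correlation) can also be summarized numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: in DrSR the analysis is carried out by the LLM, whereas here plain NumPy statistics stand in for it. The function names `rank_values` and `structural_summary`, and the specific descriptors chosen, are hypothetical and used only for illustration.

```python
import numpy as np


def rank_values(values):
    """Return 0-based ranks of a 1-D array (ties are broken by position, not averaged)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def structural_summary(x, y):
    """Summarize how y depends on a single input x.

    Hypothetical helper: these descriptors are illustrative choices,
    not the quantities used in the paper. Assumes y is not constant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Correlation: linear (Pearson) and rank-based (Spearman-style).
    pearson = float(np.corrcoef(x, y)[0, 1])
    spearman = float(np.corrcoef(rank_values(x), rank_values(y))[0, 1])

    # Monotonicity: fraction of consecutive steps, ordered by x, where y increases.
    order = np.argsort(x)
    frac_increasing = float(np.mean(np.diff(y[order]) > 0))

    # Nonlinearity: share of the variance of y left unexplained by a straight-line fit.
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    nonlinearity = float(np.var(residual) / np.var(y))

    return {
        "pearson": pearson,
        "spearman": spearman,
        "frac_increasing": frac_increasing,
        "nonlinearity": nonlinearity,
    }


if __name__ == "__main__":
    x = np.linspace(0.1, 5.0, 50)
    print(structural_summary(x, x ** 3))
```

A summary like this could be turned into a short text description and placed in a prompt alongside the raw data. For a cubic relationship, the rank correlation and monotonicity fraction come out near 1 while the nonlinearity share stays clearly above 0, which points to a monotonic but non-linear dependence.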
Taxonomy
Topics: Machine Learning in Materials Science · Model Reduction and Neural Networks · Advanced Graph Neural Networks
Methods: Symbolic Regression · Large Language Models
