DrSR: LLM based Scientific Equation Discovery with Dual Reasoning from Data and Experience

Runxiang Wang; Boxiao Wang; Kai Li; Yifan Zhang; Jian Cheng

arXiv:2506.04282 · cs.LG · June 6, 2025

TL;DR

DrSR enhances scientific equation discovery by combining data analysis and reflective reasoning in LLMs, leading to more accurate and efficient symbolic regression across various scientific domains.

Contribution

This paper introduces DrSR, a dual reasoning framework that integrates data understanding with reflection to improve LLM-based symbolic regression.

Findings

01. DrSR achieves a higher valid equation rate than existing methods.

02. It improves accuracy, generalization, and search efficiency.

03. It is effective across multiple scientific disciplines.

Abstract

Symbolic regression is a fundamental tool for discovering interpretable mathematical expressions from data, with broad applications across scientific and engineering domains. Recently, large language models (LLMs) have demonstrated strong performance in this task, leveraging embedded scientific priors and reasoning capabilities to surpass traditional methods. However, existing LLM-based approaches, such as LLM-SR, often over-rely on internal priors, lacking explicit data understanding and systematic reflection during equation generation. To address these limitations, we propose DrSR (Dual Reasoning Symbolic Regression), a framework that combines data-driven insight with reflective learning to enhance both robustness and discovery capability. Specifically, DrSR guides LLMs to analyze structural relationships (e.g., monotonicity, nonlinearity, and correlation) within the data to generate…
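The structural relationships the abstract names (monotonicity, nonlinearity, and correlation) can be illustrated with simple numeric proxies. The sketch below is not the authors' implementation: in DrSR the analysis is carried out by the LLM, and the abstract does not say which statistics, if any, are computed. The function names (`pearson`, `ranks`, `spearman`, `summarize_relationship`), the 0.99 monotonicity threshold, and the toy exponential data are all assumptions made for illustration. It uses Pearson correlation as a linear-association measure, Spearman rank correlation as a monotonicity proxy, and the gap between them as a rough hint of nonlinearity.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def summarize_relationship(xs, ys):
    """Hypothetical summary of one input/output relationship."""
    r = pearson(xs, ys)
    rho = spearman(xs, ys)
    return {
        "pearson": r,
        "spearman": rho,
        # Illustrative threshold, not taken from the paper.
        "monotonic": abs(rho) > 0.99,
        # A clearly positive gap suggests a monotonic but nonlinear trend.
        "nonlinearity_gap": abs(rho) - abs(r),
    }


# Toy data: y = exp(x) is strictly increasing but far from linear.
xs = [0.5 * i for i in range(1, 21)]
ys = [math.exp(x) for x in xs]
summary = summarize_relationship(xs, ys)
print(summary)
```

A summary like this could, under the same assumptions, be turned into a short text description and placed in the prompt alongside the data, which is one plausible reading of "data-driven insight".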

Taxonomy

Topics: Machine Learning in Materials Science · Model Reduction and Neural Networks · Advanced Graph Neural Networks

Methods: Symbolic Regression · Large Language Models