From Data to Theory: Autonomous Large Language Model Agents for Materials Science

Samuel Onimpa Alfred; Veera Sundararaghavan

arXiv:2604.19789 · cs.AI · April 23, 2026

TL;DR

This paper introduces an autonomous LLM agent that develops and tests materials science theories end-to-end and uses them to make predictions. The agent recovers known equations and proposes new relationships, and the paper also highlights its current limitations.

Contribution

The work presents a novel autonomous LLM framework for materials theory development that can generate, test, and refine equations without human input.

Findings

01 Correctly identified the governing equations for well-established relationships such as the Hall-Petch equation and Paris law.

02 Demonstrated the ability to propose new predictive models beyond known theories.

03 Showed limitations in validation, including the potential to produce incorrect equations.

Abstract

We present an autonomous large language model (LLM) agent for end-to-end, data-driven materials theory development. The model can choose an equation form, generate and run its own code, and test how well the theory matches the data without human intervention. The framework combines step-by-step reasoning with expert-supplied tools, allowing the agent to adjust its approach as needed while keeping a clear record of its decisions. For well-established materials relationships such as the Hall-Petch equation and Paris law, the agent correctly identifies the governing equation and makes reliable predictions on new datasets. For more specialized relationships, such as Kuhn's equation for the HOMO-LUMO gap of conjugated molecules as a function of length, performance depends more strongly on the underlying model, with GPT-5 showing better recovery of the correct equation. Beyond known theories,…
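The fit-and-test step described in the abstract can be illustrated with a minimal sketch for the Hall-Petch relation, sigma_y = sigma_0 + k * d^(-1/2), which is linear in d^(-1/2) and so can be fitted by ordinary least squares. This is not the authors' agent or code. The function name fit_hall_petch, the R^2 score used as the goodness-of-fit check, and the synthetic data are all assumptions made for illustration.

```python
def fit_hall_petch(grain_sizes, yield_strengths):
    """Fit sigma_y = sigma_0 + k * d**-0.5 by ordinary least squares.

    Hypothetical helper for illustration; returns (sigma_0, k, r_squared).
    """
    # Transform grain size so the model becomes a straight line in x.
    x = [d ** -0.5 for d in grain_sizes]
    y = list(yield_strengths)
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Closed-form slope and intercept for simple linear regression.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    k = sxy / sxx
    sigma_0 = y_mean - k * x_mean

    # Coefficient of determination as a simple test of fit quality.
    ss_res = sum((yi - (sigma_0 + k * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return sigma_0, k, r_squared


# Synthetic, noise-free data (assumed values, NOT taken from the paper):
# grain size in micrometres, yield strength in MPa.
grain_sizes = [1.0, 4.0, 16.0, 64.0, 100.0]
yield_strengths = [70.0 + 500.0 * d ** -0.5 for d in grain_sizes]

sigma_0, k, r_squared = fit_hall_petch(grain_sizes, yield_strengths)
print(f"sigma_0 = {sigma_0:.3f} MPa, k = {k:.3f} MPa*um^0.5, R^2 = {r_squared:.6f}")
```

On real measurements the same check would be run against held-out data, and a low R^2 would be a signal to try a different equation form, which is the kind of adjustment the abstract attributes to the agent.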
