MatClaw: An Autonomous Code-First LLM Agent for End-to-End Materials Exploration

Chenmu Zhang; Boris I. Yakobson

arXiv:2604.02688·cond-mat.mtrl-sci·April 20, 2026

TL;DR

MatClaw is an autonomous LLM agent that writes and executes Python code to run materials science workflows on remote HPC clusters. A four-layer memory architecture keeps multi-day workflows coherent, and retrieval-augmented generation over domain source code raises per-step API-call accuracy to ~99%.

Contribution

The paper introduces MatClaw, a code-first LLM agent that composes any installed domain library directly in Python to orchestrate multi-code workflows on remote HPC clusters, without predefined tool functions.

Findings

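01 Retrieval-augmented generation over domain source code raises per-step API-call accuracy to ~99%.

02 Demonstrated end to end on ferroelectric CuInP2S6: machine-learning force field training via active learning, Curie temperature prediction, and heuristic parameter-space search.

03 Bridges knowledge gaps with literature self-learning and expert constraints.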

Abstract

Existing LLM agents for computational materials science are constrained by pipeline-bounded architectures tied to specific simulation codes and by dependence on manually written tool functions that grow with task scope. We present MatClaw, a code-first agent that writes and executes Python directly, composing any installed domain library to orchestrate multi-code workflows on remote HPC clusters without predefined tool functions. To sustain coherent execution across multi-day workflows, MatClaw uses a four-layer memory architecture that prevents progressive context loss, and retrieval-augmented generation over domain source code that raises per-step API-call accuracy to $\sim$99%. Three end-to-end demonstrations on ferroelectric CuInP$_2$S$_6$ (machine-learning force field training via active learning, Curie temperature prediction, and heuristic parameter-space search) reveal that the…
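The abstract does not describe how retrieval over domain source code is implemented. As a rough illustration only, the sketch below indexes the public callables of an installed Python module by signature and docstring, then ranks them against a natural-language query by simple token overlap. The helper names (`build_index`, `tokenize`, `retrieve`), the overlap scoring, and the use of the standard-library `json` module as a stand-in for a domain library are all assumptions made for this example, not MatClaw's actual design.

```python
import inspect
import json  # stand-in for a domain library; any importable module works
import re


def build_index(module):
    """Collect name, signature, and docstring for each public callable (hypothetical helper)."""
    index = []
    for name, obj in inspect.getmembers(module, callable):
        if name.startswith("_"):
            continue
        try:
            signature = str(inspect.signature(obj))
        except (TypeError, ValueError):
            signature = "(...)"  # some callables expose no introspectable signature
        index.append({
            "name": f"{module.__name__}.{name}",
            "signature": signature,
            "doc": inspect.getdoc(obj) or "",
        })
    return index


def tokenize(text):
    """Lowercase alphabetic tokens; a deliberately crude assumption for this sketch."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(index, query, k=3):
    """Return up to k entries ranked by token overlap with the query."""
    query_tokens = tokenize(query)
    scored = []
    for entry in index:
        score = len(query_tokens & tokenize(entry["name"] + " " + entry["doc"]))
        if score > 0:
            scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:k]]


if __name__ == "__main__":
    idx = build_index(json)
    for hit in retrieve(idx, "serialize obj to a JSON formatted str"):
        # The retrieved signatures could be placed in the prompt before code generation.
        print(hit["name"] + hit["signature"])
```

In a setup like this, the retrieved signatures would be shown to the model before it writes a call, so that generated code is grounded in the API as actually installed rather than as remembered. A real system would presumably use a stronger retriever than token overlap.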
