Quantifying the performance of machine learning models in materials discovery
Christopher K. H. Borg, Eric S. Muckley, Clara Nyby, James E. Saal, Logan Ward, Apurva Mehta, Bryce Meredig

TL;DR
This paper investigates the disconnect between traditional error metrics and actual success in guiding materials discovery using machine learning, proposing new metrics like Discovery Yield and Discovery Probability for better assessment.
Contribution
The study demonstrates that static error metrics do not reliably predict ML model effectiveness in materials discovery and introduces dynamic metrics for better evaluation.
Findings
Traditional error metrics do not correlate with discovery success.
Uncertainty estimates improve sequential learning performance.
New metrics like Discovery Yield and Discovery Probability better capture discovery success.
Abstract
The predictive capabilities of machine learning (ML) models used in materials discovery are typically measured using simple statistics such as the root-mean-square error (RMSE) or the coefficient of determination (R²) between ML-predicted materials property values and their known values. A tempting assumption is that models with low error should be effective at guiding materials discovery, and conversely, models with high error should give poor discovery performance. However, we observe that no clear connection exists between a "static" quantity averaged across an entire training set, such as RMSE, and an ML property model's ability to dynamically guide the iterative (and often extrapolative) discovery of novel materials with targeted properties. In this work, we simulate a sequential learning (SL)-guided materials discovery process and demonstrate a decoupling between traditional…
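For reference, the two "static" statistics named in the abstract can be computed as in the minimal sketch below. It uses the standard textbook definitions of RMSE and the coefficient of determination. The function names and the toy property values are illustrative assumptions, not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between known and predicted property values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data (hypothetical, for illustration only).
known = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(f"RMSE = {rmse(known, predicted):.4f}")
print(f"R^2  = {r_squared(known, predicted):.4f}")
```

Both quantities are averages over a fixed set of points, so neither says which candidate a model would pick next in an iterative search. That is the distinction the abstract draws.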
Taxonomy
Topics: Machine Learning in Materials Science · Geochemistry and Geologic Mapping · X-ray Diffraction in Crystallography
