Accelerating active learning materials discovery with FAIR data and workflows: a case study for alloy melting temperatures
Mohnish Harwani, Juan C. Verduzco, Brian H. Lee, Alejandro Strachan

TL;DR
This paper demonstrates how reusing FAIR data and workflows accelerates active learning in materials discovery, cutting simulation effort tenfold in an alloy melting temperature optimization.
Contribution
It shows that reusing a published, FAIR-compliant simulation workflow and its stored results significantly speeds up active learning in alloy discovery.
Findings
Reduced the number of simulations per alloy from 4 to 1.
Achieved a 10x speedup in alloy melting temperature optimization.
Reused prior data to reach convergence faster.
Abstract
Active learning (AL) is a powerful sequential optimization approach that has shown great promise in the discovery of new materials. However, a major challenge remains the acquisition of the initial data and the development of workflows to generate new data at each iteration. In this study, we demonstrate a significant speedup in an optimization task by reusing a published simulation workflow available for online simulations and its associated data repository, where the results of each workflow run are automatically stored. Both the workflow and its data follow FAIR (findable, accessible, interoperable, and reusable) principles using nanoHUB's infrastructure. The workflow employs molecular dynamics to calculate the melting temperature of multi-principal component alloys. We leveraged all prior data not only to develop an accurate machine learning model to start the sequential…
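As a rough illustration of the idea in the abstract, the sketch below warm-starts a sequential optimization loop with previously stored results instead of starting from scratch. Everything in it is an assumption made for illustration: `simulate_melting_temperature` is a hypothetical stand-in for the molecular dynamics workflow (a toy quadratic, not a physical model), the surrogate is a small kernel ridge regression, and the acquisition rule is a greedy pick of the highest predicted value. The summary above does not specify the paper's surrogate model or acquisition function, so this should not be read as the authors' implementation.

```python
import numpy as np


def simulate_melting_temperature(x):
    """Hypothetical stand-in for one expensive MD run (toy function, not physics)."""
    peak = np.array([0.6, 0.3, 0.1])  # assumed optimum composition, illustration only
    return 2000.0 - 1500.0 * float(np.sum((x - peak) ** 2))


def fit_predict(X_train, y_train, X_query, length_scale=0.2, ridge=1e-4):
    """Toy surrogate: kernel ridge regression with an RBF kernel."""
    def rbf(A, B):
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2.0 * length_scale ** 2))

    y_mean = y_train.mean()
    K = rbf(X_train, X_train) + ridge * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train - y_mean)
    return rbf(X_query, X_train) @ alpha + y_mean


def active_learning(X_prior, y_prior, candidates, n_iterations=5):
    """Sequential optimization that starts from reused (prior) data."""
    X, y = X_prior.copy(), y_prior.copy()
    remaining = candidates.copy()
    for _ in range(n_iterations):
        if len(remaining) == 0:
            break
        predictions = fit_predict(X, y, remaining)
        pick = int(np.argmax(predictions))  # greedy acquisition (an assumption)
        x_new = remaining[pick]
        y_new = simulate_melting_temperature(x_new)  # one new "simulation"
        X = np.vstack([X, x_new])
        y = np.append(y, y_new)
        remaining = np.delete(remaining, pick, axis=0)
    best = int(np.argmax(y))
    return X[best], y[best], X, y


# Synthetic "prior data": compositions of a 3-component alloy that sum to 1.
rng = np.random.default_rng(0)
X_prior = rng.dirichlet(np.ones(3), size=20)
y_prior = np.array([simulate_melting_temperature(x) for x in X_prior])
candidates = rng.dirichlet(np.ones(3), size=200)

best_x, best_y, X_all, y_all = active_learning(X_prior, y_prior, candidates)
print("best composition:", np.round(best_x, 3), "predicted optimum value:", round(best_y, 1))
```

The point of the sketch is the data flow rather than the model: the prior results seed the first surrogate fit, so each iteration spends its simulation budget only on the candidate the model currently ranks highest.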
Taxonomy
Topics: Mineral Processing and Grinding · Machine Learning in Materials Science
