Benchmarking large language models for materials synthesis: the case of atomic layer deposition
Angel Yanguas-Gil, Matthew T. Dearing, Jeffrey W. Elam, Jessica C. Jones, Sungjoon Kim, Adnan Mohammad, Chi Thang Nguyen, Bratin Sengupta

TL;DR
This paper introduces ALDbench, a benchmark for evaluating large language models on materials synthesis questions, specifically in atomic layer deposition, and uses it to identify strengths and weaknesses of OpenAI's GPT-4o.
Contribution
The paper presents ALDbench, an open-ended benchmark for assessing LLMs in atomic layer deposition, including human expert review and analysis of model responses.
Findings
GPT-4o's responses received a composite quality score of 3.7 on a 1-to-5 scale.
36% of the questions received at least one below-average score.
Significant correlations were found between question difficulty and response quality.
Abstract
In this work we introduce an open-ended question benchmark, ALDbench, to evaluate the performance of large language models (LLMs) in materials synthesis, and in particular in the field of atomic layer deposition, a thin film growth technique used in energy applications and microelectronics. Our benchmark comprises questions with a level of difficulty ranging from graduate level to domain expert current with the state of the art in the field. Human experts reviewed the questions along the criteria of difficulty and specificity, and the model responses along four different criteria: overall quality, specificity, relevance, and accuracy. We ran this benchmark on an instance of OpenAI's GPT-4o. The responses from the model received a composite quality score of 3.7 on a 1 to 5 scale, consistent with a passing grade. However, 36% of the questions received at least one below average score. An…
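The two headline numbers in the abstract (an average score on a 1 to 5 scale, and the share of questions with at least one below-average score) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the ratings are invented, the question ids and function names are hypothetical, the score is a plain mean over all reviewer ratings, and "below average" is taken to mean a rating under 3. The paper's actual aggregation procedure may differ.

```python
from statistics import mean

# The four response criteria named in the abstract.
CRITERIA = ("quality", "specificity", "relevance", "accuracy")

# Hypothetical reviewer ratings on a 1-5 scale; NOT data from the paper.
# Each question id maps to one dict of ratings per reviewer.
scores = {
    "q1": [{"quality": 4, "specificity": 4, "relevance": 5, "accuracy": 4},
           {"quality": 5, "specificity": 4, "relevance": 5, "accuracy": 4}],
    "q2": [{"quality": 3, "specificity": 2, "relevance": 4, "accuracy": 3},
           {"quality": 4, "specificity": 3, "relevance": 4, "accuracy": 3}],
    "q3": [{"quality": 4, "specificity": 3, "relevance": 4, "accuracy": 5}],
}

# Assumption: ratings under the scale midpoint count as "below average".
BELOW_AVERAGE = 3


def mean_score(scores, criterion):
    """Average one criterion over every reviewer rating of every question."""
    return mean(r[criterion] for ratings in scores.values() for r in ratings)


def fraction_with_low_score(scores, threshold=BELOW_AVERAGE):
    """Share of questions where any reviewer gave any criterion a low rating."""
    flagged = sum(
        any(r[c] < threshold for r in ratings for c in CRITERIA)
        for ratings in scores.values()
    )
    return flagged / len(scores)


if __name__ == "__main__":
    print("mean quality:", mean_score(scores, "quality"))
    print("questions with a low score:", fraction_with_low_score(scores))
```

Counting a question as soon as a single rating falls below the threshold is a stricter summary than the mean. This is why a benchmark can show a passing average while a sizeable share of questions still draw at least one low score.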
Taxonomy
Topics: Machine Learning in Materials Science · Ferroelectric and Negative Capacitance Devices · Semiconductor materials and devices
