What do Large Language Models know about materials?
Adrian Ehrenhofer, Thomas Wallmersperger, Gianaurelio Cuniberti

TL;DR
This paper investigates the factual knowledge of large language models about materials, focusing on the periodic table, to assess their potential and limitations in materials science applications.
Contribution
It introduces a benchmark for evaluating LLMs' material knowledge and analyzes their ability to generate correct information about the periodic table.
Findings
LLMs can generate factually correct information about the periodic table.
Vocabulary and tokenization significantly affect LLMs' material knowledge.
A benchmark helps determine where LLMs are applicable in the Processing-Structure-Property-Performance (PSPP) chain.
Abstract
Large Language Models (LLMs) are increasingly applied in the fields of mechanical engineering and materials science. As models that establish connections through the interface of language, LLMs can be applied for step-wise reasoning through the Processing-Structure-Property-Performance chain of materials science and engineering. Current LLMs are built to adequately represent a dataset, which comprises most of the accessible internet. However, the internet mostly contains non-scientific content. If LLMs are to be applied for engineering purposes, it is valuable to investigate models for their intrinsic knowledge, here the capacity to generate correct information about materials. In the current work, for the example of the Periodic Table of Elements, we highlight the role of vocabulary and tokenization for the uniqueness of material fingerprints, and the LLMs' capabilities of…
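To make the point about vocabulary and tokenization concrete, here is a minimal sketch of how a fixed vocabulary can give some element names a single token while splitting others into pieces shared with unrelated strings. It is not the authors' method: the greedy longest-match tokenizer and the `TOY_VOCAB` entries are hypothetical, chosen only for illustration, and real LLM tokenizers (for example, byte-pair encoding) behave differently in detail.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization.

    Characters that match no vocabulary entry become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first, then shrink.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical toy vocabulary (an assumption, not taken from any real model).
TOY_VOCAB = {"Iron", "Fe", "Co", "balt", "C", "N", "i"}

if __name__ == "__main__":
    for name in ["Iron", "Fe", "Cobalt", "Co", "CO", "Ni"]:
        print(f"{name!r:10} -> {tokenize(name, TOY_VOCAB)}")
```

With this toy vocabulary, the symbol "Co" (cobalt) maps to one token, while "CO" (carbon monoxide) splits into "C" and "O", and "Ni" (nickel) splits into "N" and "i", so it shares a token with nitrogen. Whether a material string has a unique token sequence therefore depends on the vocabulary, not only on the chemistry.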
