LLM4Mat-Bench: Benchmarking Large Language Models for Materials Property Prediction

Andre Niyongabo Rubungo; Kangming Li; Jason Hattrick-Simpers; Adji Bousso Dieng

arXiv:2411.00177 · cond-mat.mtrl-sci · December 3, 2024 · 6 cites

Open Access · 1 Repo

TL;DR

This paper introduces LLM4Mat-Bench, the largest benchmark for evaluating large language models in predicting properties of crystalline materials, highlighting current challenges and the need for specialized models.

Contribution

The paper provides a benchmark dataset spanning three input modalities (crystal composition, CIF, and crystal text description) and evaluates a range of LLMs on it, revealing their limitations and guiding future development in materials property prediction.

Findings

01 General-purpose LLMs face challenges in materials prediction

02 Task-specific models outperform general LLMs

03 Benchmark facilitates standardized evaluation in the field

Abstract

Large language models (LLMs) are increasingly being used in materials science. However, little attention has been given to benchmarking and standardized evaluation for LLM-based materials property prediction, which hinders progress. We present LLM4Mat-Bench, the largest benchmark to date for evaluating the performance of LLMs in predicting the properties of crystalline materials. LLM4Mat-Bench contains about 1.9M crystal structures in total, collected from 10 publicly available materials data sources, and 45 distinct properties. LLM4Mat-Bench features different input modalities: crystal composition, CIF, and crystal text description, with 4.7M, 615.5M, and 3.1B tokens in total for each modality, respectively. We use LLM4Mat-Bench to fine-tune models with different sizes, including LLM-Prop and MatBERT, and provide zero-shot and few-shot prompts to evaluate the property prediction…
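To put the modality sizes quoted in the abstract in perspective, the totals can be turned into a rough per-structure average. The sketch below is only back-of-the-envelope arithmetic on the rounded figures above. It assumes every structure contributes to every modality, which the abstract does not state, and the names `NUM_STRUCTURES`, `TOKENS_PER_MODALITY`, and `average_tokens_per_structure` are hypothetical and not part of the benchmark's code.

```python
# Rounded totals as quoted in the abstract; results are therefore approximate.
NUM_STRUCTURES = 1.9e6  # "about 1.9M crystal structures"

TOKENS_PER_MODALITY = {
    "composition": 4.7e6,    # 4.7M tokens
    "cif": 615.5e6,          # 615.5M tokens
    "description": 3.1e9,    # 3.1B tokens
}


def average_tokens_per_structure(total_tokens, num_structures=NUM_STRUCTURES):
    """Return the mean token count per structure (hypothetical helper).

    Assumes each structure appears in the modality exactly once,
    which is an assumption of this sketch, not a claim from the paper.
    """
    return total_tokens / num_structures


averages = {
    name: average_tokens_per_structure(total)
    for name, total in TOKENS_PER_MODALITY.items()
}

for name, avg in averages.items():
    print(f"{name}: about {avg:.1f} tokens per structure")
```

Under these assumptions, a composition string averages a few tokens, a CIF file a few hundred, and a text description well over a thousand, which gives a sense of how much more context the richer modalities demand from a model.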

Code & Models

Repositories

vertaix/llm4mat-bench
PyTorch · Official

Taxonomy

Topics: Machine Learning in Materials Science · Manufacturing Process and Optimization

Methods: Softmax · Attention Is All You Need