Sampling Latent Material-Property Information From LLM-Derived Embedding Representations

Luke P. J. Gilligan, Matteo Cobelli, Hasan M. Sayeed, Taylor D. Sparks, and Stefano Sanvito

arXiv:2409.11971·cs.CL·September 19, 2024

Open Access

TL;DR

This paper explores how large language model-derived embeddings can capture latent material-property information, assessing their potential to inform materials science predictions without additional training.

Contribution

It demonstrates that LLM embeddings can reflect certain material properties, highlighting the importance of context and comparison methods for effective extraction.

Findings

01. LLM embeddings can encode some material property information
02. Optimal contextual clues are necessary for extracting meaningful data
03. LLMs have potential for generating useful materials representations

Abstract

Vector embeddings derived from large language models (LLMs) show promise in capturing latent information from the literature. Interestingly, these can be integrated into material embeddings, potentially useful for data-driven predictions of materials properties. We investigate the extent to which LLM-derived vectors capture the desired information and their potential to provide insights into material properties without additional training. Our findings indicate that, although LLMs can be used to generate representations reflecting certain property information, extracting the embeddings requires identifying the optimal contextual clues and appropriate comparators. Despite this restriction, it appears that LLMs still have the potential to be useful in generating meaningful materials-science representations.
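
To make the role of "comparators" more concrete, the following is a minimal sketch of one way an embedding for a material could be scored against embeddings of comparator terms using cosine similarity. It is an illustration under stated assumptions, not the authors' procedure: the function names `cosine_similarity` and `rank_comparators` are hypothetical, and the three-dimensional vectors are toy values invented for the example, standing in for real LLM-derived embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_comparators(material_vec, comparators):
    """Sort comparator labels by similarity to the material vector, highest first.

    `comparators` maps a label (e.g. a property word) to its embedding.
    """
    scores = {
        label: cosine_similarity(material_vec, vec)
        for label, vec in comparators.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy vectors, invented for illustration only; in practice these would be
# embeddings produced by an LLM for a material name and for comparator terms.
toy_material = [0.9, 0.1, 0.2]
toy_comparators = {
    "metallic": [1.0, 0.0, 0.1],
    "insulating": [0.0, 1.0, 0.3],
}

ranking = rank_comparators(toy_material, toy_comparators)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

In a sketch like this, the choice of comparator terms and of the text used to produce each embedding determines what the ranking reflects, which is the dependence on contextual clues and comparators that the abstract describes.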

Taxonomy

Topics: Image Processing and 3D Reconstruction · Welding Techniques and Residual Stresses · Non-Destructive Testing Techniques