From Word2Vec to Transformers: Text-Derived Composition Embeddings for Filtering Combinatorial Electrocatalysts
Lei Zhang, Markus Stricker

TL;DR
This paper explores text-derived embeddings, comparing Word2Vec and transformer models, to efficiently filter and prioritize complex electrocatalyst compositions without relying on experimental labels.
Contribution
It introduces a label-free screening method using scientific text embeddings and compares linear and transformer-based encoding strategies for materials selection.
Findings
Word2Vec often outperforms transformer embeddings in candidate reduction.
Linear element-wise mixing provides a simple yet effective composition encoding.
The method successfully filters candidates across diverse materials libraries.
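As a minimal sketch of what linear element-wise mixing could look like, the snippet below builds a composition vector as the fraction-weighted sum of per-element embedding vectors. The function name `mix_embedding` and the three-dimensional toy vectors are hypothetical placeholders; in practice the element vectors would come from a trained Word2Vec or transformer model, and the paper's exact weighting and normalization may differ.

```python
import numpy as np

def mix_embedding(composition, element_vectors):
    """Fraction-weighted sum of element vectors (hypothetical helper).

    composition: dict mapping element symbol -> molar fraction (any positive scale)
    element_vectors: dict mapping element symbol -> 1-D embedding vector
    """
    total = sum(composition.values())  # renormalize in case fractions do not sum to 1
    dim = len(next(iter(element_vectors.values())))
    vec = np.zeros(dim)
    for element, fraction in composition.items():
        vec += (fraction / total) * np.asarray(element_vectors[element], dtype=float)
    return vec

# Toy element vectors, NOT real embeddings; chosen so the result is easy to read.
element_vectors = {
    "Pt": np.array([1.0, 0.0, 0.0]),
    "Pd": np.array([0.0, 1.0, 0.0]),
    "Ru": np.array([0.0, 0.0, 1.0]),
}

mixed = mix_embedding({"Pt": 0.5, "Pd": 0.3, "Ru": 0.2}, element_vectors)
print(mixed)
```

Because the mixing is linear, the same element vectors can be reused for every composition in a library, which keeps the encoding cheap compared with generating one prompt embedding per composition.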
Abstract
Compositionally complex solid solution electrocatalysts span vast composition spaces, and even one materials system can contain more candidate compositions than can be measured exhaustively. Here we evaluate a label-free screening strategy that represents each composition using embeddings derived from scientific texts and prioritizes candidates based on similarity to two property concepts. We compare a corpus-trained Word2Vec baseline with transformer-based embeddings, where compositions are encoded either by linear element-wise mixing or by short composition prompts. Similarities to 'concept directions', the terms conductivity and dielectric, define a two-dimensional descriptor space, and a symmetric Pareto-front selection is used to filter candidate subsets without using electrochemical labels. Performance is assessed on 15 materials libraries including noble metal alloys and…
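The filtering step described in the abstract can be illustrated with a short sketch: each candidate is scored by cosine similarity to two concept vectors, and the non-dominated points in that two-dimensional space are kept. This is an assumption-laden illustration, not the authors' implementation. The helper names are hypothetical, the descriptor values are made up, and the sketch uses a plain Pareto front that maximizes both similarities, whereas the abstract refers to a "symmetric" Pareto-front selection whose details are not given here.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pareto_front_mask(points):
    """Boolean mask of rows not dominated by any other row.

    ASSUMPTION: every column is maximized. A row is dominated if another row
    is at least as large in all columns and strictly larger in at least one.
    """
    pts = np.asarray(points, dtype=float)
    mask = np.ones(len(pts), dtype=bool)
    for i, p in enumerate(pts):
        dominates_p = np.all(pts >= p, axis=1) & np.any(pts > p, axis=1)
        mask[i] = not dominates_p.any()
    return mask

# Made-up descriptor values: column 0 = similarity to "conductivity",
# column 1 = similarity to "dielectric". In a real run each entry would be
# cosine_similarity(composition_vector, concept_vector).
descriptors = np.array([
    [0.90, 0.10],
    [0.50, 0.50],
    [0.10, 0.90],
    [0.40, 0.40],  # dominated by [0.50, 0.50]
    [0.80, 0.05],  # dominated by [0.90, 0.10]
])

mask = pareto_front_mask(descriptors)
print(descriptors[mask])
```

No electrochemical measurements enter this step: the only inputs are the text-derived vectors, which is what makes the screening label-free.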
Taxonomy
Topics: Machine Learning in Materials Science · Electrocatalysts for Energy Conversion · CO2 Reduction Techniques and Catalysts
