Similarity of Precursors in Solid-state Synthesis as Text-Mined from Scientific Literature
Tanjin He, Wenhao Sun, Haoyan Huo, Olga Kononova, Ziqin Rong, Vahe Tshitoyan, Tiago Botari, Gerbrand Ceder

TL;DR
This paper presents a text-mining approach to identify and analyze precursor similarities in solid-state synthesis literature, enabling better understanding and prediction of materials synthesis processes.
Contribution
Developed a two-step chemical named entity recognition model and a hierarchical clustering method to quantify precursor similarities from scientific texts.
Findings
Chemical similarity of precursors can be extracted from literature.
Hierarchical clustering reveals precursor relationships.
Quantifying precursor similarity aids in predictive synthesis modeling.
Abstract
Collecting and analyzing the vast amount of information available in the solid-state chemistry literature may accelerate our understanding of materials synthesis. However, one major problem is the difficulty of identifying which materials from a synthesis paragraph are precursors or are target materials. In this study, we developed a two-step Chemical Named Entity Recognition (CNER) model to identify precursors and targets, based on information from the context around material entities. Using the extracted data, we conducted a meta-analysis to study the similarities and differences between precursors in the context of solid-state synthesis. To quantify precursor similarity, we built a substitution model to calculate the viability of substituting one precursor with another while retaining the target. From a hierarchical clustering of the precursors, we demonstrate that "chemical…
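Illustrative sketch
The idea of scoring how readily one precursor can replace another while the target stays the same, and then clustering precursors on that score, can be illustrated with a small Python sketch. This is not the paper's substitution model: it swaps in a plain Jaccard overlap of synthesis "contexts" (the target plus the remaining precursors) and a simple average-linkage agglomerative clustering. The records, the function names (contexts, similarity, cluster), and the max_distance threshold are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy records (target, precursors); not data from the paper.
RECORDS = [
    ("LiCoO2", {"Li2CO3", "Co3O4"}),
    ("LiCoO2", {"LiOH", "Co3O4"}),
    ("LiMn2O4", {"Li2CO3", "MnO2"}),
    ("LiMn2O4", {"LiOH", "MnO2"}),
    ("BaTiO3", {"BaCO3", "TiO2"}),
    ("SrTiO3", {"SrCO3", "TiO2"}),
    ("LiFePO4", {"Li2CO3", "Fe2O3", "NH4H2PO4"}),
]


def contexts(records):
    """Map each precursor to the set of (target, other precursors) it appears in."""
    ctx = {}
    for target, precursors in records:
        for p in precursors:
            ctx.setdefault(p, set()).add((target, frozenset(precursors - {p})))
    return ctx


def similarity(ctx, a, b):
    """Jaccard overlap of contexts: an assumed stand-in for substitution viability."""
    union = ctx[a] | ctx[b]
    return len(ctx[a] & ctx[b]) / len(union) if union else 0.0


def cluster(ctx, max_distance=0.5):
    """Average-linkage agglomerative clustering on distance = 1 - similarity."""
    clusters = [[p] for p in sorted(ctx)]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - similarity(ctx, a, b) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j by construction).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        if dist(clusters[i], clusters[j]) > max_distance:
            break  # remaining clusters are too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    ctx = contexts(RECORDS)
    for group in cluster(ctx):
        print(group)
```

In this toy data, Li2CO3 and LiOH share two of three contexts and so merge into one cluster, while BaCO3 and SrCO3 stay apart because they never appear with the same target. A context-overlap score of this kind only captures substitution under an identical target, which is the constraint the abstract describes; anything beyond that would need the paper's actual model.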
