Validation of the Scientific Literature via Chemputation Augmented by Large Language Models

Sebastian Pagel; Michael Jirasek; Leroy Cronin

arXiv:2410.06384·cs.AI·October 10, 2024·2 cites

Open Access

TL;DR

This paper presents an LLM-based workflow that automates the validation, translation, simulation, and physical execution of synthetic chemistry procedures taken from the literature, with the aim of improving reproducibility and scalability in chemical research.

Contribution

It introduces a novel autonomous workflow integrating LLMs and XDL code for end-to-end validation and execution of synthetic procedures on robotic systems.

Findings

01 Successfully executed four syntheses from literature using the workflow.

02 Demonstrated safe, secure, and scalable automation with XDL abstraction.

03 Enhanced reproducibility and validation of chemical procedures.

Abstract

Chemputation is the process of programming chemical robots to do experiments using a universal symbolic language, but the literature can be error prone and hard to read due to ambiguities. Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including natural language processing, robotic control, and more recently, chemistry. Despite significant advancements in standardizing the reporting and collection of synthetic chemistry data, the automatic reproduction of reported syntheses remains a labour-intensive task. In this work, we introduce an LLM-based chemical research agent workflow designed for the automatic validation of synthetic literature procedures. Our workflow can autonomously extract synthetic procedures and analytical data from extensive documents, translate these procedures into universal XDL code, simulate the execution of the procedure…
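
As a rough illustration of the "translate, then check before executing" idea in the abstract, the sketch below builds a small XDL-like XML document from an already-structured procedure and runs one static check on it. It is not the authors' implementation: the step dictionaries, the helper names (build_xdl_like, undeclared_reagents), and the specific check are assumptions made for this example, the element and attribute names only approximate XDL, and the LLM extraction and translation stages are not reproduced.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured form of an extracted procedure. In the paper's
# workflow an LLM produces the translation; that stage is not modelled here.
steps = [
    {"action": "Add", "vessel": "reactor", "reagent": "ethanol", "volume": "10 mL"},
    {"action": "Add", "vessel": "reactor", "reagent": "benzaldehyde", "volume": "1 mL"},
    {"action": "Stir", "vessel": "reactor", "time": "30 min"},
]
declared_reagents = ["ethanol"]  # benzaldehyde is deliberately left undeclared


def build_xdl_like(steps, reagents):
    """Assemble an XDL-like tree: a Reagents section and a Procedure section."""
    root = ET.Element("Synthesis")
    reagents_el = ET.SubElement(root, "Reagents")
    for name in reagents:
        ET.SubElement(reagents_el, "Reagent", name=name)
    procedure_el = ET.SubElement(root, "Procedure")
    for step in steps:
        # Every key except "action" becomes an XML attribute of the step.
        attrs = {k: v for k, v in step.items() if k != "action"}
        ET.SubElement(procedure_el, step["action"], attrs)
    return root


def undeclared_reagents(root):
    """Return reagents used in Procedure steps but missing from Reagents."""
    declared = {r.get("name") for r in root.iter("Reagent")}
    used = [el.get("reagent") for el in root.find("Procedure") if el.get("reagent")]
    return [name for name in used if name not in declared]


doc = build_xdl_like(steps, declared_reagents)
problems = undeclared_reagents(doc)
print(ET.tostring(doc, encoding="unicode"))
print("undeclared reagents:", problems)
```

A check like this is far simpler than simulating a procedure against real hardware, but it shows why a symbolic intermediate representation helps: ambiguities or omissions in the source text surface as concrete, machine-detectable errors before anything runs on a robot.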

Taxonomy

Topics: Biomedical Text Mining and Ontologies