Validation of the Scientific Literature via Chemputation Augmented by Large Language Models
Sebastian Pagel, Michael Jirasek, Leroy Cronin

TL;DR
This paper presents an LLM-based workflow that automates the validation, translation, simulation, and physical execution of synthetic chemistry procedures from literature, enhancing reproducibility and scalability in chemical research.
Contribution
It introduces a novel autonomous workflow integrating LLMs and XDL code for end-to-end validation and execution of synthetic procedures on robotic systems.
Findings
Successfully executed four syntheses from literature using the workflow.
Demonstrated safe, secure, and scalable automation with XDL abstraction.
Enhanced reproducibility and validation of chemical procedures.
Abstract
Chemputation is the process of programming chemical robots to perform experiments using a universal symbolic language, but the literature can be error-prone and hard to read because of ambiguities. Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including natural language processing, robotic control, and, more recently, chemistry. Despite significant advances in standardizing the reporting and collection of synthetic chemistry data, the automatic reproduction of reported syntheses remains a labour-intensive task. In this work, we introduce an LLM-based chemical research agent workflow designed for the automatic validation of synthetic literature procedures. Our workflow can autonomously extract synthetic procedures and analytical data from extensive documents, translate these procedures into universal XDL code, simulate the execution of the procedure…
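To make the "translate into XDL, then simulate" idea more concrete, here is a minimal sketch of a static check on an XDL-like XML procedure. It is not the authors' implementation and does not use their tooling: the document layout (Synthesis, Hardware, Reagents, Procedure, with Add and Stir steps) follows the general shape of XDL as an assumption, the reagents and quantities are invented for illustration, and the function name check_procedure is hypothetical. The check only confirms that every vessel and reagent a step refers to has been declared, which is the kind of error a dry run could catch before any hardware moves.

```python
import xml.etree.ElementTree as ET

# Illustrative XDL-like procedure (assumed layout, invented contents).
# The second Add step deliberately references a reagent that is not declared.
XDL_TEXT = """
<Synthesis>
  <Hardware>
    <Component id="reactor" type="reactor"/>
  </Hardware>
  <Reagents>
    <Reagent name="ethanol"/>
    <Reagent name="water"/>
  </Reagents>
  <Procedure>
    <Add vessel="reactor" reagent="ethanol" volume="10 mL"/>
    <Add vessel="reactor" reagent="acetone" volume="5 mL"/>
    <Stir vessel="reactor" time="30 min"/>
  </Procedure>
</Synthesis>
"""


def check_procedure(xdl_text):
    """Return a list of problems found in an XDL-like procedure (hypothetical helper)."""
    root = ET.fromstring(xdl_text.strip())

    # Collect everything the procedure is allowed to refer to.
    vessels = {c.get("id") for c in root.iter("Component")}
    reagents = {r.get("name") for r in root.iter("Reagent")}

    problems = []
    procedure = root.find("Procedure")
    steps = list(procedure) if procedure is not None else []

    # Walk the steps in order and flag references to undeclared items.
    for index, step in enumerate(steps, start=1):
        vessel = step.get("vessel")
        reagent = step.get("reagent")
        if vessel is not None and vessel not in vessels:
            problems.append(f"step {index} ({step.tag}): undeclared vessel '{vessel}'")
        if reagent is not None and reagent not in reagents:
            problems.append(f"step {index} ({step.tag}): undeclared reagent '{reagent}'")
    return problems


if __name__ == "__main__":
    for problem in check_procedure(XDL_TEXT):
        print(problem)
```

A real simulation would go further, for example tracking volumes against vessel capacity, but a reference check like this already shows why a symbolic intermediate representation is easier to validate than free-text prose.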
Taxonomy
Topics: Biomedical Text Mining and Ontologies
