CACTUS: Chemistry Agent Connecting Tool-Usage to Science
Andrew D. McNaughton, Gautham Ramalaxmi, Agustin Kruel, Carter R. Knutson, Rohith A. Varikoti, Neeraj Kumar

TL;DR
CACTUS is an LLM-based agent that integrates cheminformatics tools to improve reasoning and problem-solving in chemistry. It outperforms baseline LLMs on chemistry questions and supports molecular discovery tasks.
Contribution
This paper introduces CACTUS, a novel framework combining open-source LLMs with domain-specific tools for improved chemistry research and molecular discovery.
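To make the idea of pairing an LLM with domain-specific tools concrete, here is a minimal sketch of the tool-dispatch half of such an agent. It is not the authors' implementation: the tool names `MolWt` and `HeavyAtomCount`, the `run_tool` helper, and the simplified formula parser are all hypothetical, and the LLM step that would choose a tool and its input is omitted.

```python
import re
from typing import Callable, Dict

# Approximate standard atomic weights (g/mol) for a few common elements.
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def parse_formula(formula: str) -> Dict[str, int]:
    """Parse a flat formula such as 'C2H6O' (no parentheses) into element counts."""
    counts: Dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula: str) -> float:
    """Hypothetical tool: sum atomic weights over the parsed formula."""
    return sum(ATOMIC_WEIGHTS[s] * n for s, n in parse_formula(formula).items())


def heavy_atom_count(formula: str) -> float:
    """Hypothetical tool: count all atoms other than hydrogen."""
    return float(sum(n for s, n in parse_formula(formula).items() if s != "H"))


# Registry the agent would consult; the names are assumptions for illustration.
TOOLS: Dict[str, Callable[[str], float]] = {
    "MolWt": molecular_weight,
    "HeavyAtomCount": heavy_atom_count,
}


def run_tool(action: str, action_input: str) -> str:
    """Execute the tool named by the model and return a text observation."""
    if action not in TOOLS:
        return f"Unknown tool: {action}"
    return str(round(TOOLS[action](action_input), 3))


if __name__ == "__main__":
    # In a full agent, the LLM would emit the action and its input.
    print(run_tool("MolWt", "C2H6O"))
    print(run_tool("HeavyAtomCount", "C2H6O"))
```

In a complete agent loop, the model's output would be parsed into an action and an input, `run_tool` would produce the observation, and the observation would be fed back into the prompt until the model gives a final answer.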
Findings
CACTUS significantly outperforms baseline LLMs on chemistry questions.
Prompt engineering and hardware configurations impact model performance.
Smaller models can be effectively deployed on consumer hardware without major accuracy loss.
Abstract
Large language models (LLMs) have shown remarkable potential in various domains, but they often lack the ability to access and reason over domain-specific knowledge and tools. In this paper, we introduce CACTUS (Chemistry Agent Connecting Tool-Usage to Science), an LLM-based agent that integrates cheminformatics tools to enable advanced reasoning and problem-solving in chemistry and molecular discovery. We evaluate the performance of CACTUS using a diverse set of open-source LLMs, including Gemma-7b, Falcon-7b, MPT-7b, Llama2-7b, and Mistral-7b, on a benchmark of thousands of chemistry questions. Our results demonstrate that CACTUS significantly outperforms baseline LLMs, with the Gemma-7b and Mistral-7b models achieving the highest accuracy regardless of the prompting strategy used. Moreover, we explore the impact of domain-specific prompting and hardware configurations on model…
Taxonomy
Topics: Various Chemistry Research Topics
Methods: Sparse Evolutionary Training
