Large Language Model Agent for Modular Task Execution in Drug Discovery
Janghoon Ock, Radheesh Sharma Meda, Srivathsan Badrinarayanan, Neha S. Aluru, Achuth Chandrasekhar, and Amir Barati Farimani

TL;DR
This paper introduces a modular LLM-based framework that automates key tasks in early drug discovery, including data retrieval, molecular generation, property prediction, and structure modeling, improving efficiency and accuracy.
Contribution
It presents a novel modular framework combining LLM reasoning with domain tools for comprehensive drug discovery tasks, enabling flexible integration and iterative molecular refinement.
Findings
The number of molecules with QED > 0.6 rose from 34 to 55 after the refinement rounds.
Compliance with drug-likeness filters improved; for example, Ghose filter passes rose from 32 to 55.
Generated 3D protein-ligand complexes with rapid binding affinity estimates.
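The QED and Ghose counts above amount to tallying how many molecules clear a threshold before and after refinement. A minimal sketch of that tally follows, assuming the descriptors have already been computed by some cheminformatics tool. The `Descriptors` class, the function names, and the three example molecules are hypothetical and are not taken from the paper; the Ghose ranges are the commonly cited ones (molecular weight 160-480, logP -0.4 to 5.6, molar refractivity 40-130, total atom count 20-70), and the paper's own implementation may differ.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one molecule (illustrative values only)."""
    name: str
    mol_weight: float          # g/mol
    logp: float                # octanol-water partition coefficient
    molar_refractivity: float
    n_atoms: int               # total atom count
    qed: float                 # quantitative estimate of drug-likeness, 0..1


def passes_ghose(d: Descriptors) -> bool:
    """Return True if all four commonly cited Ghose ranges are satisfied."""
    return (
        160 <= d.mol_weight <= 480
        and -0.4 <= d.logp <= 5.6
        and 40 <= d.molar_refractivity <= 130
        and 20 <= d.n_atoms <= 70
    )


def summarize(mols, qed_cutoff=0.6):
    """Count molecules above the QED cutoff and molecules passing Ghose."""
    return {
        "qed_above_cutoff": sum(d.qed > qed_cutoff for d in mols),
        "ghose_pass": sum(passes_ghose(d) for d in mols),
    }


# Hypothetical molecules, not data from the paper.
example = [
    Descriptors("mol_a", 320.4, 2.1, 88.0, 42, 0.71),
    Descriptors("mol_b", 512.6, 4.8, 140.2, 75, 0.38),
    Descriptors("mol_c", 250.3, 6.1, 70.5, 33, 0.64),
]
counts = summarize(example)
print(counts)
```

Running `summarize` on the molecule set before and after each refinement round would give a before/after comparison of the kind reported in the findings.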
Abstract
We present a modular framework powered by large language models (LLMs) that automates and streamlines key tasks across the early-stage computational drug discovery pipeline. By combining LLM reasoning with domain-specific tools, the framework performs biomedical data retrieval, literature-grounded question answering via retrieval-augmented generation, molecular generation, multi-property prediction, property-aware molecular refinement, and 3D protein-ligand structure generation. The agent autonomously retrieved relevant biomolecular information, including FASTA sequences, SMILES representations, and literature, and answered mechanistic questions with improved contextual accuracy compared to standard LLMs. It then generated chemically diverse seed molecules and predicted 75 properties, including ADMET-related and general physicochemical descriptors, which guided iterative molecular…
Taxonomy
Topics: Genetics, Bioinformatics, and Biomedical Research · Topic Modeling · Computational Drug Discovery Methods
