# Large Language Model Agent for Modular Task Execution in Drug Discovery

**Authors:** Janghoon Ock, Radheesh Sharma Meda, Srivathsan Badrinarayanan, Neha S. Aluru, Achuth Chandrasekhar, Amir Barati Farimani

PMC · DOI: 10.1021/acs.jcim.5c02454 · Journal of Chemical Information and Modeling · 2026-02-09

## TL;DR

This paper introduces an AI-powered framework that automates drug discovery tasks like molecule generation and property prediction using large language models and specialized tools.

## Contribution

A modular framework combining LLMs and domain-specific tools for end-to-end drug discovery task automation.
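One way to picture the modular design is a registry of interchangeable tools that a planner calls in sequence. The sketch below is illustrative only: the tool names, the placeholder return strings, and the fixed plan are assumptions made for this example, not the authors' implementation. In the described framework an LLM would choose and order the tools.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping tool names to callables (names are illustrative).
TOOLS: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that stores a function in the registry under `name`."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return wrap


@register("fetch_sequence")
def fetch_sequence(target: str) -> str:
    # Placeholder: a real tool would query a biomedical database.
    return f"<FASTA for {target}>"


@register("generate_molecules")
def generate_molecules(target: str) -> str:
    # Placeholder: a real tool would call a molecular generative model.
    return f"<seed SMILES for {target}>"


def run_plan(plan: List[str], target: str) -> List[str]:
    """Run the named tools in order. Here the plan is hard-coded;
    an LLM planner is assumed to produce it in a real agent."""
    return [TOOLS[step](target) for step in plan]


results = run_plan(["fetch_sequence", "generate_molecules"], "BCL2")
print(results)
```

Because each tool is looked up by name, swapping in a newer model only requires registering a different function under the same key.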

## Key findings

- Iterative refinement improved molecule quality: the number of molecules with QED >0.6 rose from 34 to 55 after two refinement rounds.
- Compliance with empirical drug-likeness filters also rose; for example, the number of molecules passing the Ghose filter increased from 32 to 55 in a pool of 100.
- 3D protein–ligand structures and binding affinity estimates were generated using Boltz-2.
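The reported counts can be turned into absolute and relative gains with a few lines of arithmetic. This is a minimal sketch using only the numbers stated above. The helper name `summarize` is hypothetical, and the pool size of 100 is applied only to the Ghose figures, since that is the only place the summary states it.

```python
from typing import Optional


def summarize(before: int, after: int, pool: Optional[int] = None) -> dict:
    """Absolute gain, relative gain (%), and optional pass fractions."""
    out = {
        "gain": after - before,
        "relative_gain_pct": 100.0 * (after - before) / before,
    }
    if pool is not None:
        out["fraction_before"] = before / pool
        out["fraction_after"] = after / pool
    return out


qed = summarize(34, 55)            # molecules with QED > 0.6
ghose = summarize(32, 55, pool=100)  # molecules passing the Ghose filter

print(qed)
print(ghose)
```

With these inputs, the QED count rises by 21 molecules (about 62% relative to the starting count) and the Ghose pass count rises by 23 (about 72%), moving from 32% to 55% of the pool.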

## Abstract

We present a modular framework powered by large language models (LLMs)
that automates and streamlines key tasks across the early-stage
computational drug discovery pipeline. By combining LLM reasoning
with domain-specific tools, the framework performs biomedical data
retrieval, literature-grounded question answering via retrieval-augmented
generation, molecular generation, multiproperty prediction, property-aware
molecular refinement, and 3D protein–ligand structure generation.
The agent autonomously retrieves relevant biomolecular information,
including FASTA sequences, SMILES representations, and literature,
and answers mechanistic questions with improved contextual accuracy
compared to standard LLMs. It then generates chemically diverse seed
molecules and predicts 75 properties, including ADMET-related and
general physicochemical descriptors, which guide iterative molecular
refinement. Across two refinement rounds, the number of molecules
with QED >0.6 increased from 34 to 55. The number of molecules satisfying
empirical drug-likeness filters also rose; for example, compliance
with the Ghose filter increased from 32 to 55 within a pool of 100
molecules. The framework also employs Boltz-2 to generate 3D protein–ligand
complexes and provide rapid binding affinity estimates for candidate
compounds. These results demonstrate that the approach effectively
supports molecular screening, prioritization, and structure evaluation.
Its modular design enables flexible integration of evolving tools
and models, providing a scalable foundation for AI-assisted therapeutic
discovery.
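The screening step described in the abstract (counting molecules above a QED cutoff and molecules passing the Ghose filter) can be sketched as follows. This is a hedged illustration, not the authors' code: descriptor values are assumed to be precomputed by some property predictor, the four example molecules and their numbers are invented for demonstration, and the Ghose ranges used are the commonly cited ones (molecular weight 160 to 480, logP -0.4 to 5.6, molar refractivity 40 to 130, atom count 20 to 70).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    name: str
    mol_weight: float          # g/mol
    logp: float
    molar_refractivity: float
    atom_count: int
    qed: float                 # quantitative estimate of drug-likeness, 0..1


# Commonly cited Ghose filter ranges (inclusive bounds).
GHOSE_RANGES = {
    "mol_weight": (160.0, 480.0),
    "logp": (-0.4, 5.6),
    "molar_refractivity": (40.0, 130.0),
    "atom_count": (20, 70),
}


def passes_ghose(d: Descriptors) -> bool:
    """True when every descriptor lies inside its Ghose range."""
    return all(lo <= getattr(d, field) <= hi
               for field, (lo, hi) in GHOSE_RANGES.items())


def screen(pool: List[Descriptors], qed_cutoff: float = 0.6) -> dict:
    """Count passes and list molecules that would go to another refinement round."""
    return {
        "qed_above_cutoff": sum(d.qed > qed_cutoff for d in pool),
        "ghose_pass": sum(passes_ghose(d) for d in pool),
        "needs_refinement": [d.name for d in pool
                             if d.qed <= qed_cutoff or not passes_ghose(d)],
    }


# Invented descriptor values, for illustration only.
pool = [
    Descriptors("mol_a", 320.4, 2.1, 88.0, 42, 0.71),
    Descriptors("mol_b", 512.6, 4.8, 140.2, 75, 0.38),
    Descriptors("mol_c", 150.2, 0.3, 38.5, 18, 0.55),
    Descriptors("mol_d", 410.5, 3.9, 110.3, 55, 0.62),
]
report = screen(pool)
print(report)
```

Running the same `screen` function on the pool before and after each refinement round would yield before/after counts of the kind the abstract reports.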

## Full-text entities

- **Genes:** TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, MCL1 (MCL1 apoptosis regulator, BCL2 family member) [NCBI Gene 4170] {aka BCL2L3, EAT, MCL1-ES, MCL1L, MCL1S, Mcl-1}, BCL2L11 (BCL2 like 11) [NCBI Gene 10018] {aka BAM, BIM, BOD}, PMAIP1 (phorbol-12-myristate-13-acetate-induced protein 1) [NCBI Gene 5366] {aka APR, NOXA}, EIF2A (eukaryotic translation initiation factor 2A) [NCBI Gene 83939] {aka CDA02, EIF-2A, MST089, MSTP004, MSTP089}, EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}, F2 (coagulation factor II, thrombin) [NCBI Gene 2147] {aka PT, RPRGL2, THPH1}, JAK2 (Janus kinase 2) [NCBI Gene 3717] {aka JTK10}, KCNH2 (potassium voltage-gated channel subfamily H member 2) [NCBI Gene 3757] {aka ERG-1, ERG1, H-ERG, HERG, HERG1, Kv11.1}, BAK1 (BCL2 antagonist/killer 1) [NCBI Gene 578] {aka BAK, BAK-LIKE, BCL2L7, CDN1}, ATF3 (activating transcription factor 3) [NCBI Gene 467], BAX (BCL2 associated X, apoptosis regulator) [NCBI Gene 581] {aka BCL2L4}, ATF4 (activating transcription factor 4) [NCBI Gene 468] {aka CREB-2, CREB2, TAXREB67, TXREB}, BCL2 (BCL2 apoptosis regulator) [NCBI Gene 596] {aka Bcl-2, PPP1R50}
- **Diseases:** LLMs (MESH:D007806), liver injury I (MESH:D017093), Myelofibrosis (MESH:D055728), lymphocytic leukemia (MESH:D007945), liver toxicity (MESH:D056486), Toxicity (MESH:D064420), Thrombosis (MESH:D013927)
- **Chemicals:** hydrogen (MESH:D006859), halogens (MESH:D006219), Lead (MESH:D007854), AMES (MESH:C017501), sulfonamide (MESH:D013449), ABT-199 (MESH:C579720), ChEMBL (-), oxygen (MESH:D010100)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Mutations:** N501Y

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12933718/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12933718/full.md

## References

82 references — full list in the complete paper: https://tomesphere.com/paper/PMC12933718/full.md

---
Source: https://tomesphere.com/paper/PMC12933718