# MolOrgGPT: De Novo Generation via Large Language Models and Reinforcement Learning

**Authors:** Pablo Varas Pardo, Oscar Toledano, Guillermo Marcos-Ayuso, David Quesada, Nuria E. Campillo

PMC · DOI: 10.1021/acs.jcim.5c02400 · Journal of Chemical Information and Modeling · 2026-01-07

## TL;DR

This paper introduces MolOrgGPT, a framework that fine-tunes a chemical large language model with reinforcement learning to design new drug-like molecules, demonstrated on key proteins involved in Alzheimer's disease.

## Contribution

A general framework that combines a foundational large language model, trained on a large chemical database, with reinforcement-learning fine-tuning to generate molecules de novo for specific biological targets.

## Key findings

- The system generates structurally diverse and synthetically accessible compounds.
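- Molecules designed against key proteins involved in Alzheimer's disease showed favorable predicted binding affinities and interactions in molecular docking studies.
- Top-ranked molecules appeared viable and drug-like, which suggests the framework is useful for early-stage drug discovery.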

## Abstract

We present a general framework for the de novo design of small molecules with desirable chemical properties, developed to aid the creation of novel chemical entities with potential therapeutic use. The system is built upon a foundational Large Language Model that is trained on a large, comprehensive chemical database and is capable of generating structurally diverse and synthetically accessible compounds. It is then fine-tuned through reinforcement learning to enhance its capacity to generate molecules tailored to specific biological targets. As a case study, we apply this framework to design molecules targeting key proteins involved in Alzheimer’s disease. The generated compounds underwent molecular docking studies to assess their binding affinities and prioritize candidates with optimal predicted interactions. The top-ranked molecules were further analyzed based on their binding modes and key molecular interactions with the target proteins. The results suggest that our generative model produces viable, drug-like molecules with favorable interactions, underscoring its potential as a valuable tool in early-stage drug discovery.
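
The abstract does not say which reinforcement learning algorithm or reward function the authors use, so the following is only a toy sketch of the general idea: sample sequences from a generator, score them, and nudge the generator toward higher-scoring samples. It assumes REINFORCE with a mean-reward baseline, a context-free categorical policy over a four-token vocabulary, and a made-up reward (`toy_reward`) standing in for a docking-based score. None of these names or choices come from the paper.

```python
import math
import random

# Toy token set for illustration only; not the paper's tokenizer.
VOCAB = ["C", "N", "O", "<eos>"]
MAX_LEN = 8


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits, rng):
    """Sample token indices until <eos> is drawn or MAX_LEN is reached."""
    probs = softmax(logits)
    seq = []
    for _ in range(MAX_LEN):
        idx = rng.choices(range(len(VOCAB)), weights=probs)[0]
        seq.append(idx)
        if VOCAB[idx] == "<eos>":
            break
    return seq


def toy_reward(seq):
    """Hypothetical stand-in for a docking score: fraction of 'N' tokens."""
    tokens = [VOCAB[i] for i in seq if VOCAB[i] != "<eos>"]
    if not tokens:
        return 0.0
    return tokens.count("N") / len(tokens)


def reinforce(steps=300, batch=32, lr=0.5, seed=0):
    """Run REINFORCE with a mean-reward baseline on the toy policy."""
    rng = random.Random(seed)
    logits = [0.0] * len(VOCAB)  # uniform start, playing the role of a pretrained model
    for _ in range(steps):
        samples = [sample_sequence(logits, rng) for _ in range(batch)]
        rewards = [toy_reward(s) for s in samples]
        baseline = sum(rewards) / batch
        probs = softmax(logits)
        grad = [0.0] * len(VOCAB)
        for seq, reward in zip(samples, rewards):
            advantage = reward - baseline
            for idx in seq:
                # Gradient of log softmax w.r.t. logit k: one_hot(idx)[k] - probs[k]
                for k in range(len(VOCAB)):
                    grad[k] += advantage * ((1.0 if k == idx else 0.0) - probs[k])
        logits = [l + lr * g / batch for l, g in zip(logits, grad)]
    return logits


if __name__ == "__main__":
    final_probs = softmax(reinforce())
    for token, p in zip(VOCAB, final_probs):
        print(f"{token}: {p:.3f}")
```

In a realistic setting the policy would be a sequence model conditioned on previous tokens, and the reward would come from an external scorer such as a docking program; the update rule sketched here would stay conceptually the same.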

## Linked entities

- **Diseases:** Alzheimer’s disease (MONDO:0004975)

## Full-text entities

- **Diseases:** Alzheimer's disease (MESH:D000544)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12848977/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12848977/full.md

## References

98 references — full list in the complete paper: https://tomesphere.com/paper/PMC12848977/full.md

---
Source: https://tomesphere.com/paper/PMC12848977