# VeGA: A Versatile Generative Architecture for Bioactive Molecules across Multiple Therapeutic Targets

**Authors:** Pietro Delre, Antonio Lavecchia

PMC · DOI: 10.1021/acs.jcim.5c01606 · Journal of Chemical Information and Modeling · 2025-10-02

## TL;DR

VeGA is a lightweight, decoder-only Transformer for de novo design of bioactive molecules that remains effective when fine-tuned on scarce target-specific data and generates chemically valid, novel compounds.

## Contribution

VeGA introduces a lightweight Transformer model for molecular design that excels in data-scarce target-specific fine-tuning and generates highly novel molecules.

## Key findings

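- After pretraining on ChEMBL, VeGA achieves 96.6% validity and 93.6% novelty, placing it among the top performers in the MOSES benchmark.
- In a leakage-safe evaluation across five pharmacological targets, it consistently generated more novel molecules than the state-of-the-art S4 and R4 models under low-data conditions, including the extremely low-data mTORC1 case.
- In a Farnesoid X receptor (FXR) case study, VeGA generated novel compounds whose binding potential was supported by molecular docking.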
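
The validity and novelty figures above are ratios over a batch of generated molecules. The sketch below shows one common way to compute such metrics, following the usual MOSES-style definitions (validity over all generated strings, novelty over unique valid molecules absent from the training set). It is an illustration and not the authors' evaluation code: `generation_metrics` and `toy_canonicalize` are hypothetical names, and the toy checker only tests for balanced parentheses. In practice one would likely replace it with a real cheminformatics parser such as RDKit's `Chem.MolFromSmiles` followed by `Chem.MolToSmiles`.

```python
from typing import Callable, Dict, Iterable, Optional


def toy_canonicalize(smiles: str) -> Optional[str]:
    """Hypothetical stand-in for a real SMILES parser.

    Returns the stripped string if it is non-empty and its parentheses
    are balanced, otherwise None. This is NOT a chemical validity check.
    """
    s = smiles.strip()
    if not s:
        return None
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None
    return s if depth == 0 else None


def generation_metrics(
    generated: Iterable[str],
    training: Iterable[str],
    canonicalize: Callable[[str], Optional[str]],
) -> Dict[str, float]:
    """Compute validity, uniqueness, and novelty for generated molecules."""
    canon = [canonicalize(s) for s in generated]
    valid = [c for c in canon if c is not None]
    unique = set(valid)
    train = {c for c in map(canonicalize, training) if c is not None}
    novel = unique - train
    return {
        # fraction of all generated strings that parse
        "validity": len(valid) / len(canon) if canon else 0.0,
        # fraction of valid molecules that are distinct
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        # fraction of distinct valid molecules not seen in training
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


if __name__ == "__main__":
    sample = ["CCO", "CCO", "c1ccccc1", "CC(C", ""]
    print(generation_metrics(sample, ["CCO"], toy_canonicalize))
```

Passing the canonicalizer as a parameter keeps the metric logic independent of any particular toolkit, so the same function can be reused with a stricter parser without changes.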

## Abstract

In this paper, we present VeGA, a lightweight, decoder-only Transformer
model for de novo molecular design. VeGA balances
a streamlined architecture with robust generative performance, making
it highly efficient and well-suited for resource-limited environments.
Pretrained on ChEMBL, the model demonstrates strong performance against
cutting-edge approaches, achieving high validity (96.6%) and novelty
(93.6%), ranking among the top performers in the MOSES benchmark.
The model’s main strength lies in target-specific fine-tuning
under challenging, data-scarce conditions. In a rigorous, leakage-safe
evaluation across five pharmacological targets against state-of-the-art
models (S4, R4), VeGA proved to be a powerful “explorer”
that consistently generated the most novel molecules while maintaining
a strong balance between discovery performance and chemical realism.
This capability is particularly evident in the extremely low-data
scenario of mTORC1, where VeGA achieved top-tier results. As a case
study, VeGA was applied to the Farnesoid X receptor (FXR), generating
novel compounds with validated binding potential through molecular
docking. The model is available as an open-access platform to support
medicinal chemists in designing novel, target-specific chemotypes
(https://github.com/piedelre93/VeGA-for-de-novo-design). Future developments
will focus on incorporating conditioning strategies for multiobjective
optimization and integrating experimental in vitro validation workflows.

## Linked entities

- **Proteins:** Crtc (CREB-regulated transcription coactivator)

## Full-text entities

- **Genes:** NR1H4 (nuclear receptor subfamily 1 group H member 4) [NCBI Gene 9971] {aka BAR, FXR, HRR-1, HRR1, PFIC5, RIP14}
- **Chemicals:** VeGA (MESH:C518218)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12570142/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12570142/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC12570142/full.md

---
Source: https://tomesphere.com/paper/PMC12570142