# PROFIS: Design of Target-Focused Libraries by Probing Continuous Fingerprint Space with Recurrent Neural Networks

**Authors:** Hubert Rybka, Tomasz Danel, Sabina Podlewska

PMC · DOI: 10.1021/acs.jcim.5c00698 · Journal of Chemical Information and Modeling · 2025-04-28

## TL;DR

PROFIS is a generative model that decodes embedded molecular fingerprints into SMILES strings with a recurrent neural network and searches the embedding space with Bayesian optimization to design structurally novel, target-focused compound libraries.

## Contribution

PROFIS introduces a generative model using RNNs and Bayesian optimization to design target-focused compound libraries with scaffold-hopping capabilities.

## Key findings

- PROFIS is applied to generate candidate ligands of the dopamine D2 receptor.
- The approach is effective at scaffold-hopping, which helps design ligands outside the already explored chemical space.
- The protocol is versatile and applicable to any biological target with known ligand data.

## Abstract

This study introduces PROFIS, a new generative model capable of designing
structurally novel and target-focused compound libraries.
The model relies on a recurrent neural network that was trained to
decode embedded molecular fingerprints into SMILES strings. To identify
potential novel ligands, a biological activity predictor is first
trained on the low-dimensional fingerprint embedding space, enabling
the identification of high-activity subspaces for a given drug target.
The search for latent representations that are expected to yield active
structures upon decoding to SMILES is conducted with a Bayesian optimization
algorithm. We present the rationale for using SMILES as the output
notation of the recurrent neural network and compare its performance
with models trained to decode DeepSMILES and SELFIES strings. The
paper demonstrates the application of this protocol to generate candidate
ligands of the dopamine D2 receptor. It also emphasizes
the effectiveness of our approach in scaffold-hopping, which is valuable
for designing ligands outside the already explored chemical space.
We present how passing engineered molecular fingerprints through the
PROFIS network can be utilized to generate diverse libraries of analogs
for a drug molecule of choice. The protocol is versatile and can be
employed for any biological target, given the availability of a dataset
containing known ligands. The potential
for widespread use of PROFIS is secured by scripts shared by the authors
on GitHub.
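
The latent-space search described in the abstract can be illustrated with a toy Bayesian optimization loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the 2-D latent grid, the RBF kernel, the upper-confidence-bound acquisition function, and the `predicted_activity` function are all hypothetical stand-ins chosen for illustration. In PROFIS the selected latent vectors would then be decoded to SMILES by the recurrent network, a step that is omitted here.

```python
import math

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two latent vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def gp_posterior(X, y, z, noise=1e-6):
    """Gaussian-process posterior mean and standard deviation at point z."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k = [rbf(a, z) for a in X]
    alpha = solve(K, y)   # K^-1 y
    v = solve(K, k)       # K^-1 k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(rbf(z, z) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)

def predicted_activity(z):
    """ASSUMPTION: toy stand-in for an activity predictor on the embedding."""
    return math.exp(-((z[0] - 0.5) ** 2 + (z[1] + 0.25) ** 2) / 0.2)

def bayes_opt(objective, candidates, init, n_iter=10, kappa=2.0):
    """Pick latent points by maximizing mean + kappa * std (UCB)."""
    X = list(init)
    y = [objective(z) for z in X]
    for _ in range(n_iter):
        pool = [z for z in candidates if z not in X]

        def ucb(z):
            m, s = gp_posterior(X, y, z)
            return m + kappa * s

        z_next = max(pool, key=ucb)
        X.append(z_next)
        y.append(objective(z_next))
    best = max(range(len(X)), key=lambda i: y[i])
    return X[best], y[best]

# Hypothetical 2-D latent grid; a real embedding would have more dimensions.
grid = [(i / 4, j / 4) for i in range(-4, 5) for j in range(-4, 5)]
init = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
best_z, best_y = bayes_opt(predicted_activity, grid, init)
print("best latent point:", best_z, "predicted activity:", round(best_y, 3))
```

The surrogate lets the loop spend its limited evaluation budget on regions where the predicted activity is either high or still uncertain, which is the general motivation for using Bayesian optimization rather than random sampling of the embedding space.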

## Full-text entities

- **Genes:** DRD2 (dopamine receptor D2) [NCBI Gene 1813] {aka D2DR, D2R}
- **Chemicals:** PROFIS (-)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12076512/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12076512/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12076512/full.md

---
Source: https://tomesphere.com/paper/PMC12076512