# Crossover operators for molecular graphs with an application to virtual drug screening

**Authors:** Nico Domschke, Bruno J. Schmidt, Thomas Gatter, Richard Golnik, Paul Eisenhuth, Fabian Liessmann, Jens Meiler, Peter F. Stadler

PMC · DOI: 10.1186/s13321-025-00958-w · Journal of Cheminformatics · 2025-06-17

## TL;DR

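This paper introduces cut-and-join crossover operators that let a genetic algorithm recombine molecular graphs directly. The offspring are mostly plausible molecules and more diverse than the initial libraries, and in REvoLd drug-design runs they yielded candidates that outperformed the best known binders.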

## Contribution

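The paper defines cut-and-join crossover operators on a variety of graph classes, including molecular graphs: small cuts are introduced into two parent graphs and the resulting induced subgraphs are rejoined. The authors describe this as a mathematically simple and well-characterized approach to recombining molecules in drug design.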

## Key findings

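- Cut-and-join crossover can be restricted to preserve local properties such as vertex degree (valency) and bond order, as well as global properties such as graph-theoretic planarity.
- Offspring are plausible molecules with high probability, and crossover drastically increases diversity compared to the initial molecule libraries. Favorable synthesizability indices are preserved often enough that offspring can be filtered efficiently for them.
- In REvoLd optimization runs against four target proteins, the method consistently found candidates with binding constants exceeding those of the best known binders and of candidates found in make-on-demand libraries.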

## Abstract

Genetic algorithms are a powerful method for solving optimization problems with complex cost functions over vast search spaces; they rely in particular on recombining parts of previous solutions. Crossover operators play a crucial role in this context. Here, we describe a large class of these operators designed for searching over spaces of graphs. These operators are based on introducing small cuts into graphs and rejoining the resulting induced subgraphs of two parents. This form of cut-and-join crossover can be restricted in a consistent way to preserve local properties such as vertex degrees (valency) or bond orders, as well as global properties such as graph-theoretic planarity. In contrast to crossover on strings, cut-and-join crossover on graphs is powerful enough to ergodically explore chemical space even in the absence of mutation operators. Extensive benchmarking shows that the offspring of molecular graphs are again plausible molecules with high probability, while at the same time crossover drastically increases the diversity compared to initial molecule libraries. Moreover, desirable properties such as favorable indices of synthesizability are preserved with sufficient frequency that candidate offspring can be filtered efficiently for such properties. As an application, we used the cut-and-join crossover in REvoLd, a GA-based system for computer-aided drug design. In optimization runs searching for ligands binding to four different target proteins, we consistently found candidate molecules with binding constants exceeding the best known binders as well as candidates found in make-on-demand libraries.
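The cut-and-join idea described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: it assumes the simplest possible cut (a single bridge bond in each parent), a hypothetical dictionary-based molecule representation, and hypothetical helper names (`component`, `valence`, `crossover`). Rejoining the two kept fragments with a bond of the same order as the cut bonds leaves the valency of every retained atom unchanged.

```python
from collections import deque


def component(bonds, start, removed):
    """Atoms reachable from `start` once the bond `removed` is deleted."""
    seen, queue = {start}, deque([start])
    while queue:
        a = queue.popleft()
        for bond in bonds:
            if bond == removed or a not in bond:
                continue
            (b,) = bond - {a}
            if b not in seen:
                seen.add(b)
                queue.append(b)
    return seen


def valence(mol, atom):
    """Sum of bond orders at `atom` (heavy-atom graph, hydrogens implicit)."""
    return sum(order for bond, order in mol["bonds"].items() if atom in bond)


def crossover(p1, cut1, p2, cut2):
    """Single-bond cut-and-join (illustrative assumption, not the paper's code).

    cut1 = (u, v): cut bond u-v in p1 and keep the fragment containing u.
    cut2 = (x, y): cut bond x-y in p2 and keep the fragment containing y.
    The fragments are rejoined by a new bond u-y of the same order.
    """
    (u, v), (x, y) = cut1, cut2
    b1, b2 = frozenset(cut1), frozenset(cut2)
    order = p1["bonds"][b1]
    if p2["bonds"][b2] != order:
        raise ValueError("cut bonds must have equal order to preserve valency")
    keep1 = component(p1["bonds"], u, b1)
    keep2 = component(p2["bonds"], y, b2)
    if v in keep1 or x in keep2:
        raise ValueError("this sketch only handles cuts through bridge bonds")

    # Relabel atoms by parent so identifiers from the two parents cannot clash.
    atoms = {("p1", a): p1["atoms"][a] for a in keep1}
    atoms.update({("p2", a): p2["atoms"][a] for a in keep2})
    bonds = {}
    for tag, parent, keep in (("p1", p1, keep1), ("p2", p2, keep2)):
        for bond, o in parent["bonds"].items():
            if bond <= keep:  # both endpoints lie inside the kept fragment
                bonds[frozenset((tag, a) for a in bond)] = o
    bonds[frozenset({("p1", u), ("p2", y)})] = order  # the join step
    return {"atoms": atoms, "bonds": bonds}


# Heavy-atom graphs: ethanol (C-C-O) and methylamine (C-N).
ethanol = {"atoms": {0: "C", 1: "C", 2: "O"},
           "bonds": {frozenset({0, 1}): 1, frozenset({1, 2}): 1}}
methylamine = {"atoms": {0: "C", 1: "N"},
               "bonds": {frozenset({0, 1}): 1}}

# Keep the ethyl fragment of ethanol and the amine nitrogen of methylamine.
child = crossover(ethanol, (1, 2), methylamine, (0, 1))
print(sorted(child["atoms"].values()))
```

In this toy case the child is the heavy-atom graph of ethylamine. Restricting the cut to one bond per parent keeps the sketch short; the operators in the paper allow more general small cuts and further restrictions such as planarity, which this example does not attempt to model.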

### Scientific contribution

We define cut-and-join crossover operators on a variety of graph classes including molecular graphs. This constitutes a mathematically simple and well-characterized approach to recombination of molecules that performed very well in real-life CADD tasks.

The online version contains supplementary material available at 10.1186/s13321-025-00958-w.

## Full-text entities

- **Diseases:** CADD (MESH:C000719218), STONED (MESH:D007669)
- **Chemicals:** ether (MESH:D004986), peroxides (MESH:D010545), H (MESH:D006859), anhydrides (MESH:D000812), sulfur (MESH:D013455), naphthalene (MESH:C031721), allyl alcohol (MESH:C006463), carbon (MESH:D002244), ILP (-), allenes (MESH:C025947), GAs (MESH:D005708), CTP (MESH:D003570), oxygen (MESH:D010100), water (MESH:D014867)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12175394/full.md

## Figures

17 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12175394/full.md

## References

6 references — full list in the complete paper: https://tomesphere.com/paper/PMC12175394/full.md

---
Source: https://tomesphere.com/paper/PMC12175394