Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
Marwin H.S. Segler, Thierry Kogej, Christian Tyrchan, Mark P. Waller

TL;DR
This paper shows that recurrent neural networks, fine-tuned on small sets of known actives, can generate novel, target-specific molecules for drug discovery, reproduce held-out active compounds, and support de novo design.
Contribution
It introduces a method to fine-tune RNNs for generating biologically active molecules, advancing computational drug design techniques.
Findings
Reproduced 14% of 6051 hold-out test molecules for Staphylococcus aureus.
Reproduced 28% of 1240 hold-out test molecules for Plasmodium falciparum (malaria).
Coupled with a scoring function, the model can carry out the complete de novo drug design cycle.
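The reproduction figures above can be read as a set-overlap metric: the share of hold-out test molecules that also turn up among the model's generated molecules. The following is a minimal sketch of that reading, not the authors' evaluation code. It assumes molecules are already available as canonical strings (for example SMILES, which this summary does not specify), and the function name `reproduction_rate` and the toy data are hypothetical. The counts at the end are only approximate, since they are back-calculated from the rounded percentages and the test-set sizes given in the abstract.

```python
def reproduction_rate(generated, held_out):
    """Fraction of distinct held-out molecules that also appear in the generated set.

    Both arguments are iterables of molecule strings that are assumed to be
    canonicalized already, so plain string equality means "same molecule".
    """
    generated_set = set(generated)
    held_out_set = set(held_out)
    if not held_out_set:
        raise ValueError("held_out must contain at least one molecule")
    reproduced = held_out_set & generated_set
    return len(reproduced) / len(held_out_set)


# Toy data: hypothetical strings for illustration, not taken from the paper.
generated = ["CCO", "c1ccccc1", "CC(=O)O", "CCN"]
held_out = ["CCO", "CC(=O)O", "CCCC", "c1ccncc1"]
rate = reproduction_rate(generated, held_out)
print(f"reproduced {rate:.0%} of held-out molecules")

# Approximate molecule counts implied by the reported, rounded percentages.
approx_counts = {
    "S. aureus": round(0.14 * 6051),
    "P. falciparum": round(0.28 * 1240),
}
print(approx_counts)
```

In practice the strings would need to be canonicalized with a cheminformatics toolkit before comparison, because the same molecule can be written in more than one way.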
Abstract
In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active towards a given biological target, we propose to fine-tune the model with small sets of molecules, which are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (Malaria) it reproduced 28% of 1240 test molecules. When coupled…
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Protein Structure and Dynamics
