Molecular Generation with Recurrent Neural Networks (RNNs)
Esben Jannik Bjerrum, Richard Threlfall

TL;DR
This paper demonstrates that recurrent neural networks with LSTM cells can generate novel, chemically sensible molecules by learning from existing compounds encoded as SMILES, aiding virtual drug library creation.
Contribution
The paper shows that RNNs can learn chemical rules from SMILES strings and generate synthesizable molecules whose properties resemble those of the training data, advancing AI-driven drug discovery methods.
Findings
RNNs can generate chemically sensible molecules.
Generated molecules have similar properties to training data.
Most generated compounds are synthesizable according to assessments.
Abstract
The potential number of drug-like small molecules is estimated to be between 10^23 and 10^60, while current databases of known compounds are orders of magnitude smaller, with approximately 10^8 compounds. This discrepancy has led to an interest in generating virtual libraries using hand-crafted chemical rules and fragment-based methods to cover a larger area of chemical space and generate chemical libraries for use in in silico drug discovery endeavors. Here it is explored to what extent a recurrent neural network with long short-term memory cells can figure out sensible chemical rules and generate synthesizable molecules by being trained on existing compounds encoded as SMILES. The networks can to a high extent generate novel but chemically sensible molecules. The properties of the molecules are tuned by training on two different datasets consisting of fragment-like molecules and drug…
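To make "trained on existing compounds encoded as SMILES" concrete, the following is a minimal sketch of how SMILES strings might be turned into next-character training data for a character-level recurrent network. It is an illustration under stated assumptions, not the authors' preprocessing: the function names, the start marker `^`, and the end marker (a newline) are hypothetical, and splitting by single characters is a simplification (a two-letter atom such as `Cl` becomes two tokens).

```python
# Minimal sketch of character-level SMILES encoding (assumed scheme, not the paper's code).
START = "^"   # assumed start-of-sequence marker
END = "\n"    # assumed end-of-sequence marker


def build_vocab(smiles_list):
    """Map every character seen in the data, plus the markers, to an integer index."""
    chars = sorted({ch for s in smiles_list for ch in s})
    vocab = [START] + chars + [END]
    return {ch: i for i, ch in enumerate(vocab)}


def encode(smiles, char_to_idx):
    """Wrap a SMILES string in start/end markers and convert it to indices."""
    return [char_to_idx[ch] for ch in START + smiles + END]


def one_hot(indices, vocab_size):
    """Expand a list of indices into one-hot rows, one row per character."""
    return [[1 if j == i else 0 for j in range(vocab_size)] for i in indices]


def training_pair(indices):
    """Next-character prediction: the target is the input shifted by one position."""
    return indices[:-1], indices[1:]


smiles_data = ["CCO", "c1ccccc1"]  # ethanol and benzene, used only as toy examples
char_to_idx = build_vocab(smiles_data)
encoded = encode("CCO", char_to_idx)
inputs, targets = training_pair(encoded)
```

A network trained on such pairs learns to predict each character from the ones before it. New strings can then be sampled one character at a time, starting from the start marker and stopping when the end marker is produced. Whether a sampled string is a valid molecule has to be checked separately with a cheminformatics toolkit.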
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Various Chemistry Research Topics
