MolecularRNN: Generating realistic molecular graphs with optimized properties
Mariya Popova, Mykhailo Shvets, Junier Oliva, Olexandr Isayev

TL;DR
MolecularRNN is a graph recurrent generative model that produces realistic, diverse molecular graphs and can be tuned toward target properties. The authors report that it outperforms prior methods at shifting lipophilicity, drug-likeness, and melting point into desired ranges.
Contribution
The paper introduces MolecularRNN, a graph recurrent model for molecular generation that combines likelihood pretraining on a large molecule database with policy gradient tuning for property optimization.
Findings
Achieves 100% validity when rejection sampling based on valency constraints is applied.
Shifts the distributions of lipophilicity, drug-likeness, and melting point significantly toward the desired ranges.
Outperforms state-of-the-art methods in property optimization.
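The valency-based rejection sampling mentioned in the findings can be illustrated with a small sketch. This is not the authors' implementation: the valence table, the uniform `sample_bond_order` stand-in for the model's edge distribution, and all function names are assumptions made for illustration. The idea shown is only that a sampled bond is rejected and resampled whenever it would push an atom past its maximum valence.

```python
import random

# Assumed maximum valences for a few common atoms (illustrative, not from the paper).
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}
BOND_ORDERS = [0, 1, 2, 3]  # 0 means "no bond"


def sample_bond_order(rng):
    """Hypothetical stand-in for the model's edge distribution (uniform here)."""
    return rng.choice(BOND_ORDERS)


def generate_molecule(n_atoms, rng, max_tries=50):
    """Grow a graph atom by atom, rejecting bonds that break valency limits."""
    atoms = []   # element symbol per node
    used = []    # valence already consumed per node
    bonds = {}   # (i, j) with i < j -> bond order
    for j in range(n_atoms):
        atoms.append(rng.choice(list(MAX_VALENCE)))
        used.append(0)
        for i in range(j):
            for _ in range(max_tries):
                order = sample_bond_order(rng)
                fits_i = used[i] + order <= MAX_VALENCE[atoms[i]]
                fits_j = used[j] + order <= MAX_VALENCE[atoms[j]]
                if fits_i and fits_j:
                    break  # accept this sample
            else:
                order = 0  # fall back to "no bond", which is always valid
            if order:
                bonds[(i, j)] = order
                used[i] += order
                used[j] += order
    return atoms, bonds


def is_valid(atoms, bonds):
    """Check that no atom exceeds its assumed maximum valence."""
    used = [0] * len(atoms)
    for (i, j), order in bonds.items():
        used[i] += order
        used[j] += order
    return all(u <= MAX_VALENCE[a] for u, a in zip(used, atoms))
```

Because invalid bonds are never accepted, every generated graph passes `is_valid` by construction. The sketch does not enforce connectivity or any other chemical constraint.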
Abstract
Designing new molecules with a set of predefined properties is a core problem in modern drug discovery and development. There is a growing need for de novo design methods that address this problem. We present MolecularRNN, a graph recurrent generative model for molecular structures. Our model generates diverse, realistic molecular graphs after likelihood pretraining on a large database of molecules. We analyze our pretrained models on large-scale generated datasets of 1 million samples. Further, the model is tuned with a policy gradient algorithm, given a critic that estimates the reward for the property of interest. We show a significant distribution shift toward the desired range for lipophilicity, drug-likeness, and melting point, outperforming state-of-the-art works. With rejection sampling based on valency constraints, our model yields 100% validity.…
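The policy gradient tuning described in the abstract can be sketched with a toy REINFORCE loop. Everything here is an assumption for illustration: the policy is a single categorical distribution over atom tokens rather than a graph recurrent network, and `critic` is a hypothetical reward (the fraction of oxygen tokens) standing in for a learned property estimator.

```python
import math
import random

TOKENS = ["C", "N", "O", "F"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def critic(sequence):
    """Hypothetical reward: fraction of oxygen tokens in the sample."""
    return sequence.count("O") / len(sequence)


def train(steps=500, seq_len=5, lr=0.5, seed=0):
    """Tune a toy categorical policy with REINFORCE and a running baseline."""
    rng = random.Random(seed)
    logits = [0.0] * len(TOKENS)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        idx = rng.choices(range(len(TOKENS)), weights=probs, k=seq_len)
        reward = critic([TOKENS[i] for i in idx])
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)  # moving-average baseline
        # Gradient of sum_t log pi(a_t) w.r.t. the logits:
        # sum_t (onehot(a_t) - probs)
        grad = [-seq_len * p for p in probs]
        for i in idx:
            grad[i] += 1.0
        logits = [l + lr * advantage * g for l, g in zip(logits, grad)]
    return softmax(logits)
```

Calling `train()` returns the tuned token probabilities. In this toy setting the probability of "O" should rise well above its initial 0.25, which mirrors, in miniature, the distribution shift toward a rewarded property.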
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Protein Structure and Dynamics
