Improving Molecule Properties Through 2-Stage VAE
Chenghui Zhou, Barnabas Poczos

TL;DR
This paper introduces a two-stage VAE that improves manifold recovery so that generated molecules better match the property statistics of the training data, with significant improvements reported on drug and polymer datasets.
Contribution
The paper proposes a novel 2-stage VAE architecture that improves manifold recovery and data similarity in molecule generation tasks.
Findings
Significant improvement in property statistics on the ChEMBL dataset
Generated data more similar to the training data than with a single-stage VAE
Effective on polymer data as well
Abstract
The variational autoencoder (VAE) is a popular method for drug discovery, and a great many architectures and pipelines have been proposed to improve its performance. But the VAE model itself suffers from deficiencies, such as poor manifold recovery when the data lie on a low-dimensional manifold embedded in a higher-dimensional ambient space, and these manifest themselves differently in each application. Their consequences in drug discovery are somewhat under-explored. In this paper, we study how to improve the similarity between the data generated by a VAE and the training dataset by improving manifold recovery via a 2-stage VAE, where the second-stage VAE is trained on the latent space of the first one. We experimentally evaluated our approach using the ChEMBL dataset as well as a polymer dataset. In both datasets, the 2-stage VAE method is able to improve the property statistics significantly…
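The data flow of a 2-stage VAE can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract only states that the second-stage VAE is trained on the latent space of the first, so the sampling path shown here (draw from the second-stage prior, decode to a first-stage latent, then decode to a molecule) is an assumption based on the usual two-stage formulation. All function names are hypothetical, and the toy encoder and decoders are stand-ins for trained networks.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def build_stage2_training_set(
    encode1: Callable[[str], Vector], molecules: Sequence[str]
) -> List[Vector]:
    """Encode each training molecule with the (already trained) stage-1 encoder.

    The returned latent vectors are what the stage-2 VAE would be trained on.
    """
    return [encode1(m) for m in molecules]


def sample_two_stage(
    decode1: Callable[[Vector], str],
    decode2: Callable[[Vector], Vector],
    n_samples: int,
    stage2_latent_dim: int,
    seed: int = 0,
) -> List[str]:
    """Assumed sampling path: u ~ N(0, I) -> stage-2 decoder -> stage-1 decoder."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(stage2_latent_dim)]
        z = decode2(u)              # point in the stage-1 latent space
        samples.append(decode1(z))  # molecule string
    return samples


# Toy stand-ins (hypothetical; real models would be neural networks).
def toy_encode1(smiles: str) -> Vector:
    # Two crude features: string length and number of 'C' characters.
    return [float(len(smiles)), float(smiles.count("C"))]


def toy_decode2(u: Vector) -> Vector:
    # Shift the first coordinate; a trained stage-2 decoder would learn this map.
    return [2.0 + u[0], u[1]]


def toy_decode1(z: Vector) -> str:
    # Emit a carbon chain whose length depends on the first latent coordinate.
    return "C" * max(1, round(abs(z[0])))


if __name__ == "__main__":
    latents = build_stage2_training_set(toy_encode1, ["CCO", "C"])
    print(latents)
    print(sample_two_stage(toy_decode1, toy_decode2, n_samples=3, stage2_latent_dim=2))
```

The point of the sketch is only the ordering of the two stages: the first-stage encoder produces the training data for the second stage, and generation passes through both decoders in reverse order.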
Taxonomy
Topics: Cell Image Analysis Techniques · Machine Learning in Materials Science · Machine Learning and Data Classification
