Objective-Agnostic Enhancement of Molecule Properties via Multi-Stage VAE
Chenghui Zhou, Barnabas Poczos

TL;DR
This paper introduces a multi-stage VAE approach for drug discovery that improves manifold recovery and, without relying on property predictors, improves the property statistics of generated molecules, with substantial gains reported on multiple datasets.
Contribution
The paper presents a multi-stage VAE method that improves the quality and property statistics of generated molecules, particularly when the data lie on a low-dimensional manifold, without using property predictors during training.
Findings
Multi-stage VAE improves manifold recovery on synthetic data.
Enhanced molecule property statistics on the ChEMBL dataset.
Increased active molecule generation compared to one-stage models.
Abstract
The variational autoencoder (VAE) is a popular method for drug discovery, and various architectures and pipelines have been proposed to improve its performance. However, VAE approaches are known to suffer from poor manifold recovery when the data lie on a low-dimensional manifold embedded in a higher-dimensional ambient space [Dai and Wipf, 2019]. The consequences of this in drug discovery are somewhat under-explored. In this paper, we explore applying a multi-stage VAE approach, which can improve manifold recovery on a synthetic dataset, to the field of drug discovery. We experimentally evaluate our multi-stage VAE approach on the ChEMBL dataset and demonstrate its ability to substantially improve the property statistics of generated molecules over pre-existing methods without incorporating property predictors into the training pipeline. We further fine-tune our models on two curated and…
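To make the multi-stage idea concrete, the sketch below shows how sampling could proceed in a two-stage VAE of the kind described by Dai and Wipf (2019): a second-stage decoder maps Gaussian noise to first-stage latent codes, and the first-stage decoder maps those codes to data space. This is an illustrative assumption, not the pipeline of the paper summarized here, which may differ. The helper names (`make_affine_decoder`, `sample_two_stage`), the dimensions, and the random affine maps standing in for trained decoder networks are all hypothetical.

```python
import random


def make_affine_decoder(in_dim, out_dim, rng):
    """Hypothetical stand-in for a trained decoder: a fixed random affine map."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(out_dim)] for _ in range(in_dim)]
    bias = [rng.gauss(0.0, 1.0) for _ in range(out_dim)]

    def decode(h):
        # Compute h @ weights + bias for a single input vector h.
        return [
            sum(h[i] * weights[i][j] for i in range(in_dim)) + bias[j]
            for j in range(out_dim)
        ]

    return decode


def sample_two_stage(n, latent_dim, decode_stage2, decode_stage1, rng):
    """Ancestral sampling: noise u -> first-stage latent z -> data-space sample x."""
    samples = []
    for _ in range(n):
        u = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # u ~ N(0, I)
        z = decode_stage2(u)  # second stage maps noise to first-stage latent codes
        x = decode_stage1(z)  # first stage maps latent codes to data space
        samples.append(x)
    return samples


rng = random.Random(0)
latent_dim, data_dim = 4, 16  # assumed sizes, chosen only for illustration
decode_stage2 = make_affine_decoder(latent_dim, latent_dim, rng)
decode_stage1 = make_affine_decoder(latent_dim, data_dim, rng)
samples = sample_two_stage(8, latent_dim, decode_stage2, decode_stage1, rng)
print(len(samples), len(samples[0]))
```

In a real model, both decoders would be neural networks trained in sequence, with the second stage fit to the latent codes produced by the first; the affine maps here only show how the stages compose at sampling time.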
Taxonomy
Topics: Cell Image Analysis Techniques · Machine Learning and Data Classification · Molecular Biology Techniques and Applications
