Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations
Yair Schiff, Vijil Chenthamarakshan, Samuel Hoffman, Karthikeyan Natesan Ramamurthy, Payel Das

TL;DR
This paper augments deep generative models with topological data analysis (TDA) representations, specifically persistence images, to encode 3D molecular geometry. On the QM9 benchmark, the augmentation improves how accurately a character-based VAE models structural composition.
Contribution
The paper integrates persistence images, a TDA representation, into a character-based Variational Auto-Encoder (VAE), so that the model captures 3D structural information and outperforms state-of-the-art generative neural nets at modeling the structural composition of QM9.
Findings
TDA augmentation improves structural modeling accuracy.
Generated molecules are valid, novel, and diverse.
Generated molecules show a distinct electronic property distribution, with a larger share of samples having a small HOMO-LUMO gap.
Abstract
Deep generative models have emerged as a powerful tool for learning useful molecular representations and designing novel molecules with desired properties, with applications in drug discovery and material design. However, most existing deep generative models are restricted due to lack of spatial information. Here we propose augmentation of deep generative models with topological data analysis (TDA) representations, known as persistence images, for robust encoding of 3D molecular geometry. We show that the TDA augmentation of a character-based Variational Auto-Encoder (VAE) outperforms state-of-the-art generative neural nets in accurately modeling the structural composition of the QM9 benchmark. Generated molecules are valid, novel, and diverse, while exhibiting distinct electronic property distribution, namely higher sample population with small HOMO-LUMO gap. These results demonstrate…
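To make the idea of a persistence image more concrete, here is a minimal sketch of how a set of 3D atom coordinates could be turned into one. It is not the authors' pipeline. It assumes only 0-dimensional Vietoris-Rips persistence (connected components, computed with a union-find over sorted pairwise distances), an unnormalized Gaussian kernel, and a linear persistence weighting. The function names `h0_persistence` and `persistence_image`, the grid parameters, and the example coordinates are all hypothetical.

```python
import numpy as np


def h0_persistence(points):
    """Finite (birth, death) pairs of 0-dimensional Vietoris-Rips persistence.

    Every point is born at scale 0; a component dies when an edge merges it
    into another one, so the deaths are the minimum-spanning-tree edge lengths.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    iu, ju = np.triu_indices(n, k=1)      # all unordered pairs (i < j)
    order = np.argsort(dist[iu, ju])      # process edges from short to long
    parent = list(range(n))               # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = []
    for e in order:
        i, j = int(iu[e]), int(ju[e])
        a, b = find(i), find(j)
        if a != b:                         # edge merges two components
            parent[a] = b
            pairs.append((0.0, float(dist[i, j])))
    return np.array(pairs, dtype=float).reshape(-1, 2)


def persistence_image(diagram, resolution=8, max_value=3.0, sigma=0.2):
    """Rasterize a diagram on a (birth, persistence) grid.

    Each point contributes a Gaussian bump weighted by its persistence.
    Rows index persistence and columns index birth; both span [0, max_value].
    """
    births = diagram[:, 0]
    pers = diagram[:, 1] - diagram[:, 0]
    centers = (np.arange(resolution) + 0.5) * max_value / resolution
    bx, py = np.meshgrid(centers, centers, indexing="xy")
    image = np.zeros((resolution, resolution))
    for b, p in zip(births, pers):
        bump = np.exp(-((bx - b) ** 2 + (py - p) ** 2) / (2 * sigma ** 2))
        image += p * bump                  # linear weighting by persistence
    return image


# Hypothetical three-atom geometry (arbitrary length units), for illustration.
coords = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0]])
diagram = h0_persistence(coords)
feature = persistence_image(diagram).flatten()  # fixed-length vector, here 64 values
```

The flattened image has the same length regardless of how many atoms the molecule has, which is what makes this kind of representation convenient as an extra input to a neural model. How the paper combines it with the character-based VAE is not described in this summary, and a fuller treatment would normally also include higher-dimensional features such as loops and voids.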
Taxonomy
Topics: Topological and Geometric Data Analysis · Cell Image Analysis Techniques · Image Retrieval and Classification Techniques
