Variational embedding of protein folding simulations using Gaussian mixture variational autoencoders
Mahdi Ghorbani, Samarjeet Prasad, Jeffery B. Klauda, Bernard R. Brooks

TL;DR
This paper introduces GMVAE, a machine learning approach that reduces dimensionality and clusters biomolecular conformations, effectively capturing the protein folding landscape and enabling kinetic analysis.
Contribution
The novel GMVAE method combines dimensionality reduction and clustering for protein folding data using a Gaussian mixture prior and Gumbel-softmax, providing interpretable free energy landscapes.
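To illustrate the Gumbel-softmax ingredient named above, here is a minimal NumPy sketch of how a relaxed (differentiable) sample of a cluster assignment can be drawn from unnormalized cluster scores. It is not the authors' implementation: the function name `gumbel_softmax_sample`, the temperature parameter `tau`, and the example logits are assumptions chosen for illustration.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """Draw a relaxed one-hot sample over clusters (illustrative sketch).

    logits: array of shape (..., n_clusters) with unnormalized log-probabilities.
    tau:    temperature (hypothetical name); smaller values give samples
            closer to one-hot vectors.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)
    # Uniform noise, clipped away from 0 and 1 so both logs stay finite.
    u = np.clip(rng.uniform(size=logits.shape), 1e-10, 1.0 - 1e-10)
    gumbel_noise = -np.log(-np.log(u))
    # Perturb the logits, scale by temperature, then apply a stable softmax.
    y = (logits + gumbel_noise) / tau
    y = y - y.max(axis=-1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

# Example: 4 conformations, 3 candidate clusters, equal prior scores.
sample = gumbel_softmax_sample(np.zeros((4, 3)), tau=0.5,
                               rng=np.random.default_rng(0))
```

Each row of `sample` is a probability vector over clusters. Because the sample is a smooth function of the logits, gradients can flow through the discrete cluster choice during training, which is the usual reason for using this relaxation.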
Findings
GMVAE captures the folding funnel structure.
Latent space enables accurate kinetic analysis.
Results agree with established dynamical models.
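One common way to run a kinetic analysis on cluster assignments is to estimate a Markov transition matrix from the sequence of cluster labels along a trajectory. The sketch below shows that generic technique under stated assumptions; this summary does not say which estimator the paper uses, and the names `transition_matrix`, `labels`, `n_states`, and `lag` are hypothetical.

```python
import numpy as np

def transition_matrix(labels, n_states, lag=1):
    """Row-normalized transition counts at a given lag (illustrative sketch).

    labels:   1-D sequence of integer cluster indices along one trajectory.
    n_states: number of clusters.
    lag:      lag time in frames; must be at least 1.
    """
    labels = np.asarray(labels)
    counts = np.zeros((n_states, n_states))
    # Count every observed jump from state i at time t to state j at t + lag.
    np.add.at(counts, (labels[:-lag], labels[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # States with no outgoing transitions keep an all-zero row.
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Example: a short two-state trajectory.
T = transition_matrix([0, 0, 1, 1, 0], n_states=2, lag=1)
```

The eigenvalues of such a matrix are commonly used to estimate relaxation timescales between metastable states. This simple count-based estimator does not enforce detailed balance.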
Abstract
Conformational sampling of biomolecules using molecular dynamics simulations often produces large amounts of high-dimensional data that are difficult to interpret using conventional analysis techniques. Dimensionality reduction methods are thus required to extract useful and relevant information. Here we devise a machine learning method, the Gaussian mixture variational autoencoder (GMVAE), that can simultaneously perform dimensionality reduction and clustering of biomolecular conformations in an unsupervised way. We show that GMVAE can learn a reduced representation of the free energy landscape of protein folding with highly separated clusters that correspond to the metastable states during folding. Since GMVAE uses a mixture of Gaussians as the prior, it can directly acknowledge the multi-basin nature of the protein folding free-energy landscape. To make the model end-to-end…
Taxonomy
Methods: Gaussian Mixture Variational Autoencoder
