All SMILES Variational Autoencoder
Zaccary Alperstein, Artem Cherkasov, Jason Tyler Rolfe

TL;DR
This paper introduces All SMILES VAE, a novel variational autoencoder that encodes multiple SMILES strings per molecule to create a near-bijective, molecule-based latent space, improving molecular property prediction and optimization.
Contribution
It proposes a SMILES-based VAE that encodes multiple SMILES strings of each molecule, rather than a single string, to improve latent space quality and property prediction accuracy.
Findings
Outperforms prior state-of-the-art methods on property regression tasks
Achieves near-bijective mapping between molecules and latent space
Improves molecular property optimization results
Abstract
Variational autoencoders (VAEs) defined over SMILES string and graph-based representations of molecules promise to improve the optimization of molecular properties, thereby revolutionizing the pharmaceuticals and materials industries. However, these VAEs are hindered by the non-unique nature of SMILES strings and the computational cost of graph convolutions. To efficiently pass messages along all paths through the molecular graph, we encode multiple SMILES strings of a single molecule using a set of stacked recurrent neural networks, pooling hidden representations of each atom between SMILES representations, and use attentional pooling to build a final fixed-length latent representation. By then decoding to a disjoint set of SMILES strings of the molecule, our All SMILES VAE learns an almost bijective mapping between molecules and latent representations near the high-probability-mass…
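The two pooling steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `pool_atoms_across_smiles` and `attention_pool`, the use of a plain average for the per-atom pooling, the dot-product attention scores, and the toy hidden vectors are all assumptions made for illustration, and the recurrent networks that would produce the hidden states are omitted.

```python
import math


def pool_atoms_across_smiles(hidden, atom_ids):
    """Average the hidden vectors that refer to the same atom.

    hidden[s][t]   : hidden vector at token t of SMILES string s (hypothetical values)
    atom_ids[s][t] : index of the atom that token denotes, or None for
                     non-atom tokens such as bonds or ring-closure digits
    Averaging is an assumption; the excerpt only says hidden states are pooled.
    """
    sums, counts = {}, {}
    for states, ids in zip(hidden, atom_ids):
        for vec, atom in zip(states, ids):
            if atom is None:
                continue
            if atom not in sums:
                sums[atom] = [0.0] * len(vec)
                counts[atom] = 0
            sums[atom] = [a + b for a, b in zip(sums[atom], vec)]
            counts[atom] += 1
    return {atom: [x / counts[atom] for x in total] for atom, total in sums.items()}


def attention_pool(vectors, query):
    """Collapse a variable-length list of vectors into one fixed-length vector.

    Scores are dot products with a query vector (an assumed scoring rule),
    normalised with a numerically stable softmax.
    """
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in vectors]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


# Toy example: ethanol written as "CCO" and as "OCC".
# Atoms are numbered 0 (methyl C), 1 (CH2 C), 2 (O); the second string visits them in reverse.
hidden = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # made-up states for "CCO"
    [[3.0, 1.0], [2.0, 1.0], [1.0, 2.0]],  # made-up states for "OCC"
]
atom_ids = [[0, 1, 2], [2, 1, 0]]

per_atom = pool_atoms_across_smiles(hidden, atom_ids)
atom_vectors = [per_atom[a] for a in sorted(per_atom)]
pooled, weights = attention_pool(atom_vectors, query=[0.0, 0.0])
```

Because each atom receives a single pooled vector regardless of which SMILES string it came from, the result depends on the molecule rather than on one particular string, which is the intuition behind the molecule-based latent space described above. The attention step then yields a fixed-length vector for molecules of any size.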
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Protein Structure and Dynamics
