Deep learning to generate in silico chemical property libraries and candidate molecules for small molecule identification in complex samples
Sean M. Colby, Jamie R. Nuñez, Nathan O. Hodas, Courtney D. Corley, Ryan R. Renslow

TL;DR
This paper introduces a deep learning framework, built on a variational autoencoder (VAE), that generates in silico chemical property libraries and candidate molecules to support small molecule identification in complex samples.
Contribution
It develops a VAE-based method with multitask training that expands reference libraries, which currently represent <1% of known molecules, and generates candidate molecules with desired properties directly, without requiring a library.
Findings
Successfully predicts chemical properties from structure
Generates candidate molecules with specific properties
Enables rapid molecule identification and design
Abstract
Comprehensive and unambiguous identification of small molecules in complex samples will revolutionize our understanding of the role of metabolites in biological systems. Existing and emerging technologies have enabled measurement of chemical properties of molecules in complex mixtures and, in concert, are sensitive enough to resolve even stereoisomers. Despite these experimental advances, small molecule identification is inhibited by (i) chemical reference libraries representing <1% of known molecules, limiting the number of possible identifications, and (ii) the lack of a method to generate candidate matches directly from experimental features (i.e. without a library). To this end, we developed a variational autoencoder (VAE) to learn a continuous numerical, or latent, representation of molecular structure to expand reference libraries for small molecule identification. We extended the…
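As a rough illustration of the idea described in the abstract, the sketch below shows a generic VAE objective with an added property-regression term, which is one common way to train a latent representation jointly with property prediction. It is not taken from the paper: the function names, the loss weights (`kl_weight`, `prop_weight`), and the use of mean squared error for the property term are all assumptions made for illustration, and the encoder and decoder networks are left out.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def multitask_loss(recon_nll, mu, log_var, prop_pred, prop_true,
                   kl_weight=1.0, prop_weight=1.0):
    """Hypothetical combined objective: reconstruction + KL + property MSE.

    recon_nll is the decoder's negative log-likelihood for the input
    structure; it is assumed to be computed elsewhere.
    """
    mse = sum((p - t) ** 2 for p, t in zip(prop_pred, prop_true)) / len(prop_true)
    kl = kl_to_standard_normal(mu, log_var)
    return recon_nll + kl_weight * kl + prop_weight * mse


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -0.2], [-1.0, -1.0]
    z = reparameterize(mu, log_var, rng)  # one latent sample
    loss = multitask_loss(3.0, mu, log_var, prop_pred=[150.2], prop_true=[148.0])
    print(len(z), loss > 3.0)
```

In a setup like this, the property term encourages molecules with similar properties to sit close together in the latent space, which is what would make it plausible to search the latent space for candidates matching a measured property.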
Taxonomy
Topics: Computational Drug Discovery Methods · Various Chemistry Research Topics · Analytical Chemistry and Chromatography
