Generative Enriched Sequential Learning (ESL) Approach for Molecular Design via Augmented Domain Knowledge
Mohammad Sajjad Ghaemi, Karl Grantham, Isaac Tamblyn, Yifeng Li, Hsu Kiang Ooi

TL;DR
This paper introduces an enriched sequential learning (ESL) approach that incorporates domain knowledge, such as drug-likeness scores, into generative models to improve the design of novel molecules with desirable properties.
Contribution
The paper proposes a novel ESL method that integrates domain knowledge into sequential learning models, enhancing their ability to generate molecules with targeted characteristics.
Findings
ESL improves the quality of generated molecules, as reflected in higher QED (quantitative estimate of drug-likeness) scores.
Incorporating domain knowledge reduces bias towards prevalent but less desirable molecules.
The approach demonstrates better learning of specific chemical patterns.
Abstract
Deploying generative machine learning techniques to generate novel chemical structures based on molecular fingerprint representation has been well established in molecular design. Typically, sequential learning (SL) schemes such as hidden Markov models (HMM) and, more recently, in the sequential deep learning context, recurrent neural network (RNN) and long short-term memory (LSTM) were used extensively as generative models to discover unprecedented molecules. To this end, emission probability between two states of atoms plays a central role without considering specific chemical or physical properties. Lack of supervised domain knowledge can mislead the learning procedure to be relatively biased to the prevalent molecules observed in the training data that are not necessarily of interest. We alleviated this drawback by augmenting the training data with domain knowledge, e.g.…
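The abstract is cut off before it says how the training data are augmented, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one simple possibility: a first-order Markov chain over SMILES characters in which each training molecule's transition counts are weighted by a domain-knowledge score. The scores, the toy molecules, and the names `fit_weighted_markov` and `sample` are all hypothetical; in practice a drug-likeness score could come from a cheminformatics toolkit such as RDKit.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # sentinel tokens marking sequence boundaries


def fit_weighted_markov(smiles_with_scores):
    """Estimate first-order transition probabilities over SMILES characters.

    Each sequence contributes its (positive) score, rather than 1, to every
    transition it contains, so high-scoring molecules shape the model more.
    Character-level tokens are a simplification: two-letter atoms such as
    Cl or Br would be split, so the toy data below avoids them.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for smiles, score in smiles_with_scores:
        tokens = [START, *smiles, END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += score
    probs = {}
    for prev, row in counts.items():
        total = sum(row.values())
        probs[prev] = {nxt: c / total for nxt, c in row.items()}
    return probs


def sample(probs, rng, max_len=40):
    """Draw one character sequence from the fitted chain (validity not checked)."""
    out, state = [], START
    for _ in range(max_len):
        choices = probs[state]
        state = rng.choices(list(choices), weights=list(choices.values()))[0]
        if state == END:
            break
        out.append(state)
    return "".join(out)


# Hypothetical toy data: (SMILES, made-up score in (0, 1]); not real QED values.
data = [("CCO", 0.4), ("CCN", 0.4), ("CCC", 0.4), ("c1ccccc1O", 0.9)]

weighted = fit_weighted_markov(data)
unweighted = fit_weighted_markov([(s, 1.0) for s, _ in data])

# The rare but high-scoring aromatic start token gains probability mass.
print(round(unweighted[START]["c"], 3), round(weighted[START]["c"], 3))
print(sample(weighted, random.Random(0)))
```

With uniform weights the chain simply mirrors how often each pattern occurs in the training set, which is the bias toward prevalent molecules that the abstract describes. Score weighting is one way to shift probability toward patterns seen in more desirable molecules; the paper may use a different mechanism.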
Taxonomy
Topics: Computational Drug Discovery Methods · Chemistry and Chemical Engineering · Machine Learning in Materials Science
