Molecular design for cardiac cell differentiation using a small dataset and decorated shape features
Fatemeh Etezadi, Shunichi Ito, Kosuke Yasui, Rodi Kado Abdalkader, Itsunari Minami, Motonari Uesugi, Ganesh Pandian Namasivayam, Haruko Nakano, Atsushi Nakano, Daniel M. Packwood

TL;DR
This paper presents a data-driven approach that uses decorated shape features and simple regression models to design new compounds for cardiac cell differentiation. The approach remains effective even with limited training data.
Contribution
It introduces decorated shape descriptors and a conservative molecular design strategy for small datasets, enabling effective compound discovery for stem cell differentiation.
Findings
Improved model performance with decorated shape features
Successful design of a new cardiomyocyte-inducing compound
Experimental validation confirms compound efficacy
Abstract
The discovery of small organic compounds for inducing stem cell differentiation is a time- and resource-intensive process. While data science could, in principle, facilitate the discovery of these compounds, novel approaches are required due to the difficulty of acquiring training data from large numbers of example compounds. In this paper, we demonstrate the design of a new compound for inducing cardiomyocyte differentiation using simple regression models trained with a data set containing only 80 examples. We introduce decorated shape descriptors, an information-rich molecular feature representation that integrates both molecular shape and hydrophilicity information. These models demonstrate improved performance compared to ones using standard molecular descriptors based on shape alone. Model overtraining is diagnosed using a new type of sensitivity analysis. Our new compound is…
Taxonomy
Topics: Tissue Engineering and Regenerative Medicine · 3D Printing in Biomedical Research · Cell Image Analysis Techniques
Methods: Sparse Evolutionary Training
