Using Molecular Embeddings in QSAR Modeling: Does it Make a Difference?

María Virginia Sabando, Ignacio Ponzoni, Evangelos E. Milios, Axel J. Soto

arXiv:2104.02604·q-bio.BM·May 9, 2022

TL;DR

This study systematically compares various molecular embedding techniques with traditional representations in QSAR modeling, revealing that embeddings do not significantly outperform traditional methods in predictive tasks.

Contribution

The paper provides a comprehensive experimental comparison of five molecular embedding methods (three unsupervised and two supervised) against traditional descriptors and fingerprints in QSAR scenarios.

Findings

01. Molecular embeddings do not significantly outperform traditional representations in QSAR tasks.

02. Supervised embeddings are competitive with traditional methods, while unsupervised embeddings tend to perform worse.

03. A large-scale evaluation with over 25,000 models highlights the need for careful selection of molecular representations.
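
As a rough illustration of what "do not significantly outperform" can mean in practice, the sketch below runs an exact paired sign-flip permutation test on per-dataset scores of two representations. This is a generic technique chosen for illustration; this summary does not state which statistical procedure the authors used, and the function name `paired_permutation_test` and all score values are hypothetical placeholders, not results from the paper.

```python
import itertools


def paired_permutation_test(scores_a, scores_b):
    """Exact two-sided paired sign-flip permutation test on the mean difference.

    scores_a and scores_b hold one score per dataset for two representations,
    evaluated on the same datasets in the same order.
    Returns the p-value under the null that the sign of each paired
    difference is arbitrary.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be paired (same length)")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)

    # Enumerate all 2**n sign assignments; only feasible for small n.
    extreme = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs)) / n)
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


if __name__ == "__main__":
    # Placeholder scores (made up for illustration, NOT from the paper).
    embedding_scores = [0.71, 0.64, 0.80, 0.58, 0.69]
    fingerprint_scores = [0.70, 0.66, 0.79, 0.60, 0.68]
    p_value = paired_permutation_test(embedding_scores, fingerprint_scores)
    print(f"p-value: {p_value:.3f}")
```

A large p-value here would mean the data give no evidence that one representation beats the other across datasets, which is not the same as showing the two are equivalent.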

Abstract

With the consolidation of deep learning in drug discovery, several novel algorithms for learning molecular representations have been proposed. Despite the interest of the community in developing new methods for learning molecular embeddings and their theoretical benefits, comparing molecular embeddings with each other and with traditional representations is not straightforward, which in turn hinders the process of choosing a suitable representation for QSAR modeling. A reason behind this issue is the difficulty of conducting a fair and thorough comparison of the different existing embedding approaches, which requires numerous experiments on various datasets and training scenarios. To close this gap, we reviewed the literature on methods for molecular embeddings and reproduced three unsupervised and two supervised molecular embedding techniques recently proposed in the literature. We…

Code & Models

Repositories

VirginiaSabando/MolecularEmbeddings (TensorFlow)
