S-MolSearch: 3D Semi-supervised Contrastive Learning for Bioactive Molecule Search
Gengmo Zhou, Zhen Wang, Feng Yu, Guolin Ke, Zhewei Wei, Zhifeng Gao

TL;DR
S-MolSearch introduces a semi-supervised contrastive learning framework that incorporates 3D molecular information and affinity data to improve ligand-based virtual screening, especially with limited and noisy data.
Contribution
To the authors' knowledge, it is the first framework to combine 3D molecular information and affinity information through semi-supervised contrastive learning for ligand-based virtual screening.
Findings
Outperforms existing methods on LIT-PCBA and DUD-E benchmarks.
Achieves higher AUROC, BEDROC, and EF scores.
Effectively utilizes unlabeled data through inverse optimal transport.
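For readers unfamiliar with the metrics named above, the enrichment factor (EF) can be sketched as follows. This is a minimal illustration of the standard definition (hit rate among the top-ranked fraction divided by the hit rate in the whole library), not the authors' evaluation code. The function name `enrichment_factor`, its parameters, and the rounding used to size the top fraction are assumptions made for this example.

```python
import numpy as np

def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked library.

    scores: higher means "more likely active" (assumed convention).
    labels: 1 for known actives, 0 for inactives or decoys.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_total = len(scores)
    n_actives = int(labels.sum())
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")

    # Size of the top slice; rounding is an assumption, conventions vary.
    n_top = max(1, int(round(fraction * n_total)))

    # Rank molecules from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    hits_top = int(labels[order[:n_top]].sum())

    # Hit rate in the top slice relative to the overall hit rate.
    return (hits_top / n_top) / (n_actives / n_total)

# Toy library of 10 molecules with 2 actives, both ranked first.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
ef_top20 = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
```

An EF of 1 corresponds to random ranking. Larger values mean actives are concentrated near the top of the ranked list, which is the behavior a virtual screening method aims for.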
Abstract
Virtual screening is an essential technique in the early phases of drug discovery, aimed at identifying promising drug candidates from vast molecular libraries. Recently, ligand-based virtual screening has garnered significant attention due to its efficacy in conducting extensive database screenings without relying on specific protein-binding site information. Obtaining binding affinity data for complexes is highly expensive, resulting in a limited amount of available data that covers a relatively small chemical space. Moreover, these datasets contain a significant amount of inconsistent noise. It is challenging to identify an inductive bias that consistently maintains the integrity of molecular activity during data augmentation. To tackle these challenges, we propose S-MolSearch, the first framework, to our knowledge, that leverages molecular 3D information and affinity information in…
Taxonomy
Topics: Computational Drug Discovery Methods · Chemical Synthesis and Analysis
Methods: Softmax · Attention Is All You Need · Contrastive Learning
