MaskMol: Knowledge-guided Molecular Image Pre-Training Framework for Activity Cliffs
Zhixiang Cheng, Hongxin Xiang, Pengsen Ma, Li Zeng, Xin Jin, Xixi Yang, Jianxin Lin, Yang Deng, Bosheng Song, Xinxin Feng, Changhui Deng, Xiangxiang Zeng

TL;DR
MaskMol is a knowledge-guided molecular image pre-training framework that effectively captures subtle structural differences in molecules, improving activity cliff estimation and drug discovery applications.
Contribution
The paper introduces MaskMol, a novel self-supervised learning framework that leverages molecular images and multiple levels of molecular knowledge for better activity cliff detection.
Findings
Outperforms 25 state-of-the-art models in activity cliff estimation.
Demonstrates high transferability across 20 macromolecular targets.
Provides high interpretability in identifying relevant molecular substructures.
Abstract
Activity cliffs are pairs of molecules that are structurally similar but differ significantly in potency. They can cause model representation collapse and make it difficult for a model to distinguish between the two molecules. Our research indicates that as molecular similarity increases, graph-based methods struggle to capture these nuances, whereas image-based approaches effectively retain the distinctions. Thus, we developed MaskMol, a knowledge-guided molecular image self-supervised learning framework. MaskMol accurately learns the representation of molecular images by considering multiple levels of molecular knowledge, such as atoms, bonds, and substructures. By utilizing pixel masking tasks, MaskMol extracts fine-grained information from molecular images, overcoming the limitations of existing deep learning models in identifying subtle structural changes. Experimental results…
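To make the definition of an activity cliff concrete, the following is a minimal sketch of how such pairs might be flagged in a dataset. It is not the paper's method: the fingerprints are toy sets of integers, the helper names (`tanimoto`, `find_activity_cliffs`) are hypothetical, and the thresholds (Tanimoto similarity of at least 0.9, potency gap of at least one log unit) are assumptions chosen for illustration.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def find_activity_cliffs(molecules, sim_threshold=0.9, potency_gap=1.0):
    """Return name pairs that are structurally similar yet differ in potency.

    `molecules` is a list of (name, fingerprint_bits, potency) tuples, where
    potency is on a log scale (e.g. pKi). Both thresholds are illustrative
    assumptions, not values taken from the paper.
    """
    cliffs = []
    for (name_a, fp_a, p_a), (name_b, fp_b, p_b) in combinations(molecules, 2):
        similar = tanimoto(fp_a, fp_b) >= sim_threshold
        potency_differs = abs(p_a - p_b) >= potency_gap
        if similar and potency_differs:
            cliffs.append((name_a, name_b))
    return cliffs


# Toy data: integer sets stand in for real structural fingerprints.
toy_molecules = [
    ("mol_A", frozenset(range(0, 20)), 8.5),   # potent
    ("mol_B", frozenset(range(0, 19)), 6.0),   # near-identical bits, much weaker
    ("mol_C", frozenset(range(0, 20)), 8.3),   # same bits as mol_A, similar potency
    ("mol_D", frozenset(range(10, 30)), 5.0),  # structurally dissimilar
]

cliff_pairs = find_activity_cliffs(toy_molecules)
print(cliff_pairs)
```

In this toy set, mol_B differs from mol_A and mol_C by a single fingerprint bit but is more than two log units weaker, so both pairs are flagged. mol_D is much weaker as well, but it is too dissimilar to count as a cliff partner.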
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Genetics, Bioinformatics, and Biomedical Research
