Empowering Chemical Structures with Biological Insights for Scalable Phenotypic Virtual Screening
Xiaoqing Lian, Pengsen Ma, Tengfeng Ma, Zhonghao Ren, Xibao Cai, Zhixiang Cheng, Bosheng Song, He Wang, Xiang Pan, Yangyang Chen, Sisi Yuan, Chen Lin

TL;DR
DECODE is a framework that enhances chemical structure representations with biological insights, enabling scalable and biologically meaningful virtual screening without requiring biological data at inference.
Contribution
This work introduces DECODE, a method that incorporates biological signals into chemical representations using limited paired transcriptomic and morphological data, improving drug similarity assessment and drug discovery performance.
Findings
Over 20% improvement in mechanism-of-action prediction in zero-shot settings.
Six-fold increase in hit rates for new anti-cancer agents.
Effective filtering of experimental noise in biological data.
Abstract
Motivation: The scalable identification of bioactive compounds is essential for contemporary drug discovery. This process faces a key trade-off: structural screening offers scalability but lacks biological context, whereas high-content phenotypic profiling provides deep biological insights but is resource-intensive. The primary challenge is to extract robust biological signals from noisy data and encode them into representations that do not require biological data at inference.
Results: This study presents DECODE (DEcomposing Cellular Observations of Drug Effects), a framework that bridges this gap by empowering chemical representations with intrinsic biological semantics to enable structure-based in silico biological profiling. DECODE leverages limited paired transcriptomic and morphological data as supervisory signals during training, enabling the extraction of a measurement-invariant…
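As an illustration of how structure-derived embeddings might be used for zero-shot mechanism-of-action (MoA) prediction, the sketch below assigns each query compound the MoA label of its most similar reference compound under cosine similarity. This is an assumption for illustration only: the abstract does not state how DECODE performs MoA prediction, and the function names, toy embeddings, and labels (`moa_a`, `moa_b`, `moa_c`) are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(queries: np.ndarray, references: np.ndarray) -> np.ndarray:
    """Return the (n_queries, n_references) matrix of cosine similarities."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    r = references / np.linalg.norm(references, axis=1, keepdims=True)
    return q @ r.T


def predict_moa(query_embeddings, reference_embeddings, reference_labels):
    """Label each query with the MoA of its nearest reference compound.

    Hypothetical nearest-neighbor scheme, not the method from the paper.
    """
    sims = cosine_similarity_matrix(query_embeddings, reference_embeddings)
    nearest = sims.argmax(axis=1)  # index of the most similar reference per query
    return [reference_labels[i] for i in nearest]


# Toy embeddings standing in for structure-derived representations (made up).
reference_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
reference_labels = ["moa_a", "moa_b", "moa_c"]

query_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.2, 0.8],
])

predictions = predict_moa(query_embeddings, reference_embeddings, reference_labels)
print(predictions)
```

In this scheme only the embeddings are needed at prediction time, which matches the abstract's stated goal of not requiring biological data at inference.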
Taxonomy
Topics: Cell Image Analysis Techniques · Computational Drug Discovery Methods · Machine Learning in Bioinformatics
