GastroDL-Fusion: A Dual-Modal Deep Learning Framework Integrating Protein-Ligand Complexes and Gene Sequences for Gastrointestinal Disease Drug Discovery
Ziyang Gao, Annie Cheung, Yihao Ou

TL;DR
GastroDL-Fusion is a dual-modal deep learning framework that combines protein-ligand structural data with gene sequences to improve drug discovery for gastrointestinal diseases.
Contribution
It introduces a novel approach integrating molecular graphs and gene embeddings via a multi-modal neural network for enhanced binding affinity prediction.
Findings
Achieved a mean absolute error (MAE) of 1.12 and a root mean squared error (RMSE) of 1.75 on binding affinity prediction, outperforming baseline models.
Improved prediction accuracy by combining structural and genetic features.
Demonstrated effectiveness on GI disease-related target datasets.
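For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how MAE and RMSE are typically computed for a regression task such as binding affinity prediction. The function names and the sample values are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: the average of |y - y_hat|."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average of (y - y_hat)^2."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Hypothetical affinity values, for illustration only (not from the paper).
measured = [6.0, 7.5, 5.0, 8.0]
predicted = [5.0, 7.5, 7.0, 7.0]

mae_value = mae(measured, predicted)
rmse_value = rmse(measured, predicted)
print(f"MAE = {mae_value:.3f}, RMSE = {rmse_value:.3f}")
```

RMSE penalizes large errors more heavily than MAE and is never smaller than MAE on the same data, which is consistent with the reported pair (1.75 versus 1.12).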
Abstract
Accurate prediction of protein-ligand binding affinity plays a pivotal role in accelerating the discovery of novel drugs and vaccines, particularly for gastrointestinal (GI) diseases such as gastric ulcers, Crohn's disease, and ulcerative colitis. Traditional computational models often rely on structural information alone and thus fail to capture the genetic determinants that influence disease mechanisms and therapeutic responses. To address this gap, we propose GastroDL-Fusion, a dual-modal deep learning framework that integrates protein-ligand complex data with disease-associated gene sequence information for drug and vaccine development. In our approach, protein-ligand complexes are represented as molecular graphs and modeled using a Graph Isomorphism Network (GIN), while gene sequences are encoded into biologically meaningful embeddings via a pre-trained Transformer (ProtBERT/ESM).…
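The structural branch named in the abstract can be illustrated with a small sketch. The code below shows a GIN-style update, in which each node's feature vector is combined with the sum of its neighbors' vectors and then transformed, followed by a sum readout to obtain one vector per graph. Everything else is an assumption made for illustration: the excerpt does not say how the two modalities are fused, so plain concatenation with a linear head is used here; the MLP of a real GIN layer is replaced by a single linear layer with ReLU; and the gene embedding is a hand-written placeholder rather than the output of ProtBERT or ESM. All function names, weights, and values are hypothetical.

```python
def relu(vec):
    return [max(0.0, x) for x in vec]


def linear(weights, bias, vec):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def gin_layer(features, neighbors, weights, bias, eps=0.0):
    """GIN-style update: h_v' = f((1 + eps) * h_v + sum of neighbor h_u).

    f is a single linear layer plus ReLU here, a stand-in for the MLP.
    """
    updated = []
    for v, h_v in enumerate(features):
        agg = [(1.0 + eps) * x for x in h_v]
        for u in neighbors[v]:
            agg = [a + x for a, x in zip(agg, features[u])]
        updated.append(relu(linear(weights, bias, agg)))
    return updated


def sum_readout(features):
    """Graph-level vector: element-wise sum over all nodes."""
    return [sum(column) for column in zip(*features)]


def fuse_and_predict(graph_vec, gene_vec, head_weights, head_bias):
    """Assumed fusion: concatenate both vectors, apply a linear head."""
    fused = graph_vec + gene_vec  # list concatenation
    return sum(w * x for w, x in zip(head_weights, fused)) + head_bias


# Toy 3-node path graph 0 - 1 - 2 with 2-dimensional node features.
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[1], [0, 2], [1]]
identity = [[1.0, 0.0], [0.0, 1.0]]

hidden = gin_layer(node_features, adjacency, identity, [0.0, 0.0])
graph_vec = sum_readout(hidden)

# Placeholder for a pre-trained sequence embedding (hypothetical values).
gene_vec = [0.5, -0.5]

prediction = fuse_and_predict(graph_vec, gene_vec, [0.1, 0.1, 1.0, 1.0], 0.0)
print(hidden, graph_vec, prediction)
```

Sum aggregation, rather than mean or max, is the defining choice of GIN: it keeps neighborhoods with different multisets of features distinguishable. In a trained model the weights would be learned and several such layers would be stacked.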
Taxonomy
Topics: Computational Drug Discovery Methods · Bioinformatics and Genomic Networks · Machine Learning in Bioinformatics
