GraphSeqLM: A Unified Graph Language Framework for Omic Graph Learning
Heming Zhang, Di Huang, Yixin Chen, Fuhai Li

TL;DR
GraphSeqLM introduces a novel framework combining graph neural networks with large language model-derived biological sequence embeddings to improve multi-omic data analysis for complex disease understanding.
Contribution
The paper presents GraphSeqLM as the first unified framework to integrate LLM-generated sequence embeddings with GNNs for biological data analysis.
Findings
Achieves higher predictive accuracy than existing methods.
Effectively captures complex biological relationships.
Enhances multi-omic data integration for precision medicine.
Abstract
The integration of multi-omic data is pivotal for understanding complex diseases, but its high dimensionality and noise present significant challenges. Graph Neural Networks (GNNs) offer a robust framework for analyzing large-scale signaling pathways and protein-protein interaction networks, yet they face limitations in expressivity when capturing intricate biological relationships. To address this, we propose Graph Sequence Language Model (GraphSeqLM), a framework that enhances GNNs with biological sequence embeddings generated by Large Language Models (LLMs). These embeddings encode structural and biological properties of DNA, RNA, and proteins, augmenting GNNs with enriched features for analyzing sample-specific multi-omic data. By integrating topological, sequence-derived, and biological information, GraphSeqLM demonstrates superior predictive accuracy and outperforms existing…
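The core idea in the abstract, augmenting each graph node's omic features with a sequence-derived embedding before message passing, can be illustrated with a minimal sketch. This is not the authors' architecture. The function names, the toy gene graph, the feature values, and the one-dimensional "sequence embeddings" are all hypothetical, and a plain neighbour average stands in for a learned GNN layer.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def concat_features(omic: Dict[str, Vector],
                    seq_emb: Dict[str, Vector]) -> Dict[str, Vector]:
    """Append each node's sequence embedding to its omic feature vector."""
    return {node: omic[node] + seq_emb[node] for node in omic}


def mean_aggregate(features: Dict[str, Vector],
                   edges: List[Tuple[str, str]]) -> Dict[str, Vector]:
    """One unweighted message-passing step: average a node with its neighbours."""
    # Every node starts with a self-loop so isolated nodes keep their features.
    neighbours = {node: [node] for node in features}
    for u, v in edges:
        # Edges are treated as undirected.
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, nbrs in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][i] for n in nbrs) / len(nbrs) for i in range(dim)
        ]
    return updated


# Toy inputs (assumed for illustration only): two omic measurements per gene
# and a one-dimensional stand-in for an LLM-derived sequence embedding.
omic = {"TP53": [1.0, 0.0], "MDM2": [0.0, 2.0], "EGFR": [3.0, 1.0]}
seq_emb = {"TP53": [0.5], "MDM2": [1.5], "EGFR": [-1.0]}
edges = [("TP53", "MDM2")]

fused = concat_features(omic, seq_emb)
hidden = mean_aggregate(fused, edges)
```

In this sketch the two connected nodes end up with identical averaged vectors, while the isolated node keeps its fused features unchanged. A real model would replace the average with learned, weighted transformations and would use high-dimensional embeddings from a pretrained sequence model.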
Taxonomy
Topics: Bioinformatics and Genomic Networks · Biomedical Text Mining and Ontologies · Semantic Web and Ontologies
