A Novel Framework for Multi-Modal Protein Representation Learning

Runjie Zheng; Zhen Wang; Anjie Qiao; Jiancong Xie; Jiahua Rao; Yuedong Yang

arXiv:2510.23273 · cs.LG · October 28, 2025

TL;DR

This paper introduces DAMPE, a new framework that effectively fuses multi-modal protein data by aligning intrinsic embeddings with optimal transport and integrating extrinsic relational information through conditional graph generation, improving protein function prediction.

Contribution

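DAMPE is a scalable method for multi-modal protein representation learning that combines optimal transport (OT)-based alignment with conditional graph generation (CGG)-based fusion. It targets two problems: cross-modal heterogeneity among intrinsic embeddings and noise in extrinsic relational graphs.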

Findings

01. DAMPE outperforms state-of-the-art methods on GO benchmarks.

02. OT-based alignment improves embedding consistency.

03. CGG-based fusion enhances information integration.

Abstract

Accurate protein function prediction requires integrating heterogeneous intrinsic signals (e.g., sequence and structure) with noisy extrinsic contexts (e.g., protein-protein interactions and GO term annotations). However, two key challenges hinder effective fusion: (i) cross-modal distributional mismatch among embeddings produced by pre-trained intrinsic encoders, and (ii) noisy relational graphs of extrinsic data that degrade GNN-based information aggregation. We propose Diffused and Aligned Multi-modal Protein Embedding (DAMPE), a unified framework that addresses these through two core mechanisms. First, we propose Optimal Transport (OT)-based representation alignment that establishes correspondence between intrinsic embedding spaces of different modalities, effectively mitigating cross-modal heterogeneity. Second, we develop a Conditional Graph Generation (CGG)-based information…
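To make the idea of OT-based alignment concrete, here is a minimal sketch of entropy-regularised optimal transport (Sinkhorn iterations) between two toy embedding sets. It is not the DAMPE implementation. The function names (`sinkhorn_plan`, `align_to_target`), the squared Euclidean cost, the uniform marginals, the barycentric projection, and the assumption that both modalities already share one embedding dimension are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.2, n_iters=500):
    """Entropy-regularised OT plan between two uniform distributions.

    cost: (n, m) matrix of pairwise transport costs.
    Returns an (n, m) plan whose rows sum to ~1/n and columns to 1/m.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)   # uniform mass on source points (assumption)
    b = np.full(m, 1.0 / m)   # uniform mass on target points (assumption)
    K = np.exp(-cost / reg)   # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)       # rescale rows to match a
        v = b / (K.T @ u)     # rescale columns to match b
    return u[:, None] * K * v[None, :]


def align_to_target(source, target, reg=0.2, n_iters=500):
    """Map each source embedding into the target space via the OT plan."""
    diff = source[:, None, :] - target[None, :, :]
    cost = np.sum(diff ** 2, axis=-1)   # squared Euclidean cost
    cost = cost / cost.max()            # normalise for numerical stability
    plan = sinkhorn_plan(cost, reg, n_iters)
    # Barycentric projection: each source point becomes a weighted
    # average of target points, weighted by its row of the plan.
    weights = plan / plan.sum(axis=1, keepdims=True)
    return weights @ target, plan


# Toy data standing in for sequence and structure embeddings (hypothetical).
rng = np.random.default_rng(0)
seq_emb = rng.normal(size=(6, 4))
struct_emb = rng.normal(loc=2.0, size=(5, 4))
aligned, plan = align_to_target(seq_emb, struct_emb)
```

Here `aligned` holds one vector per row of `seq_emb`, expressed as a convex combination of rows of `struct_emb`, so the two sets live in the same region of the space. A smaller `reg` gives a sharper correspondence but needs more iterations and is less numerically stable.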
