Multimodal Mixture-of-Experts with Retrieval Augmentation for Protein Active Site Identification
Jiayang Wu, Jiale Zhou, Rubo Wang, Xingyi Zhang, Xun Lin, Tianxu Lv, Leong Hou U, Yefeng Zheng

TL;DR
This paper introduces MERA, a retrieval-augmented multimodal framework with reliability-aware fusion for protein active site identification, achieving state-of-the-art results by effectively integrating multiple data modalities and estimating their trustworthiness.
Contribution
MERA is the first retrieval-augmented, multimodal mixture-of-experts framework with a reliability-aware fusion strategy for protein active site prediction, addressing data sparsity and modality reliability issues.
Findings
Achieves 90% AUPRC on active site prediction
Significantly improves peptide-binding site identification
Validates the effectiveness of retrieval-augmented multi-expert modeling
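For readers unfamiliar with the metric, AUPRC (area under the precision-recall curve) is commonly estimated as average precision over residues ranked by predicted score. The sketch below is illustrative only and is not the paper's evaluation code; the function name `average_precision` and the toy labels and scores are assumptions, and ties between scores are broken by input order.

```python
import numpy as np

def average_precision(labels, scores):
    """Estimate AUPRC as the mean precision at the rank of each positive.

    labels: 1 for an active-site residue, 0 otherwise (hypothetical toy data).
    scores: predicted probability for each residue.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    if labels.sum() == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank residues from highest to lowest predicted score.
    order = np.argsort(-scores, kind="stable")
    ranked = labels[order]
    # Precision at rank k = (positives in the top k) / k.
    hits = np.cumsum(ranked)
    ranks = np.arange(1, len(ranked) + 1)
    precision_at_k = hits / ranks
    # Average the precision values taken at each positive residue.
    return float(precision_at_k[ranked == 1].mean())

# Toy example: four residues, two of them true active sites.
ap = average_precision([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because active-site residues are a small fraction of any chain, a precision-recall summary like this is generally more informative than accuracy for this kind of task.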
Abstract
Accurate identification of protein active sites at the residue level is crucial for understanding protein function and advancing drug discovery. However, current methods face two critical challenges: vulnerability in single-instance prediction due to sparse training data, and inadequate modality reliability estimation that leads to performance degradation when unreliable modalities dominate fusion processes. To address these challenges, we introduce Multimodal Mixture-of-Experts with Retrieval Augmentation (MERA), the first retrieval-augmented framework for protein active site identification. MERA employs hierarchical multi-expert retrieval that dynamically aggregates contextual information from chain, sequence, and active-site perspectives through residue-level mixture-of-experts gating. To prevent modality degradation, we propose a reliability-aware fusion strategy based on…
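The residue-level mixture-of-experts gating mentioned in the abstract can be pictured with a minimal sketch. It assumes three experts (chain, sequence, active-site) that each return a per-residue context vector, and a linear gate followed by a softmax; the function names, shapes, and random inputs are hypothetical and are not taken from the paper. The reliability-aware fusion step is not sketched because the abstract is truncated before describing it.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def residue_level_gating(residue_feats, expert_contexts, gate_weights):
    """Mix expert outputs with a separate gate distribution per residue.

    residue_feats:   (L, d)    query-protein residue embeddings
    expert_contexts: (E, L, d) context retrieved by each expert (assumed shape)
    gate_weights:    (d, E)    parameters of a hypothetical linear gate
    """
    logits = residue_feats @ gate_weights   # (L, E)
    gates = softmax(logits, axis=-1)        # each row sums to 1
    # Weighted sum over experts, computed independently for every residue.
    mixed = np.einsum("le,eld->ld", gates, expert_contexts)
    return mixed, gates

rng = np.random.default_rng(0)
L, d, E = 5, 8, 3  # residues, feature size, experts (chain/sequence/active-site)
residue_feats = rng.normal(size=(L, d))
expert_contexts = rng.normal(size=(E, L, d))
gate_weights = rng.normal(size=(d, E))
mixed, gates = residue_level_gating(residue_feats, expert_contexts, gate_weights)
```

Gating per residue rather than per protein lets different positions in the same chain lean on different retrieval perspectives, which matches the abstract's description of dynamic aggregation.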
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Bioinformatics · Vaccines and Immunoinformatics Approaches
