BioBridge: Bridging Biomedical Foundation Models via Knowledge Graphs
Zifeng Wang, Zichen Wang, Balasubramaniam Srinivasan, Vassilis N. Ioannidis, Huzefa Rangwala, Rishita Anubhai

TL;DR
BioBridge is a novel framework that uses knowledge graphs to connect unimodal biomedical foundation models, enabling effective multimodal tasks without retraining the original models.
Contribution
It introduces a parameter-efficient method to bridge unimodal models using knowledge graphs, facilitating multimodal biomedical applications without fine-tuning base models.
Findings
Outperforms the best baseline KG embedding methods by around 76.3% on average in cross-modal retrieval.
Demonstrates strong out-of-domain generalization to unseen modalities.
Enhances biomedical multimodal question answering and drug generation.
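As a rough illustration of the cross-modal retrieval setting named in the findings, the sketch below ranks candidates from one modality against a query that has already been mapped into their embedding space. The function names (`cosine`, `retrieve`), the toy vectors, and the choice of cosine similarity are assumptions made for illustration; this summary does not state the paper's actual scoring function or evaluation metric.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query, candidates, k=2):
    """Return the names of the k candidates most similar to the query."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine(query, candidates[name]),
        reverse=True,
    )
    return ranked[:k]

# Toy data (hypothetical): a protein embedding already mapped into drug space.
query = [0.9, 0.1, 0.0]
drug_embeddings = {
    "drug_a": [1.0, 0.0, 0.0],
    "drug_b": [0.0, 1.0, 0.0],
    "drug_c": [0.7, 0.7, 0.0],
}

top = retrieve(query, drug_embeddings, k=2)
print(top)
```

With these toy vectors, `drug_a` ranks first and `drug_c` second, since both point mostly along the query's dominant axis.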
Abstract
Foundation models (FMs) are able to leverage large volumes of unlabeled data to demonstrate superior performance across a wide range of tasks. However, FMs developed for biomedical domains have largely remained unimodal, i.e., independently trained and used for tasks on protein sequences alone, small molecule structures alone, or clinical data alone. To overcome this limitation of biomedical FMs, we present BioBridge, a novel parameter-efficient learning framework, to bridge independently trained unimodal FMs to establish multimodal behavior. BioBridge achieves it by utilizing Knowledge Graphs (KG) to learn transformations between one unimodal FM and another without fine-tuning any underlying unimodal FMs. Our empirical results demonstrate that BioBridge can beat the best baseline KG embedding methods (on average by around 76.3%) in cross-modal retrieval tasks. We also identify…
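The bridging idea in the abstract can be pictured with a minimal sketch: the unimodal embeddings stay frozen, and only a small transformation is scored (and, in training, updated) so that the head entity of a KG triple lands near its tail entity. Everything concrete here is an assumption for illustration: the linear map `W`, the InfoNCE-style contrastive loss with in-batch negatives, the temperature value, and the toy 2-d embeddings. The reviews confirm that contrastive learning is used, but this summary does not describe the actual bridge architecture or loss.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def normalize(v):
    """Scale v to unit length."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

def info_nce(heads, tails, W, temperature=0.1):
    """Mean InfoNCE loss where head i should match tail i.

    heads and tails are frozen embeddings; only W would be trained.
    """
    projected = [normalize(matvec(W, h)) for h in heads]
    targets = [normalize(t) for t in tails]
    total = 0.0
    for i, p in enumerate(projected):
        logits = [sum(a * b for a, b in zip(p, t)) / temperature for t in targets]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(projected)

# Toy frozen embeddings (hypothetical): KG triples pair head i with tail i.
heads = [[1.0, 0.0], [0.0, 1.0]]
tails = [[0.0, 1.0], [1.0, 0.0]]

identity = [[1.0, 0.0], [0.0, 1.0]]  # leaves heads where they are
swap = [[0.0, 1.0], [1.0, 0.0]]      # maps each head exactly onto its tail

loss_unaligned = info_nce(heads, tails, identity)
loss_aligned = info_nce(heads, tails, swap)
print(loss_unaligned > loss_aligned)
```

The loss drops sharply when the transformation maps each head onto its paired tail. In a real setup `W` would be learned by gradient descent while the encoders producing `heads` and `tails` remain untouched, which is where the parameter efficiency comes from.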
Peer Reviews
Decision: ICLR 2024 poster
- The problem of aligning large unimodal models efficiently is very relevant in general, and even more so when the modalities are proteins, drugs, and diseases (the focus of this submission), as it opens up a plethora of clinical applications.
- The problem is well motivated in the introduction and contextualised. The paper is clearly written and easy to follow except for one section (see below).
- The idea of aligning the different embedding space of unimodal pretrained models with a cross modal transformati

- Presentation of related work: Currently the submission has only one paragraph on knowledge graph learning and barely describes the embedding alignment literature. For example, in the context of cross-modal retrieval (one of the applications of the proposed method), one could also mention the deep CCA literature, as canonical correlation analysis (CCA) is the core of many cross-modal retrieval methods.
- Presentation of methodology: The submission should motivate the solution somewhat intuitively. Section 3.2

- This paper proposes a novel concept for learning across modalities via the bridging of knowledge graphs.
- The authors have conducted extensive experiments on various types of entity mapping and numerous approaches to tail entity prediction.
- With only the bridge module requiring updates during training, and all base feature models (FMs) remaining fixed, the proposed method is computationally efficient.
- Overall, the paper is well-written, with clear explanations of the methodology and empir

- The term "multimodal" as mentioned in this paper is confined to different types of biomedical entities. While the authors compare their work with "ImageBind," the experimental section lacks tasks that bridge text and image modalities, which are more complex and crucial for multimodal foundation models.
- The learning process is guided by knowledge graphs, limiting the scope of "modalities" to those represented within biomedical knowledge graphs. Therefore, instead of the broad term "biomedical

1. The idea of bridging several unimodal FMs is novel.
2. The paper is well written and easy to follow.
3. Compared to existing studies, BioBridge keeps all unimodal FMs fixed and thus is parameter efficient.

1. The reason for using contrastive learning is not clear. It would be better to provide further explanation and experimental support.
2. The baselines in Section 4.1 are several years old. No recent studies are included.
Taxonomy
Topics: Vaccines and immunoinformatics approaches · Machine Learning in Bioinformatics · Computational Drug Discovery Methods
