PhenoLIP: Integrating Phenotype Ontology Knowledge into Medical Vision-Language Pretraining
Cheng Liang, Chaoyi Wu, Weike Zhao, Ya Zhang, Yanfeng Wang, Weidi Xie

TL;DR
This paper introduces PhenoLIP, a novel framework that integrates structured phenotype ontology knowledge into medical vision-language models, significantly improving phenotype recognition and retrieval accuracy in medical imaging tasks.
Contribution
The paper constructs PhenoKG, a large-scale phenotype knowledge graph, and develops PhenoLIP, a two-stage pretraining method that incorporates structured phenotype knowledge into medical VLMs.
Findings
PhenoLIP improves phenotype classification accuracy by 8.85%.
PhenoLIP enhances cross-modal retrieval performance by 15.03%.
The approach demonstrates the benefit of integrating phenotype priors into medical VLMs.
Abstract
Recent progress in large-scale CLIP-like vision-language models (VLMs) has greatly advanced medical image analysis. However, most existing medical VLMs still rely on coarse image-text contrastive objectives and fail to capture the systematic visual knowledge encoded in well-defined medical phenotype ontologies. To address this gap, we construct PhenoKG, the first large-scale, phenotype-centric multimodal knowledge graph that encompasses over 520K high-quality image-text pairs linked to more than 3,000 phenotypes. Building upon PhenoKG, we propose PhenoLIP, a novel pretraining framework that explicitly incorporates structured phenotype knowledge into medical VLMs through a two-stage process. We first learn a knowledge-enhanced phenotype embedding space from textual ontology data and then distill this structured knowledge into multimodal pretraining via a teacher-guided knowledge…
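The abstract is cut off before it states the exact training objective, so the following is only a generic sketch of how a two-stage scheme of this kind could be wired up, not the authors' implementation. It assumes a frozen stage-one "teacher" that supplies one phenotype text embedding per sample, and it combines a standard CLIP-style symmetric contrastive loss with a cosine-distance distillation term. All function names, the distillation form, and the `distill_weight` and `temperature` values are hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def log_softmax(logits, axis=-1):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def clip_contrastive_loss(img, txt, temperature=0.07):
    # Symmetric image-text contrastive loss; row i of img pairs with row i of txt.
    img, txt = l2_normalize(img), l2_normalize(txt)
    logits = img @ txt.T / temperature
    idx = np.arange(len(img))
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)

def teacher_distill_loss(student_txt, teacher_txt):
    # Assumed distillation term: mean cosine distance between the student's
    # text embeddings and the frozen teacher's phenotype embeddings.
    s, t = l2_normalize(student_txt), l2_normalize(teacher_txt)
    return (1.0 - (s * t).sum(axis=1)).mean()

def total_loss(img, txt, teacher_txt, distill_weight=1.0, temperature=0.07):
    # Hypothetical combined objective: contrastive alignment plus teacher guidance.
    contrastive = clip_contrastive_loss(img, txt, temperature)
    distill = teacher_distill_loss(txt, teacher_txt)
    return contrastive + distill_weight * distill
```

In this sketch the teacher embeddings are treated as fixed inputs, which mirrors the idea of learning the phenotype embedding space first and only then using it to guide multimodal pretraining. The relative weighting of the two terms is an open design choice here.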
Taxonomy
Topics: Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies · Domain Adaptation and Few-Shot Learning
