PhenoLIP: Integrating Phenotype Ontology Knowledge into Medical Vision-Language Pretraining

Cheng Liang; Chaoyi Wu; Weike Zhao; Ya Zhang; Yanfeng Wang; Weidi Xie

arXiv:2602.06184·cs.CV·February 9, 2026

Open Access

TL;DR

This paper introduces PhenoLIP, a framework that integrates structured phenotype ontology knowledge into medical vision-language models, with reported gains of 8.85% in phenotype classification accuracy and 15.03% in cross-modal retrieval.

Contribution

It constructs PhenoKG, a large-scale phenotype knowledge graph, and develops PhenoLIP, a two-stage pretraining method that incorporates structured phenotype knowledge into medical VLMs.

Findings

01. PhenoLIP improves phenotype classification accuracy by 8.85%.

02. PhenoLIP enhances cross-modal retrieval performance by 15.03%.

03. The approach demonstrates the benefit of integrating phenotype priors into medical VLMs.

Abstract

Recent progress in large-scale CLIP-like vision-language models (VLMs) has greatly advanced medical image analysis. However, most existing medical VLMs still rely on coarse image-text contrastive objectives and fail to capture the systematic visual knowledge encoded in well-defined medical phenotype ontologies. To address this gap, we construct PhenoKG, the first large-scale, phenotype-centric multimodal knowledge graph that encompasses over 520K high-quality image-text pairs linked to more than 3,000 phenotypes. Building upon PhenoKG, we propose PhenoLIP, a novel pretraining framework that explicitly incorporates structured phenotype knowledge into medical VLMs through a two-stage process. We first learn a knowledge-enhanced phenotype embedding space from textual ontology data and then distill this structured knowledge into multimodal pretraining via a teacher-guided knowledge…
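
The abstract is truncated before it states the training objective, so the following is only an illustrative sketch of how a teacher-guided term might be combined with a CLIP-style contrastive loss. Everything here is an assumption rather than the paper's method: the function names (`contrastive_loss`, `distillation_loss`, `total_loss`), the choice of a KL divergence over pairwise similarity distributions, the temperature of 0.07, and the `weight` parameter are all hypothetical.

```python
import math

def normalize(v):
    # Assumes v is not the zero vector.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine_matrix(a, b):
    # Pairwise cosine similarities between two lists of vectors.
    a = [normalize(v) for v in a]
    b = [normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]

def softmax(row, temperature):
    scaled = [x / temperature for x in row]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE: the i-th image and i-th text are the matched pair.
    sim = cosine_matrix(image_emb, text_emb)
    sim_t = [list(col) for col in zip(*sim)]
    n = len(sim)
    i2t = -sum(math.log(softmax(sim[i], temperature)[i]) for i in range(n)) / n
    t2i = -sum(math.log(softmax(sim_t[i], temperature)[i]) for i in range(n)) / n
    return 0.5 * (i2t + t2i)

def distillation_loss(student_text_emb, teacher_pheno_emb, temperature=0.07):
    # Hypothetical distillation term: KL(teacher || student) between the
    # row-wise similarity distributions of the two embedding spaces.
    s = cosine_matrix(student_text_emb, student_text_emb)
    t = cosine_matrix(teacher_pheno_emb, teacher_pheno_emb)
    total = 0.0
    for s_row, t_row in zip(s, t):
        p = softmax(t_row, temperature)
        q = softmax(s_row, temperature)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(s)

def total_loss(image_emb, text_emb, teacher_emb, weight=1.0):
    # Assumed combination: contrastive term plus a weighted distillation term.
    return (contrastive_loss(image_emb, text_emb)
            + weight * distillation_loss(text_emb, teacher_emb))
```

In this sketch the teacher embeddings stand in for the stage-one phenotype embedding space learned from ontology text, and the student is the text side of the VLM being pretrained. The distillation term is zero when the student reproduces the teacher's pairwise similarity structure and grows as the two diverge.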

Taxonomy

Topics: Multimodal Machine Learning Applications · Biomedical Text Mining and Ontologies · Domain Adaptation and Few-Shot Learning