VisionCLIP: An Med-AIGC based Ethical Language-Image Foundation Model for Generalizable Retina Image Analysis

Hao Wei; Bowen Liu; Minqing Zhang; Peilun Shi; Wu Yuan

arXiv:2403.10823 · cs.CV · March 19, 2024 · 1 citation

Open Access

TL;DR

VisionCLIP is an ethical, synthetic data-based foundation model for retina image analysis that achieves competitive zero-shot performance, addressing privacy concerns in medical AI training.

Contribution

This work introduces VisionCLIP, a novel medical foundation model trained on synthetic images and text, enabling privacy-preserving, generalizable retina analysis.

Findings

01. Achieves competitive zero-shot performance on external datasets.

02. Utilizes 1 million synthetic fundus images with descriptions.

03. Circumvents patient privacy issues in medical AI training.

Abstract

Generalist foundation models have ushered in newfound capabilities in the medical domain. However, the tension between the growing demand for high-quality annotated data and patient privacy continues to intensify. Medical artificial intelligence generated content (Med-AIGC), as an inexhaustible resource repository, arises as a potential solution to this challenge. Here we harness 1 million open-source synthetic fundus images paired with natural language descriptions to curate an ethical language-image foundation model for retina image analysis named VisionCLIP. In a zero-shot fashion, VisionCLIP achieves competitive performance on three external datasets compared with the existing method pre-trained on real-world data. The employment of artificially synthetic images alongside corresponding textual data for training enables the medical foundation…
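
For readers unfamiliar with the term, zero-shot evaluation of a CLIP-style language-image model generally means scoring an image embedding against text-prompt embeddings, one prompt per candidate class, and picking the closest match. The following is a minimal sketch of that general idea, not the authors' implementation: the function names, the prompt labels, the temperature value, and the three-dimensional toy embeddings are all hypothetical stand-ins for what real image and text encoders would produce.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_classify(image_embedding, prompt_embeddings, temperature=0.01):
    """Return (best_label, probabilities) from cosine similarity to each prompt.

    `temperature` is an assumed value for illustration, not one taken from the paper.
    """
    img = l2_normalize(image_embedding)
    logits = {}
    for label, emb in prompt_embeddings.items():
        txt = l2_normalize(emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits[label] = cosine / temperature

    # Numerically stable softmax over the candidate labels.
    max_logit = max(logits.values())
    exps = {label: math.exp(v - max_logit) for label, v in logits.items()}
    total = sum(exps.values())
    probs = {label: v / total for label, v in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical toy embeddings; a real model would output high-dimensional vectors.
prompts = {
    "normal fundus": [0.9, 0.1, 0.0],
    "diabetic retinopathy": [0.1, 0.9, 0.2],
    "glaucoma": [0.0, 0.2, 0.9],
}
image = [0.2, 0.8, 0.1]

label, probs = zero_shot_classify(image, prompts)
print(label)
```

Because no classifier head is fitted on the target dataset, the same encoders can in principle be evaluated on any label set that can be written as text prompts, which is what makes comparison on external datasets possible without fine-tuning.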

Taxonomy

Topics: Retinal Imaging and Analysis