TL;DR
MIRAGE is a multimodal system that retrieves and generates medical images and text, supporting interactive learning for medical students by mapping text and images to a shared semantic space using publicly available models.
Contribution
It introduces a novel multimodal retrieval and generation system for medical education based on fine-tuned CLIP and diffusion models, accessible via Kaggle.
Findings
Enables retrieval of clinically relevant images from trustworthy sources.
Allows generation of synthetic medical images through prompts.
Provides enriched descriptions and visual comparisons of medical conditions.
Abstract
Access to diverse, well-annotated medical images with interactive learning tools is fundamental for training practitioners in medicine and related fields to improve their diagnostic skills and understanding of anatomical structures. While medical atlases are valuable, they are often impractical due to their size and lack of interactivity, whereas online image searches may return mislabeled or incomplete material. To address this, we propose MIRAGE, a multimodal medical text and image retrieval and generation system that allows users to find and generate clinically relevant images from trustworthy sources by mapping both text and images to a shared latent space, enabling semantically meaningful queries. The system is based on a fine-tuned medical version of CLIP (MedICaT-ROCO), trained on the ROCO dataset, which is drawn from PubMed Central. MIRAGE allows users to give prompts to retrieve…
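The retrieval idea in the abstract, ranking images by how close they sit to a text query in a shared latent space, can be illustrated with a minimal sketch. This is not the MIRAGE implementation: the three-dimensional vectors below are hand-made stand-ins for the embeddings a CLIP-style encoder would produce, and the file names and the `cosine` and `retrieve` helpers are hypothetical names chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, index, k=2):
    """Return the names of the k indexed items most similar to query_vec."""
    scored = [(cosine(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy image embeddings (assumed values, not real model outputs).
IMAGE_INDEX = {
    "chest_xray_pneumonia.png": [0.9, 0.1, 0.0],
    "brain_mri_glioma.png": [0.0, 0.2, 0.95],
    "chest_ct_nodule.png": [0.7, 0.6, 0.1],
}

# Stand-in for the text embedding of a prompt such as
# "chest x-ray showing pneumonia" (assumed values).
query = [1.0, 0.2, 0.0]

top = retrieve(query, IMAGE_INDEX, k=2)
print(top)
```

In a real system the vectors would come from the text and image encoders of the fine-tuned model, and the linear scan would typically be replaced by an approximate nearest-neighbor index once the collection grows large.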
