Med-Banana-50K: A Cross-modality Large-Scale Dataset for Text-guided Medical Image Editing

Zhihui Chen; Mengling Feng

arXiv:2511.00801 · cs.CV · November 10, 2025

TL;DR

Med-Banana-50K is a large-scale dataset of over 50,000 medically curated image edits spanning chest X-ray, brain MRI, and fundus photography, designed to advance medical image editing research under strict anatomical and clinical constraints, with medically grounded quality control.

Contribution

The paper introduces Med-Banana-50K, a dataset paired with an LLM-as-Judge quality control protocol and evaluation logs for its editing attempts, intended to support reliable medical image editing research.

Findings

01. Over 50,000 curated medical image edits across multiple modalities.

02. Inclusion of 37,000 failed editing attempts with evaluation logs.

03. A new LLM-based quality control framework for medical image editing.
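
The findings above describe a curation process in which edits are judged, accepted edits enter the dataset, and failed attempts are kept together with their evaluation logs. The following is a minimal Python sketch of that bookkeeping, not the authors' pipeline. Every name in it (`Attempt`, `CurationResult`, `curate`, `fake_edit`, `fake_judge`) is hypothetical, the retry loop and `max_tries` are assumptions, and the editor and judge are toy stand-ins rather than calls to any real model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    """One editing attempt plus the judge's verdict (hypothetical schema)."""
    image_id: str
    instruction: str
    passed: bool
    judge_log: str


@dataclass
class CurationResult:
    accepted: List[Attempt] = field(default_factory=list)
    failed: List[Attempt] = field(default_factory=list)


def curate(
    requests: List[Tuple[str, str]],
    edit_fn: Callable[[str, str, int], str],
    judge_fn: Callable[[str, str], Tuple[bool, str]],
    max_tries: int = 3,  # assumption: the retry budget is not taken from the paper
) -> CurationResult:
    """Judge each edit; keep passing edits and log every failed attempt."""
    result = CurationResult()
    for image_id, instruction in requests:
        for attempt_no in range(max_tries):
            edited = edit_fn(image_id, instruction, attempt_no)
            passed, log = judge_fn(edited, instruction)
            record = Attempt(image_id, instruction, passed, log)
            if passed:
                result.accepted.append(record)
                break  # stop retrying once the judge accepts
            result.failed.append(record)  # failed attempts are retained, not dropped
    return result


def fake_edit(image_id: str, instruction: str, attempt_no: int) -> str:
    """Toy stand-in for an image editor: returns a label, not an image."""
    return f"{image_id}-edit{attempt_no}"


def fake_judge(edited: str, instruction: str) -> Tuple[bool, str]:
    """Toy stand-in for an LLM judge: accepts only the second attempt."""
    ok = edited.endswith("edit1")
    return ok, ("accepted" if ok else "rejected: toy rule")


demo = curate(
    [("cxr_001", "add lesion"), ("mri_002", "remove lesion")],
    fake_edit,
    fake_judge,
)
print(len(demo.accepted), len(demo.failed))
```

Keeping rejected attempts in a separate list with the judge's log, instead of discarding them, mirrors the second finding: the failures themselves become usable data.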

Abstract

Medical image editing has emerged as a pivotal technology with broad applications in data augmentation, model interpretability, medical education, and treatment simulation. However, the lack of large-scale, high-quality, and openly accessible datasets tailored for medical contexts with strict anatomical and clinical constraints has significantly hindered progress in this domain. To bridge this gap, we introduce Med-Banana-50K, a comprehensive dataset of over 50k medically curated image edits spanning chest X-ray, brain MRI, and fundus photography across 23 diseases. Each sample supports bidirectional lesion editing (addition and removal) and is constructed using Gemini-2.5-Flash-Image based on real clinical images. A key differentiator of our dataset is the medically grounded quality control protocol: we employ an LLM-as-Judge evaluation framework with criteria such as instruction…

Code & Models

Datasets

RichardChenZH/Med-Banana-50K
dataset · 76k downloads

Taxonomy

Topics: Cell Image Analysis Techniques · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis