MM-Skin: Enhancing Dermatology Vision-Language Model with an Image-Text Dataset Derived from Textbooks
Wenqi Zeng, Yuqi Sun, Chenxi Ma, Weimin Tan, Bo Yan

TL;DR
This paper introduces MM-Skin, a large-scale dermatology dataset of diverse image-text pairs, and SkinVL, a specialized vision-language model that significantly improves skin disease diagnosis and interpretation in clinical settings.
Contribution
The paper presents the first extensive dermatology multimodal dataset and a tailored vision-language model, advancing clinical dermatology AI capabilities.
Findings
SkinVL outperforms existing models on multiple dermatology tasks.
MM-Skin dataset enables more accurate and nuanced skin disease analysis.
SkinVL demonstrates strong zero-shot and fine-tuned performance across 8 datasets.
Abstract
Medical vision-language models (VLMs) have shown promise as clinical assistants across various medical fields. However, specialized dermatology VLMs capable of delivering professional and detailed diagnostic analysis remain underdeveloped, primarily due to the less specialized text descriptions in current dermatology multimodal datasets. To address this issue, we propose MM-Skin, the first large-scale multimodal dermatology dataset, which encompasses 3 imaging modalities (clinical, dermoscopic, and pathological) and nearly 10k high-quality image-text pairs collected from professional textbooks. In addition, we generate over 27k diverse, instruction-following visual question answering (VQA) samples (9 times the size of the current largest dermatology VQA dataset). Leveraging public datasets and MM-Skin, we developed SkinVL, a dermatology-specific VLM designed for precise and nuanced skin…
Taxonomy
Topics: Cutaneous Melanoma Detection and Management · Multimodal Machine Learning Applications · AI in cancer detection
