Fine-Tuning a Small Vision Language Model Using Synthetic Data for Explaining Bacterial Skin Disease Images

Shiwan Zhang; Abdurrahim Yilmaz; Gulsum Gencoglan; Burak Temelkuran

PMC · DOI: 10.3390/diagnostics16040603 · February 18, 2026

Open Access

TL;DR

This paper fine-tunes a small vision language model on synthetic data to explain bacterial skin disease images; the combined QA+caption setting reaches the highest classification accuracy, 70.20%.

Contribution

The study introduces a method for fine-tuning a compact VLM, SmolVLM, with synthetic question–answer (QA) supervision for dermatology image analysis.

Findings

01. QA-only supervision results in the best report-generation performance.

02. The combined QA+caption strategy achieves the highest classification accuracy of 70.20%.

03. Synthetic data effectively enhances compact VLMs for medical image understanding.

Abstract

Background/Objectives: Vision language models (VLMs) show strong potential for medical image understanding, but their large scale often limits practical deployment. This study investigates whether a compact VLM can be effectively adapted for dermatology, with a focus on explaining bacterial skin disease images.

Methods: We curate a dataset derived from PMC-OA using the BIOMEDICA dataset and construct PMC-derma-VQA-bacteria by pairing images with inherited figure captions and synthetically generated question–answer (QA) supervision produced by Google’s Gemini model. SmolVLM is fine-tuned under three supervision settings: QA-only, caption-only, and a combined QA+caption strategy. The models are evaluated on a held-out test set for both text-generation quality and diagnostic classification performance.

Results: QA-only supervision yields the best report-generation performance, while the…
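The three supervision settings named in the abstract can be pictured as three ways of turning the same image records into training examples. The sketch below is illustrative only and is not the authors' pipeline: the `Record` fields, the `CAPTION_PROMPT` string, the `build_examples` helper, and the choice to form the combined setting as a plain union of QA and caption examples are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical record layout: one image, its inherited figure caption,
# and the synthetic question-answer pairs generated for it.
@dataclass
class Record:
    image_path: str
    caption: str
    qa_pairs: List[Tuple[str, str]]

# Hypothetical instruction used when the caption is the training target.
CAPTION_PROMPT = "Describe this image."

VALID_SETTINGS = ("qa", "caption", "qa+caption")


def build_examples(records: List[Record], setting: str) -> List[Dict[str, str]]:
    """Turn records into (image, prompt, target) examples for one setting.

    Assumption: the combined setting is the union of the QA examples and
    the caption examples; the source does not say how they are mixed.
    """
    if setting not in VALID_SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")

    examples: List[Dict[str, str]] = []
    for rec in records:
        if setting in ("qa", "qa+caption"):
            # One example per synthetic question-answer pair.
            for question, answer in rec.qa_pairs:
                examples.append(
                    {"image": rec.image_path, "prompt": question, "target": answer}
                )
        if setting in ("caption", "qa+caption"):
            # One example per image with the caption as the target text.
            examples.append(
                {"image": rec.image_path, "prompt": CAPTION_PROMPT, "target": rec.caption}
            )
    return examples


if __name__ == "__main__":
    demo = [
        Record(
            image_path="fig1.jpg",
            caption="Clinical photograph of a skin lesion.",
            qa_pairs=[
                ("What is shown in the image?", "A skin lesion."),
                ("Where is the lesion located?", "On the forearm."),
            ],
        )
    ]
    for setting in VALID_SETTINGS:
        print(setting, len(build_examples(demo, setting)))
```

Keeping the setting as a single argument makes it straightforward to train one model per setting on otherwise identical data, which is what a comparison across the three strategies requires.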

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Cell Image Analysis Techniques