# Minimal data requirement for realistic endoscopic image generation with Stable Diffusion

**Authors:** Joanna Kaleta, Diego Dall’Alba, Szymon Płotka, Przemysław Korzeniowski

PMC · DOI: 10.1007/s11548-023-03030-w · International Journal of Computer Assisted Radiology and Surgery · 2023-11-07

## TL;DR

This paper introduces a method to generate realistic endoscopic images from synthetic data using a Stable Diffusion model, requiring minimal real-world data.

## Contribution

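The approach performs image-to-image translation with Stable Diffusion and supporting control networks, turning synthetic images into realistic endoscopic images. It needs far less real data than previous work and gives finer control over generated details.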

## Key findings

- The method achieves a mean Intersection over Union (mIoU) of 69.76% on laparoscopic cholecystectomy data.
- This is 27.55 percentage points above the baseline (69.76% vs. 42.21%).
- The approach reduces the domain gap between synthetic and real endoscopic images.
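
The mean Intersection over Union figures above can be made concrete with a small sketch. It uses one common definition: per-class IoU is intersection divided by union, averaged over the classes that appear in either mask. This summary does not state the paper's exact evaluation protocol (for example, whether averaging is per image or per dataset, or which classes are ignored), so the function name `mean_iou`, the class labels, and the toy masks are illustrative assumptions, not the authors' code.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over the classes present in pred or target (flattened masks)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of pixels")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    # Count per-class pixels and per-class agreements in a single pass.
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both masks
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


# Toy 2x3 masks flattened row by row.
# Hypothetical labels: 0 = background, 1 = organ, 2 = instrument.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.2%}")
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. Skipping classes that are absent from both masks avoids a division by zero; other conventions (counting such classes as 1.0, or accumulating counts over a whole dataset before dividing) give different numbers.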

## Abstract

Computer-assisted surgical systems provide support information to the surgeon, which can improve the execution and overall outcome of the procedure. These systems are based on deep learning models that are trained on complex and challenging-to-annotate data. Generating synthetic data can overcome these limitations, but it is necessary to reduce the domain gap between real and synthetic data.

We propose a method for image-to-image translation based on a Stable Diffusion model, which generates realistic images starting from synthetic data. Compared to previous works, the proposed method is better suited for clinical application as it requires a much smaller amount of input data and allows finer control over the generation of details by introducing different variants of supporting control networks.

The proposed method is applied in the context of laparoscopic cholecystectomy, using synthetic and real data from public datasets. It achieves a mean Intersection over Union of 69.76%, significantly improving the baseline results (69.76% vs. 42.21%).

The proposed method for translating synthetic images into images with realistic characteristics will enable the training of deep learning methods that can generalize optimally to real-world contexts, thereby improving computer-assisted intervention guidance systems.

## Full-text entities

- **Diseases:** LC (MESH:D017562), SD (MESH:D060050), DL (MESH:D007859), DM (MESH:D009223), BDI (MESH:D001649)
- **Chemicals:** GAN (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10881618/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10881618/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/PMC10881618/full.md

---
Source: https://tomesphere.com/paper/PMC10881618