# Diffusion-driven distillation and contrastive learning for class-incremental semantic segmentation of laparoscopic images

**Authors:** Xinkai Zhao, Yuichiro Hayashi, Masahiro Oda, Takayuki Kitasaka, Kensaku Mori

PMC · DOI: 10.1007/s11548-025-03405-1 · International Journal of Computer Assisted Radiology and Surgery · 2025-06-14

## TL;DR

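This paper introduces a class-incremental semantic segmentation method for laparoscopic images. It combines synthetic images from a diffusion model, knowledge distillation, and cross-image contrastive learning so that the model can learn new anatomical structures over time without reusing earlier training images.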

## Contribution

The paper presents the first class-incremental semantic segmentation approach for laparoscopic images using diffusion-driven distillation and contrastive learning.

## Key findings

- On the Dresden Surgical Anatomy Dataset, the method outperforms the compared approaches, especially on difficult classes such as the ureter and vesicular glands, where it even surpasses supervised offline learning.
- Cross-image contrastive learning enhances the model's ability to distinguish subtle variations in laparoscopic images.
- The approach adapts well to new anatomical classes without reusing previous training data.
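To make the cross-image contrastive idea in the findings above concrete, here is a minimal sketch of a supervised InfoNCE-style loss over pixel embeddings pooled from several images. It is not the authors' implementation: the function name `cross_image_contrastive_loss`, the `temperature` value, and the use of cosine similarity are all assumptions chosen for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def cross_image_contrastive_loss(embeddings, labels, temperature=0.1):
    """Hypothetical supervised InfoNCE loss (illustrative assumption).

    embeddings: pixel (or region) feature vectors pooled from SEVERAL images
    labels:     class label for each embedding
    Positives for an anchor are all other embeddings with the same class,
    regardless of which image they came from.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n) if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor embeddings.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Two embeddings from image A followed by two from image B.
features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
aligned = cross_image_contrastive_loss(
    features, ["ureter", "colon", "ureter", "colon"])
mismatched = cross_image_contrastive_loss(
    features, ["ureter", "colon", "colon", "ureter"])
```

In this toy setup the loss is small when same-class embeddings from different images point in similar directions (`aligned`) and large when they do not (`mismatched`), which is the behaviour such a loss is meant to encourage.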

## Abstract

Understanding anatomical structures in laparoscopic images is crucial for various types of laparoscopic surgery. However, creating a specialized dataset for each type of surgery is both inefficient and challenging. This highlights the clinical significance of exploring class-incremental semantic segmentation (CISS) for laparoscopic images. Although CISS has been widely studied on diverse image datasets, incremental data in clinical settings typically consists of images from new patients rather than previously used images, which calls for a novel algorithm.

We introduce a distillation approach driven by a diffusion model for CISS of laparoscopic images. Specifically, an unconditional diffusion model is trained to generate synthetic laparoscopic images, which are then incorporated into subsequent training steps. A distillation network is employed to extract and transfer knowledge from networks trained in earlier steps. Additionally, to address the challenge posed by the limited semantic information available in individual laparoscopic images, we employ cross-image contrastive learning, enhancing the model’s ability to distinguish subtle variations across images.
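The distillation step described above can be illustrated with a short sketch. It assumes that the network from the earlier step (the teacher) and the current network (the student) both produce per-pixel logits on a synthetic image, and that the student is penalized with a temperature-softened KL divergence restricted to the old classes. The function names, the `temperature` value, and the choice to simply truncate the student's logits to the old classes are assumptions for illustration; the paper's exact loss may differ.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Hypothetical per-pixel KL(teacher || student) over old classes.

    teacher_logits: list of pixels, each a list of C_old logits
    student_logits: list of pixels, each a list of C_old + C_new logits
    Only the first C_old student channels are compared (an assumption).
    """
    total = 0.0
    for t_pix, s_pix in zip(teacher_logits, student_logits):
        n_old = len(t_pix)
        p = softmax(t_pix, temperature)
        q = softmax(s_pix[:n_old], temperature)
        total += sum(
            pi * math.log(pi / max(qi, 1e-12))
            for pi, qi in zip(p, q) if pi > 0
        )
    return total / len(teacher_logits)

# Logits for two pixels of a synthetic image; the teacher knows 2 classes,
# the student knows the same 2 plus 1 newly added class.
teacher = [[2.0, 0.5], [0.1, 1.5]]
student_consistent = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]]
student_drifted = [[0.5, 2.0, -1.0], [1.5, 0.1, 0.3]]

loss_consistent = distillation_loss(teacher, student_consistent)
loss_drifted = distillation_loss(teacher, student_drifted)
```

A student that still agrees with the teacher on the old classes incurs essentially no penalty, while one whose old-class predictions have drifted is penalized. Because the inputs are synthetic images, a term of this kind does not require access to images from earlier training steps.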

Our method was trained and evaluated on all 11 anatomical structures from the Dresden Surgical Anatomy Dataset, which presents significant challenges due to its dispersed annotations. Extensive experiments demonstrate that our approach outperforms other methods, especially in difficult categories such as the ureter and vesicular glands, where it surpasses even supervised offline learning.

This study is the first to address class-incremental semantic segmentation for laparoscopic images, significantly improving the adaptability of segmentation models to new anatomical classes in surgical procedures.

The online version contains supplementary material available at 10.1007/s11548-025-03405-1.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12226607/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12226607/full.md

## References

4 references — full list in the complete paper: https://tomesphere.com/paper/PMC12226607/full.md

---
Source: https://tomesphere.com/paper/PMC12226607