# Subclass-Aware Contrastive Semi-Supervised Learning for Inflammatory Bowel Disease Classification from Colonoscopy Images

**Authors:** Kechen Lin, Guangcong Ruan, Xiaoyang Zou, Yongjian Nian, Yanling Wei, Guoyan Zheng

PMC · DOI: 10.3390/bioengineering13010008 · Bioengineering · 2025-12-22

## TL;DR

This paper introduces SACSSL, a semi-supervised learning method for classifying inflammatory bowel disease from colonoscopy images. With only 20% of the data labeled, it reaches 93.2% accuracy on the in-house Daping dataset.

## Contribution

SACSSL integrates a subclass-aware contrastive module into a pseudo-labeling-based semi-supervised framework (e.g., FixMatch) to mitigate confirmation bias and improve IBD classification.

## Key findings

- SACSSL achieves 93.2% accuracy and 80.1% F1-score on the Daping dataset with only 20% labeled data, close to the fully supervised upper bound (94.0% accuracy, 80.8% F1-score).
- The method reaches 76.4% accuracy and 68.9% F1-score on the LIMUC dataset for UC severity grading.
- SACSSL achieves state-of-the-art performance on both datasets under the semi-supervised setting.

## Abstract

Inflammatory bowel disease (IBD) includes Crohn’s disease (CD) and ulcerative colitis (UC). The accurate classification of IBD from colonoscopy images is critical for diagnosis and treatment. However, the lack of labeled data poses a major challenge for developing deep learning-based IBD classification approaches. Recently, pseudo-labeling-based semi-supervised learning methods have offered a promising way to leverage both labeled and unlabeled data to improve classification performance. Nevertheless, due to significant intra-class variability and subtle inter-class differences in IBD colonoscopy images, pseudo-labels are often inaccurate, which results in confirmation bias and suboptimal performance. To address this challenge, a Subclass-Aware Contrastive Semi-Supervised Learning method, referred to as SACSSL, is proposed for accurate IBD classification by integrating a subclass-aware contrastive module into a pseudo-labeling-based semi-supervised framework, e.g., FixMatch. Specifically, unlabeled samples are first partitioned into confident and uncertain samples according to the confidence of their pseudo-labels. An instance-level contrastive loss is then applied to uncertain samples, aiming to mitigate confirmation bias. Furthermore, intra-class heterogeneity is captured by introducing a set of prototypes for each class and assigning confident samples to these prototypes to form fine-grained subclasses. A supervised contrastive loss is then applied to promote intra-subclass clustering, thereby enhancing inter-class separability while preserving intra-class diversity. Our method is evaluated on two datasets, i.e., an in-house collected Daping dataset for IBD classification and a publicly available LIMUC dataset for UC severity grading. On both datasets, our method achieves state-of-the-art performance under the semi-supervised setting.
Specifically, with only 20% labeled data, the proposed method reaches an overall accuracy of 93.2% and an F1-score of 80.1% on the Daping dataset, which is close to the fully supervised upper bound (94.0% accuracy and 80.8% F1-score), and it achieves an overall accuracy of 76.4% and an F1-score of 68.9% on the LIMUC dataset. Comprehensive experimental results demonstrate the effectiveness of our method for semi-supervised colonoscopy image classification.
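
The steps described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. The 0.95 confidence threshold, nearest-prototype assignment by cosine similarity, the temperature of 0.1, and all function names are assumptions made for this example, since the abstract does not specify them. The supervised contrastive loss uses the widely known SupCon form, with (class, subclass) pairs as labels. The instance-level contrastive loss for uncertain samples is omitted for brevity.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def partition_by_confidence(logits_batch, threshold=0.95):
    """Split unlabeled samples by pseudo-label confidence.

    Returns (confident, uncertain): confident holds (index, pseudo_label)
    pairs, uncertain holds bare indices. The threshold is an assumed value.
    """
    confident, uncertain = [], []
    for idx, logits in enumerate(logits_batch):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            confident.append((idx, probs.index(confidence)))
        else:
            uncertain.append(idx)
    return confident, uncertain


def normalize(vec):
    """L2-normalise a non-zero vector."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def assign_subclass(feature, pseudo_label, prototypes):
    """Pick the most similar prototype of the pseudo-labeled class.

    prototypes maps class -> list of prototype vectors. Cosine similarity
    is an assumption; the paper may assign samples differently.
    """
    f = normalize(feature)
    sims = [dot(f, normalize(p)) for p in prototypes[pseudo_label]]
    return sims.index(max(sims))


def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over L2-normalised features.

    Samples sharing a label (here: a (class, subclass) pair) are positives.
    Anchors without any positive are skipped.
    """
    z = [normalize(f) for f in features]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {j: dot(z[i], z[j]) / temperature for j in range(n) if j != i}
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

In a training loop built along these lines, `partition_by_confidence` would be called on the classifier outputs for each unlabeled batch, `assign_subclass` would turn each confident sample's pseudo-label into a (class, subclass) pair, and `supcon_loss` would then be computed on those pairs. Because positives are defined per subclass rather than per class, samples of the same class that look different are not forced into a single cluster.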

## Linked entities

- **Diseases:** Inflammatory bowel disease (MONDO:0005265), Crohn’s disease (MONDO:0005011), ulcerative colitis (MONDO:0005101)

## Full-text entities

- **Diseases:** CD (MESH:D003424), IBD (MESH:D015212), UC (MESH:D003093)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12838207/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12838207/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/PMC12838207/full.md

---
Source: https://tomesphere.com/paper/PMC12838207