# Development and validation of an artificial intelligence-assisted system for automatic Boston scoring of bowel cleanliness in colonoscopy (with video)

**Authors:** Jian Chen, Jingzhi Xu, Kaijian Xia, Qiuwen Hua, Xiaodan Xu, Ganhong Wang

PMC · DOI: 10.3389/fpubh.2025.1708325 · Frontiers in Public Health · 2026-01-02

## TL;DR

This paper introduces AutoBBPS, a YOLOv11-based AI system that scores bowel cleanliness on the Boston Bowel Preparation Scale (BBPS) in real time during colonoscopy, with the aim of improving scoring accuracy and supporting endoscopist training.

## Contribution

The authors develop and validate AutoBBPS, a YOLOv11-based system for real-time and cumulative BBPS scoring of colonoscopy videos, using image and video data from three centers.

## Key findings

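- YOLOv11m was the best of the four models, reaching 99.86% accuracy and a 99.75% F1 score on the validation set, and a weighted average precision of 95.37% with an AUC of 0.996 on the test set.
- In image-level comparisons, AutoBBPS outperformed junior endoscopists in recognition accuracy and was comparable to senior endoscopists.
- The system starts real-time cumulative BBPS scoring automatically once the cecum is reached, and its teaching assistant is aimed at standardizing training for junior endoscopists.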

## Abstract

Bowel cleanliness is a critical factor affecting the detection of adenomatous polyps and early tumors. The Boston Bowel Preparation Scale (BBPS), a widely used evaluation tool, has limitations, including interobserver variability and insufficient standardized training. This study aims to develop an artificial intelligence-driven automatic BBPS scoring and teaching system.

Colonoscopy image and video data were collected from three centers between June 2019 and August 2024 and categorized into six classes: BBPS scores 0, 1, 2, and 3, ileocecal region frames, and instrument operation frames. Transfer learning and fine-tuning were performed on four pre-trained YOLOv11 models. Performance metrics included accuracy, precision, sensitivity, and AUC. Grad-CAM was used to provide visual explanations of the best-performing model, which was further developed into a system capable of real-time and cumulative BBPS assessment for every video frame.
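
As an illustration of how the reported classification metrics relate to one another, the following minimal sketch derives accuracy, precision, sensitivity, specificity, and F1 from a multiclass confusion matrix using a one-vs-rest view of each class. It is not the authors' evaluation code: the function names and the `support` field are assumptions made for this example, and AUC is left out because it needs per-frame class probabilities rather than counts.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest metrics per class; cm[i][j] counts true class i predicted as j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class k frames predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp  # other frames predicted as class k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({
            "precision": precision,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
            "support": tp + fn,                    # number of true frames in class k
        })
    return results


def accuracy(cm: List[List[int]]) -> float:
    """Fraction of all frames on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


def weighted_average(metrics: List[Dict[str, float]], key: str) -> float:
    """Average a per-class metric, weighting each class by its support."""
    support = sum(m["support"] for m in metrics)
    return sum(m[key] * m["support"] for m in metrics) / support
```

For the six classes described above, `cm` would be a 6×6 matrix, and `weighted_average(per_class_metrics(cm), "precision")` would give a support-weighted precision of the kind reported for the test set.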

Among the four models, YOLOv11m performed the best, achieving an accuracy of 99.86%, precision of 99.74%, sensitivity of 99.74%, and an F1 score of 99.75% on the validation set. On the test set, the model attained a weighted average precision of 95.37%, specificity of 98.25%, and an AUC of 0.996. Based on this model, the AutoBBPS system was developed, which automatically initiates real-time cumulative BBPS scoring once the cecum is reached. In image-level human-machine comparison experiments, the system outperformed junior endoscopists in recognition accuracy and was comparable to senior endoscopists. Video-level human-machine comparison experiments further evaluated the accuracy of the AutoBBPS system against endoscopists under varying confidence thresholds.
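
The abstract says that cumulative scoring starts once the cecum is reached and that accuracy was evaluated under varying confidence thresholds, but it does not describe the aggregation rule. The sketch below shows one way such logic could be wired together, under loud assumptions: the label strings, the default threshold of 0.9, and the use of a running mean over accepted frames are all hypothetical choices for illustration, not details taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical label strings; the paper lists the six categories but not their names.
SCORE_LABELS = {"bbps0": 0, "bbps1": 1, "bbps2": 2, "bbps3": 3}
CECUM_LABEL = "ileocecal"


@dataclass
class CumulativeBBPS:
    """Running BBPS estimate over per-frame classifier outputs (illustrative only)."""
    threshold: float = 0.9                  # assumed confidence cut-off
    started: bool = False                   # becomes True once the cecum is seen
    counts: Dict[int, int] = field(default_factory=dict)

    def update(self, label: str, confidence: float) -> Optional[float]:
        """Consume one frame prediction and return the current running mean."""
        if confidence >= self.threshold:
            if label == CECUM_LABEL:
                self.started = True         # scoring begins after reaching the cecum
            elif self.started and label in SCORE_LABELS:
                score = SCORE_LABELS[label]
                self.counts[score] = self.counts.get(score, 0) + 1
            # any other label (e.g. instrument frames) is ignored
        return self.current()

    def current(self) -> Optional[float]:
        """Mean BBPS over accepted frames, or None before any frame is accepted."""
        n = sum(self.counts.values())
        if n == 0:
            return None
        return sum(s * c for s, c in self.counts.items()) / n
```

A caller would create one `CumulativeBBPS` per video and pass each frame's top label and confidence to `update`. Raising `threshold` discards more uncertain frames, which is one plausible reading of how a confidence threshold could affect video-level agreement with endoscopists.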

The AutoBBPS system, developed using YOLOv11, provides real-time and cumulative BBPS scoring for every video frame, effectively assisting endoscopists in improving scoring efficiency and accuracy. Additionally, the intelligent BBPS teaching assistant is particularly beneficial for junior endoscopists, promoting standardized training and enhancing overall scoring quality.

## Full-text entities

- **Diseases:** tumors (MESH:D009369), adenomatous polyps (MESH:D018256)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12808384/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12808384/full.md

## References

17 references, with the full list in the complete paper: https://tomesphere.com/paper/PMC12808384/full.md

---
Source: https://tomesphere.com/paper/PMC12808384