# Enhancing colorectal polyp classification using gaze-based attention networks

**Authors:** Zhenghao Guo, Yanyan Hu, Peixuan Ge, In Neng Chan, Tao Yan, Pak Kin Wong, Shaoyong Xu, Zheng Li, Shan Gao

PMC · DOI: 10.7717/peerj-cs.2780 · 2025-03-25

## TL;DR

This paper uses endoscopists' gaze patterns, recorded with an eye-tracker, as an auxiliary training signal that improves CNN-based classification of colorectal polyps in endoscopic images.

## Contribution

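The paper proposes using endoscopists' gaze attention as an auxiliary supervisory signal for training CNNs to classify colorectal polyps, applied to the model's attention through an attention consistency module.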

## Key findings

- EfficientNet_b1 with gaze supervision achieved 86.96% overall test accuracy, outperforming the same architecture trained without gaze supervision.
- The model's class activation maps showed improved alignment with endoscopists' attention patterns.
- The gaze-supervised model reached a precision of 87.92%, a recall of 88.41%, an F1 score of 88.16%, and an AUC of 0.9022, all higher than the baseline without gaze supervision.
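
As a quick consistency check on the metrics reported in the abstract, the F1 score can be recomputed as the harmonic mean of the reported precision (87.92%) and recall (88.41%). This is a minimal sketch; the helper name `f1_from` is hypothetical, and the summary does not state how the authors averaged per-class scores, so agreement with the harmonic mean of the aggregate values is an observation rather than a description of their procedure.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
f1 = f1_from(87.92, 88.41)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 88.16%"
```

The recomputed value matches the reported F1 score of 88.16% to two decimal places.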

## Abstract

Colorectal polyps are potential precursor lesions of colorectal cancer. Accurate classification of colorectal polyps during endoscopy is crucial for early diagnosis and effective treatment. Automatic and accurate classification of colorectal polyps based on convolutional neural networks (CNNs) during endoscopy is vital for assisting endoscopists in diagnosis and treatment. However, this task remains challenging due to difficulties in the data acquisition and annotation processes, the poor interpretability of the data output, and the lack of widespread acceptance of the CNN models by clinicians. This study proposes an innovative approach that utilizes gaze attention information from endoscopists as an auxiliary supervisory signal to train a CNN-based model for the classification of colorectal polyps. Gaze information from the reading of endoscopic images was first recorded through an eye-tracker. Then, the gaze information was processed and applied to supervise the CNN model’s attention via an attention consistency module. Comprehensive experiments were conducted on a dataset that contained three types of colorectal polyps. The results showed that EfficientNet_b1 with supervised gaze information achieved an overall test accuracy of 86.96%, a precision of 87.92%, a recall of 88.41%, an F1 score of 88.16%, and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.9022. All evaluation metrics surpassed those of EfficientNet_b1 without gaze information supervision. The class activation maps generated by the proposed network also indicate that the endoscopist’s gaze-attention information, as auxiliary prior knowledge, increases the accuracy of colorectal polyp classification, offering a new solution to the field of medical image analysis.
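
The idea of supervising a model's attention with gaze data can be illustrated with a small sketch. This summary does not specify how the gaze recordings were processed or how the attention consistency module is defined, so everything below is an assumption made for illustration: fixations are turned into a heatmap by summing Gaussians, the consistency term is a mean squared error between the normalized class activation map (CAM) and the gaze heatmap, and the function names, `sigma`, and `weight` are hypothetical.

```python
import math


def normalize(grid):
    """Scale a non-negative 2D grid so that its entries sum to 1."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("grid must contain positive mass")
    return [[v / total for v in row] for row in grid]


def gaze_heatmap(fixations, height, width, sigma=1.5):
    """Assumed preprocessing: one isotropic Gaussian per (row, col) fixation."""
    heat = [[0.0] * width for _ in range(height)]
    for fr, fc in fixations:
        for r in range(height):
            for c in range(width):
                d2 = (r - fr) ** 2 + (c - fc) ** 2
                heat[r][c] += math.exp(-d2 / (2 * sigma ** 2))
    return normalize(heat)


def attention_consistency_loss(cam, gaze):
    """Assumed loss: mean squared error between normalized CAM and gaze map."""
    cam = normalize(cam)
    n = len(cam) * len(cam[0])
    return sum(
        (a - b) ** 2
        for cam_row, gaze_row in zip(cam, gaze)
        for a, b in zip(cam_row, gaze_row)
    ) / n


def total_loss(classification_loss, cam, gaze, weight=0.5):
    """Classification loss plus a weighted consistency term (weight is illustrative)."""
    return classification_loss + weight * attention_consistency_loss(cam, gaze)


# Toy 5x5 example: the endoscopist fixated the centre of the image.
gaze = gaze_heatmap([(2, 2)], height=5, width=5)
cam_aligned = gaze_heatmap([(2, 2)], height=5, width=5)   # model looks at the centre
cam_misaligned = gaze_heatmap([(0, 0)], height=5, width=5)  # model looks at a corner
print(attention_consistency_loss(cam_aligned, gaze))
print(attention_consistency_loss(cam_misaligned, gaze))
```

In this toy setup the aligned CAM incurs essentially no penalty while the misaligned one incurs a positive penalty, which is the behaviour an auxiliary attention term is meant to provide during training. In a real pipeline the CAM would come from the network and the loss would be written in a differentiable framework.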

## Linked entities

- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Diseases:** Colorectal polyps (MESH:D003111), colorectal cancer (MESH:D015179)

## Figures

24 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12190586/full.md

---
Source: https://tomesphere.com/paper/PMC12190586