Colonoscopy Landmark Detection using Vision Transformers

Aniruddha Tamhane; Tse'ela Mida; Erez Posner; Moshe Bouhnik

arXiv:2209.11304·cs.CV·September 28, 2022

TL;DR

This paper introduces a vision-transformer based algorithm for automatic detection of key anatomical landmarks in colonoscopy images, aiming to streamline post-procedure documentation and improve clinical workflow.

Contribution

The paper presents a novel landmark detection method that uses a vision transformer backbone, trained on a new dataset of expert-annotated colonoscopy snapshots, and reports that it outperforms traditional CNN backbones.

Findings

01. The vision transformer backbone achieved 82% accuracy on the test data.

02. The transformer backbone was compared with ResNet-101 and ConvNext-B backbones and showed competitive performance.

03. An adaptive gamma correction preprocessing step was developed to keep image brightness consistent.

Abstract

Colonoscopy is a routine outpatient procedure used to examine the colon and rectum for abnormalities, including polyps, diverticula and narrowing of colon structures. A significant amount of the clinician's time is spent on post-processing snapshots taken during the colonoscopy procedure, either for maintaining medical records or for further investigation. Automating this step can save time and improve the efficiency of the process. In our work, we have collected a dataset of 120 colonoscopy videos and 2416 snapshots taken during the procedure, all annotated by experts. Further, we have developed a novel, vision-transformer based landmark detection algorithm that identifies key anatomical landmarks (the appendiceal orifice, ileocecal valve/cecum landmark and rectum retroflexion) from snapshots taken during colonoscopy. Our algorithm uses an adaptive gamma correction during…
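
The abstract is cut off before it describes the adaptive gamma correction, so the exact formula the authors use is not available here. As a rough illustration only, one common way to make gamma "adaptive" is to pick the exponent per image so that the image's mean brightness maps to a fixed target. The function names, the `target_mean` parameter and the choice of 0.5 as the target are assumptions made for this sketch, not details from the paper.

```python
import numpy as np


def estimate_gamma(image, target_mean=0.5, eps=1e-6):
    """Pick gamma so that mean_brightness ** gamma == target_mean.

    Assumption for illustration: `image` is a uint8 array (H, W) or (H, W, C).
    This rule is a generic heuristic, not the paper's documented method.
    """
    scaled = image.astype(np.float64) / 255.0
    # Clip the mean away from 0 and 1 so the logarithm stays finite and nonzero.
    mean = float(np.clip(scaled.mean(), eps, 1.0 - eps))
    return float(np.log(target_mean) / np.log(mean))


def adaptive_gamma_correct(image, target_mean=0.5):
    """Apply a per-image gamma curve and return a uint8 image of the same shape."""
    gamma = estimate_gamma(image, target_mean)
    scaled = image.astype(np.float64) / 255.0
    corrected = np.power(scaled, gamma)
    return np.round(corrected * 255.0).astype(np.uint8)


if __name__ == "__main__":
    dark = np.full((4, 4, 3), 64, dtype=np.uint8)     # under-exposed frame
    bright = np.full((4, 4, 3), 230, dtype=np.uint8)  # over-exposed frame
    print(estimate_gamma(dark), estimate_gamma(bright))
    print(adaptive_gamma_correct(dark)[0, 0], adaptive_gamma_correct(bright)[0, 0])
```

With this rule, a dark frame gets a gamma below 1 (brightening) and a bright frame gets a gamma above 1 (darkening), so both end up near the same mid-level brightness before being passed to a feature extraction backbone.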

Taxonomy

Topics: Colorectal Cancer Screening and Detection

Methods: Attention Is All You Need · Test · Linear Layer · Softmax · Residual Connection · Dense Connections · Multi-Head Attention · Layer Normalization · Vision Transformer