ARST: Auto-Regressive Surgical Transformer for Phase Recognition from Laparoscopic Videos

Xiaoyang Zou; Wenyong Liu; Junchen Wang; Rong Tao; Guoyan Zheng

arXiv:2209.01148·cs.CV·September 5, 2022·1 cites

TL;DR

This paper introduces ARST, an auto-regressive transformer model for real-time surgical phase recognition from laparoscopic videos, improving accuracy and consistency over existing methods.

Contribution

ARST is presented as the first transformer-based surgical phase recognition method to incorporate auto-regression, which strengthens inter-phase correlation modeling and inference stability.

Findings

01 Outperforms state-of-the-art methods in accuracy.

02 Achieves 66 fps inference rate.

03 Demonstrates improved phase consistency.

Abstract

Phase recognition plays an essential role in surgical workflow analysis for computer-assisted intervention. The transformer, originally proposed for sequential data modeling in natural language processing, has been successfully applied to surgical phase recognition. Existing transformer-based works mainly focus on modeling attention dependency, without introducing auto-regression. In this work, an Auto-Regressive Surgical Transformer, referred to as ARST, is first proposed for on-line surgical phase recognition from laparoscopic videos, modeling the inter-phase correlation implicitly by a conditional probability distribution. To reduce inference bias and to enhance phase consistency, we further develop a consistency constraint inference strategy based on auto-regression. We conduct comprehensive validations on the well-known public dataset Cholec80. Experimental results show that our method…
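
The abstract describes two ideas: predicting each frame's phase conditioned on earlier predictions, and constraining inference so the predicted phase stays consistent. The following is a minimal sketch of that general pattern, not the authors' implementation. It replaces the transformer with a fixed additive bias table, and the names `autoregressive_decode`, `transition_bias`, and `min_confidence`, as well as the specific "keep the previous phase unless confident" rule, are hypothetical choices made for illustration; this listing does not describe the paper's actual constraint.

```python
import math
from typing import List, Optional, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def autoregressive_decode(
    frame_scores: Sequence[Sequence[float]],
    transition_bias: Sequence[Sequence[float]],
    min_confidence: float = 0.6,
) -> List[int]:
    """Greedy on-line decoding, one frame at a time.

    Each frame's scores are shifted by a bias that depends on the
    previously predicted phase (the auto-regressive conditioning).
    Assumed consistency rule: a phase change is accepted only when the
    new phase reaches `min_confidence`; otherwise the previous phase
    is kept.
    """
    predictions: List[int] = []
    prev: Optional[int] = None
    for scores in frame_scores:
        if prev is not None:
            # Condition on the previous prediction (hypothetical stand-in
            # for a learned conditional distribution).
            scores = [s + transition_bias[prev][k] for k, s in enumerate(scores)]
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        if prev is not None and best != prev and probs[best] < min_confidence:
            best = prev  # reject a low-confidence phase switch
        predictions.append(best)
        prev = best
    return predictions


# Toy example with three phases; all numbers are made up.
toy_bias = [
    [1.0, 0.0, -2.0],   # from phase 0: staying is favoured, skipping to 2 is penalised
    [-2.0, 1.0, 0.0],   # from phase 1
    [-2.0, -2.0, 1.0],  # from phase 2
]
toy_scores = [
    [2.0, 0.0, 0.0],
    [0.5, 0.6, 0.0],  # a per-frame argmax would switch to phase 1 here
    [0.0, 3.0, 0.0],
    [0.0, 0.2, 0.9],  # a per-frame argmax would switch to phase 2 here
    [0.0, 0.0, 3.0],
]
toy_phases = autoregressive_decode(toy_scores, toy_bias)
independent_phases = [max(range(3), key=row.__getitem__) for row in toy_scores]
```

In this toy run, the conditioned decoder delays both phase switches by one frame relative to the independent per-frame argmax, which is the kind of smoothing a consistency constraint is meant to provide.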

Taxonomy

Topics: Surgical Simulation and Training · Colorectal Cancer Screening and Detection

Methods: Linear Layer · Layer Normalization · Softmax · Absolute Position Encodings · Adam · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dense Connections · Label Smoothing · Dropout