OperA: Attention-Regularized Transformers for Surgical Phase Recognition

Tobias Czempiel; Magdalini Paschali; Daniel Ostler; Seong Tae Kim; Benjamin Busam; Nassir Navab

arXiv:2103.03873·cs.CV·November 24, 2022

TL;DR

OperA is a transformer-based model with attention regularization that improves surgical phase recognition from videos and identifies key frames for summarization, outperforming existing methods.

Contribution

Introduces OperA, a novel attention-regularized transformer model for accurate surgical phase recognition and key frame identification in surgical videos.

Findings

01

OperA outperforms state-of-the-art methods on laparoscopic cholecystectomy datasets.

02

Attention regularization improves focus on high-quality frames.

03

High-attention frames effectively characterize surgical phases.

Abstract

In this paper we introduce OperA, a transformer-based model that accurately predicts surgical phases from long video sequences. A novel attention regularization loss encourages the model to focus on high-quality frames during training. Moreover, the attention weights are utilized to identify characteristic high attention frames for each surgical phase, which could further be used for surgery summarization. OperA is thoroughly evaluated on two datasets of laparoscopic cholecystectomy videos, outperforming various state-of-the-art temporal refinement approaches.
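
The abstract's idea of using attention weights to pick characteristic frames for each phase can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `top_attention_frames`, the parameter `k`, the phase labels, and the attention values below are all hypothetical, and the sketch assumes that one scalar attention weight per frame and one predicted phase label per frame are already available.

```python
from collections import defaultdict


def top_attention_frames(attention, phases, k=3):
    """Return, for each phase, the indices of the k frames with the highest attention.

    attention: one scalar weight per frame (assumed to be given).
    phases:    one phase label per frame (assumed to be given).
    """
    if len(attention) != len(phases):
        raise ValueError("attention and phases must have the same length")

    # Group (weight, frame index) pairs by phase label.
    by_phase = defaultdict(list)
    for idx, (weight, phase) in enumerate(zip(attention, phases)):
        by_phase[phase].append((weight, idx))

    # Within each phase, sort by descending weight (ties broken by frame index)
    # and keep the first k frame indices.
    result = {}
    for phase, items in by_phase.items():
        items.sort(key=lambda pair: (-pair[0], pair[1]))
        result[phase] = [idx for _, idx in items[:k]]
    return result


# Hypothetical toy input: six frames, two phases.
attention = [0.05, 0.30, 0.10, 0.20, 0.25, 0.10]
phases = ["preparation", "preparation", "preparation",
          "dissection", "dissection", "dissection"]

summary = top_attention_frames(attention, phases, k=2)
print(summary)
```

In a summarization setting, the selected indices would point back to the video frames to display for each phase. How attention is aggregated into a single per-frame weight is left open here, since the abstract does not specify it.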
