DACAT: Dual-stream Adaptive Clip-aware Time Modeling for Robust Online Surgical Phase Recognition

Kaixiang Yang; Qiang Li; Zhiwei Wang

arXiv:2409.06217·cs.CV·September 11, 2024



TL;DR

DACAT introduces a dual-stream, clip-aware time modeling approach for online surgical phase recognition, significantly improving accuracy by adaptively leveraging historical context and current frame features.

Contribution

The paper proposes DACAT, a novel dual-stream model with adaptive clip-aware context encoding, enhancing temporal modeling for surgical phase recognition over existing methods.

Findings

01. Outperforms state-of-the-art methods on three datasets.
02. Achieves Jaccard scores at least 2.7% to 4.6% higher than the compared state-of-the-art methods, depending on the dataset.
03. Demonstrates robust online surgical phase recognition.
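For readers unfamiliar with the Jaccard score cited in the findings, the following is a minimal sketch of the plain frame-wise definition (intersection over union of predicted and ground-truth frames for each phase, averaged over phases). The function names and the toy label sequences are illustrative assumptions; the paper's exact evaluation protocol (for example per-video averaging or relaxed phase boundaries) is not stated on this page and may differ.

```python
from typing import Sequence


def phase_jaccard(pred: Sequence[int], gt: Sequence[int], phase: int) -> float:
    """Frame-wise Jaccard index (intersection over union) for one phase label."""
    inter = sum(1 for p, g in zip(pred, gt) if p == phase and g == phase)
    union = sum(1 for p, g in zip(pred, gt) if p == phase or g == phase)
    return inter / union if union else float("nan")


def mean_jaccard(pred: Sequence[int], gt: Sequence[int]) -> float:
    """Average the per-phase Jaccard over the phases present in the ground truth."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must label the same number of frames")
    phases = sorted(set(gt))
    scores = [phase_jaccard(pred, gt, ph) for ph in phases]
    return sum(scores) / len(scores)


# Toy example (made-up labels, not data from the paper): 8 frames, 3 phases.
gt = [0, 0, 0, 1, 1, 1, 2, 2]
pred = [0, 0, 1, 1, 1, 1, 2, 0]
score = mean_jaccard(pred, gt)
print(f"mean Jaccard: {score:.3f}")
```

Because the union counts every frame that either sequence assigns to a phase, a single misplaced prediction inside an otherwise correct phase lowers the score for two phases at once, which is why discontinuous predictions are penalized heavily by this metric.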

Abstract

Surgical phase recognition has become a crucial requirement in laparoscopic surgery, enabling various clinical applications like surgical risk forecasting. Current methods typically identify the surgical phase using individual frame-wise embeddings as the fundamental unit for time modeling. However, this approach is overly sensitive to current observations, often resulting in discontinuous and erroneous predictions within a complete surgical phase. In this paper, we propose DACAT, a novel dual-stream model that adaptively learns clip-aware context information to enhance the temporal relationship. In one stream, DACAT pretrains a frame encoder, caching all historical frame-wise features. In the other stream, DACAT fine-tunes a new frame encoder to extract the frame-wise feature at the current moment. Additionally, a max clip-response read-out (Max-R) module is introduced to bridge the…
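The two-stream idea in the abstract can be pictured with a small sketch: one structure holds cached features of past frames, and a read-out selects a clip from that cache given the current frame's feature. The abstract is cut off before it describes Max-R, so the read-out below is only a guess based on the module's name, not the authors' method: it scores each cached frame by dot-product similarity to the current feature and returns the contiguous window with the largest summed score. All names (`FeatureCache`, `max_response_clip`, `clip_len`) and the toy 2-D features are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product, used here as an assumed similarity measure."""
    return sum(x * y for x, y in zip(a, b))


class FeatureCache:
    """Stand-in for the stream that caches historical frame-wise features."""

    def __init__(self) -> None:
        self.features: List[List[float]] = []

    def append(self, feature: Vector) -> None:
        self.features.append(list(feature))


def max_response_clip(cache: List[List[float]], current: Vector, clip_len: int) -> slice:
    """Hypothetical read-out: the contiguous window of cached frames whose
    summed similarity to the current feature is largest (first one on ties)."""
    if not cache:
        raise ValueError("cache is empty")
    clip_len = min(clip_len, len(cache))
    scores = [dot(f, current) for f in cache]
    window = sum(scores[:clip_len])
    best_sum, best_start = window, 0
    # Slide the window one frame at a time, updating the running sum.
    for start in range(1, len(scores) - clip_len + 1):
        window += scores[start + clip_len - 1] - scores[start - 1]
        if window > best_sum:
            best_sum, best_start = window, start
    return slice(best_start, best_start + clip_len)


# Toy usage with made-up 2-D features; only past frames are in the cache,
# which keeps the sketch consistent with the online setting.
cache = FeatureCache()
for feat in ([1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [1, 0]):
    cache.append(feat)
current_feature = [0, 1]  # would come from the second, fine-tuned encoder
clip = max_response_clip(cache.features, current_feature, clip_len=3)
selected = cache.features[clip]
print(clip, selected)
```

Selecting a whole clip rather than individual frames reflects the abstract's motivation: a frame-by-frame view is overly sensitive to the current observation, whereas a clip gives the temporal model a more stable unit of context.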


Code & Models

Repositories

kk42yy/dacat (PyTorch, official)


Taxonomy

Topics: Reservoir Engineering and Simulation Methods

Methods: Contrastive Language-Image Pre-training