ACT-Net: Anchor-context Action Detection in Surgery Videos

Luoying Hao, Yan Hu, Wenjun Lin, Qun Wang, Heng Li, Huazhu Fu, Jinming Duan, and Jiang Liu

arXiv:2310.03377·cs.CV·October 6, 2023

Open Access

TL;DR

This paper introduces ACTNet, a novel surgical action detection network that leverages anchor-context interactions and diffusion models to improve accuracy and confidence estimation in surgical videos.

Contribution

The paper proposes ACTNet with an anchor-context detection module and a class conditional diffusion module, enhancing surgical action detection accuracy and confidence estimation.

Findings

01 Achieved a 4.0% mAP improvement over the baseline.

02 Reached state-of-the-art performance on a surgical video dataset.

03 Provided effective confidence estimation via diffusion model outputs.

Abstract

Recognizing and localizing detailed surgical actions is an essential part of developing a context-aware decision support system. However, most existing detection algorithms fail to predict action classes accurately even when they locate the actions, because they do not consider the regularity of the surgical procedure across the whole video. This limitation hinders their application. Moreover, using the predictions in clinical applications requires conveying model confidence to earn trust, which is unexplored in surgical action prediction. In this paper, to accurately detect fine-grained actions that happen at every moment, we propose an anchor-context action detection network (ACTNet), including an anchor-context detection (ACD) module and a class conditional diffusion (CCD) module, to answer the following questions: 1) where the actions happen; 2) what the actions are; 3) how…

Taxonomy

Topics: Surgical Simulation and Training · Medical Imaging and Analysis

Methods: Diffusion