ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding

Zhen Chen; Zongming Zhang; Wenwu Guo; Xingjian Luo; Long Bai; Jinlin Wu; Hongliang Ren; Hongbin Liu

arXiv:2407.19435·cs.CV·July 30, 2024

TL;DR

ASI-Seg is an audio-driven segmentation framework that interprets a surgeon's spoken commands and segments only the surgical instruments the surgeon asks for, with the aim of streamlining the surgical workflow and reducing cognitive load.

Contribution

The paper introduces a novel multimodal fusion and contrastive learning approach for intention-oriented instrument segmentation based on audio commands during surgery.

Findings

01 Outperforms state-of-the-art segmentation models in accuracy.

02 Effectively interprets surgeon intentions from audio commands.

03 Reduces irrelevant instrument segmentation in surgical scenes.

Abstract

Surgical instrument segmentation is crucial to surgical scene understanding and thereby supports surgical safety. Existing algorithms directly detect all instruments of pre-defined categories in the input image and cannot segment specific instruments according to the surgeon's intention. During different stages of surgery, surgeons focus on different surgical instruments. An instrument segmentation algorithm that follows the surgeon's intention can therefore minimize distractions from irrelevant instruments and substantially assist surgeons. The recent Segment Anything Model (SAM) demonstrates the ability to segment objects following prompts, but manually annotating prompts is impractical during surgery. To address these limitations in operating rooms, we propose an audio-driven surgical instrument segmentation…

Code & Models

Repositories

zonmgin-zhang/asi-seg (PyTorch, official)

Taxonomy

Topics: Digital Imaging in Medicine

Methods: Contrastive Learning · Focus