ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding
Zhen Chen, Zongming Zhang, Wenwu Guo, Xingjian Luo, Long Bai, Jinlin Wu, Hongliang Ren, Hongbin Liu

TL;DR
ASI-Seg is an audio-driven segmentation framework that interprets a surgeon's spoken commands to segment only the surgical instruments the surgeon asks for, with the aim of streamlining surgical workflow and reducing cognitive load.
Contribution
The paper introduces a novel multimodal fusion and contrastive learning approach for intention-oriented instrument segmentation based on audio commands during surgery.
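This summary does not spell out the paper's loss function, so the following is only a generic illustration of what contrastive learning between an audio-derived intention embedding and instrument features can look like. It uses a standard InfoNCE-style loss; the function names, the temperature value, and the toy embeddings are all assumptions made for illustration and are not taken from ASI-Seg.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(audio_embs, visual_embs, temperature=0.1):
    """Mean InfoNCE loss (hypothetical helper, not the paper's code).

    audio_embs[i] and visual_embs[i] are treated as the positive pair;
    every other visual embedding acts as a negative for audio_embs[i].
    """
    losses = []
    for i, a in enumerate(audio_embs):
        logits = [cosine(a, v) / temperature for v in visual_embs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the positive pair.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy 2-D embeddings (assumed values): two commands, two instruments.
audio = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each command matches its instrument
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each command matches the wrong one

loss_aligned = info_nce(audio, aligned)
loss_swapped = info_nce(audio, swapped)
print(loss_aligned < loss_swapped)
```

In this sketch the loss is small when each command embedding sits closest to its own instrument embedding and large when the pairing is swapped, which is the behaviour a contrastive objective is meant to reward.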
Findings
Outperforms state-of-the-art segmentation models in accuracy.
Effectively interprets surgeon intentions from audio commands.
Reduces irrelevant instrument segmentation in surgical scenes.
Abstract
Surgical instrument segmentation is crucial in surgical scene understanding, thereby facilitating surgical safety. Existing algorithms directly detect all instruments of pre-defined categories in the input image, lacking the capability to segment specific instruments according to the surgeon's intention. During different stages of surgery, surgeons exhibit varying preferences and focus toward different surgical instruments. Therefore, an instrument segmentation algorithm that adheres to the surgeon's intention can minimize distractions from irrelevant instruments and assist surgeons to a great extent. The recent Segment Anything Model (SAM) reveals the capability to segment objects following prompts, but the manual annotations for prompts are impractical during the surgery. To address these limitations in operating rooms, we propose an audio-driven surgical instrument segmentation…
Taxonomy
Topics: Digital Imaging in Medicine
Methods: Contrastive Learning · Focus
