Attention-Guided Lightweight Network for Real-Time Segmentation of Robotic Surgical Instruments
Zhen-Liang Ni, Gui-Bin Bian, Zeng-Guang Hou, Xiao-Hu Zhou, Xiao-Liang Xie, Zhen Li

TL;DR
This paper introduces LWANet, a lightweight, attention-guided neural network capable of real-time segmentation of surgical instruments with high accuracy and low computational cost, suitable for robot-assisted surgery.
Contribution
The paper presents LWANet, a novel lightweight encoder-decoder network with attention fusion for real-time surgical instrument segmentation, achieving state-of-the-art accuracy with minimal computational resources.
Findings
Achieves 39 fps inference speed on 960x544 images.
Attains 94.10% mean IOU on Cata7 dataset.
Sets a new record on EndoVis 2017, with a 4.10% increase in mean IOU.
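For readers unfamiliar with the metric in these findings, the following is a minimal sketch of mean IOU over flattened label maps. It assumes the common per-class definition (intersection over union for each class, averaged over the classes that occur in either map); the exact averaging protocol used for Cata7 and EndoVis 2017 may differ, and the function name `mean_iou` and the toy labels are purely illustrative.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    pred and target are equal-length sequences of integer class labels
    (a segmentation map flattened to 1-D).
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both maps.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in at least one map.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union == 0:
            continue  # class absent from both maps: skip (an assumption)
        ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: 6 pixels, 3 classes (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, 3))
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5 up to floating-point rounding.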
Abstract
The real-time segmentation of surgical instruments plays a crucial role in robot-assisted surgery. However, it is still a challenging task to implement deep learning models to do real-time segmentation for surgical instruments due to their high computational costs and slow inference speed. In this paper, we propose an attention-guided lightweight network (LWANet), which can segment surgical instruments in real-time. LWANet adopts encoder-decoder architecture, where the encoder is the lightweight network MobileNetV2, and the decoder consists of depthwise separable convolution, attention fusion block, and transposed convolution. Depthwise separable convolution is used as the basic unit to construct the decoder, which can reduce the model size and computational costs. Attention fusion block captures global contexts and encodes semantic dependencies between channels to emphasize target…
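The abstract's claim that depthwise separable convolution reduces model size can be illustrated with simple parameter arithmetic. The sketch below compares weight counts for a standard convolution and its depthwise separable counterpart, ignoring biases and normalization layers. The layer sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions chosen for illustration and are not taken from LWANet.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(std, sep, sep / std)
```

For these sizes the standard layer has 73,728 weights and the separable one 8,768, a ratio of about 0.119, which matches the general formula 1/c_out + 1/k^2. The multiply-accumulate count per output position shrinks by the same factor.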
Taxonomy
Topics: Surgical Simulation and Training · Anatomy and Medical Technology · Medical Imaging and Analysis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Transposed convolution · 1x1 Convolution · Batch Normalization · Inverted Residual Block · Average Pooling · Depthwise Convolution · Pointwise Convolution · Depthwise Separable Convolution
