Multi-frame Feature Aggregation for Real-time Instrument Segmentation in Endoscopic Video

Shan Lin; Fangbo Qin; Haonan Peng; Randall A. Bly; Kris S. Moe; Blake Hannaford

arXiv:2011.08752·cs.CV·July 27, 2021

TL;DR

This paper introduces a lightweight multi-frame feature aggregation method for real-time surgical instrument segmentation, reducing computation costs and improving accuracy in challenging surgical video conditions.

Contribution

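The paper proposes a novel Multi-frame Feature Aggregation (MFFA) module that aggregates video frame features temporally and spatially in a recurrent mode. Distributing deep feature extraction over sequential frames lets a lightweight encoder handle each time step, which lowers per-frame computation and supports real-time segmentation.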
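To make the idea of recurrent temporal and spatial aggregation concrete, here is a minimal sketch in plain Python. It is not the MFFA module from the paper: the blend weight `alpha`, the 3x3 box filter, and the function names `box_filter_3x3`, `aggregate`, and `run_sequence` are all assumptions chosen for illustration. Each feature map is a small 2D list standing in for the output of a lightweight encoder.

```python
# Illustrative sketch only: NOT the paper's MFFA module.
# `alpha` and the 3x3 box filter are assumptions made for this example.


def box_filter_3x3(fmap):
    """Spatial step: average each cell with its in-bounds 3x3 neighbours."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                fmap[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = sum(vals) / len(vals)
    return out


def aggregate(prev_state, curr_feat, alpha=0.5):
    """Temporal step: blend the smoothed previous state with current features."""
    if prev_state is None:
        # First frame: there is no history yet, so copy the current features.
        return [row[:] for row in curr_feat]
    smoothed = box_filter_3x3(prev_state)
    return [
        [alpha * c + (1.0 - alpha) * s for c, s in zip(c_row, s_row)]
        for c_row, s_row in zip(curr_feat, smoothed)
    ]


def run_sequence(frame_features, alpha=0.5):
    """Carry one aggregated state across frames, one update per time step."""
    state = None
    states = []
    for feat in frame_features:
        state = aggregate(state, feat, alpha)
        states.append(state)
    return states


if __name__ == "__main__":
    # Two tiny 2x2 "feature maps" from consecutive frames (made-up values).
    frames = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[1.0, 1.0], [1.0, 1.0]],
    ]
    for t, s in enumerate(run_sequence(frames)):
        print(t, s)
```

The point of the recurrence is that each time step only processes the current frame's features plus one carried state, so the per-frame cost stays constant however long the video runs. The paper's module presumably uses learned operations in place of the fixed blend and box filter used here.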

Findings

01 Outperforms deeper models on public datasets

02 Reduces computation costs with lightweight encoder

03 Effective in challenging lighting and blood conditions

Abstract

Deep learning-based methods have achieved promising results on surgical instrument segmentation. However, the high computation cost may limit the application of deep models to time-sensitive tasks such as online surgical video analysis for robotic-assisted surgery. Moreover, current methods may still suffer from challenging conditions in surgical images such as various lighting conditions and the presence of blood. We propose a novel Multi-frame Feature Aggregation (MFFA) module to aggregate video frame features temporally and spatially in a recurrent mode. By distributing the computation load of deep feature extraction over sequential frames, we can use a lightweight encoder to reduce the computation costs at each time step. Moreover, public surgical videos usually are not labeled frame by frame, so we develop a method that can randomly synthesize a surgical frame sequence from a…
