DeepPyram: Enabling Pyramid View and Deformable Pyramid Reception for Semantic Segmentation in Cataract Surgery Videos
Negin Ghamsarian, Mario Taschwer, and Klaus Schoeffmann

TL;DR
DeepPyram is a novel neural network designed for improved semantic segmentation in cataract surgery videos, utilizing pyramid view fusion, deformable receptive fields, and adaptive multi-scale supervision to handle challenging object features.
Contribution
The paper introduces DeepPyram, a new segmentation network with three innovative modules that enhance performance on complex surgical video data without increasing trainable parameters.
Findings
DeepPyram outperforms 13 state-of-the-art networks on cataract surgery datasets.
The proposed modules significantly improve segmentation accuracy for deformable and transparent objects.
Ablation studies confirm the effectiveness of each module in the network.
Abstract
Semantic segmentation in cataract surgery has a wide range of applications that contribute to better surgical outcomes and reduced clinical risk. However, the differing challenges in segmenting the various relevant instances make it difficult to design a single network that handles all of them. This paper proposes a semantic segmentation network, termed DeepPyram, that achieves superior performance in segmenting relevant objects in cataract surgery videos despite these challenges. This superiority mainly originates from three modules: (i) Pyramid View Fusion, which provides a varying-angle global view of the surrounding region centered at each pixel position in the input convolutional feature map; (ii) Deformable Pyramid Reception, which enables a wide deformable receptive field that can adapt to geometric transformations in the object of interest; and (iii) Pyramid Loss that adaptively supervises…
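To make the idea of a multi-scale view around each pixel more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a "pyramid view" can be roughly approximated by averaging square neighbourhoods of several sizes centered at each position of a 2D feature map. The function names `local_average` and `pyramid_views`, the window sizes, and the border handling are all hypothetical choices made for illustration; the paper's actual Pyramid View Fusion module may differ substantially.

```python
# Illustrative sketch only; this is NOT the DeepPyram implementation.
# Assumption: a "pyramid view" is approximated by averaging a square
# neighbourhood around each pixel at several (odd) window sizes.

def local_average(feature_map, window):
    """Mean over a window x window neighbourhood centered at each pixel.

    Neighbourhoods are clipped at the borders, so edge pixels average
    over fewer values instead of using padding.
    """
    height, width = len(feature_map), len(feature_map[0])
    radius = window // 2
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            r0, r1 = max(0, r - radius), min(height, r + radius + 1)
            c0, c1 = max(0, c - radius), min(width, c + radius + 1)
            values = [feature_map[i][j]
                      for i in range(r0, r1)
                      for j in range(c0, c1)]
            out[r][c] = sum(values) / len(values)
    return out


def pyramid_views(feature_map, windows=(1, 3, 5)):
    """Return one smoothed copy of the feature map per window size."""
    return [local_average(feature_map, w) for w in windows]


if __name__ == "__main__":
    fmap = [[1.0, 2.0], [3.0, 4.0]]
    for window, view in zip((1, 3), pyramid_views(fmap, windows=(1, 3))):
        print(window, view)
```

In a real network these views would typically be computed with pooling layers on multi-channel tensors and then fused by learned convolutions; the pure-Python loops here only show how larger windows give each pixel a progressively wider context.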
Taxonomy
Topics: Retinal Imaging and Analysis · Medical Imaging and Analysis · Advanced Neural Network Applications
