No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling

Young Seok Jeon; Hongfei Yang; Huazhu Fu; Mengling Feng

arXiv:2501.10814·eess.IV·June 30, 2025

Open Access

TL;DR

This paper introduces NMSW, an end-to-end framework that eliminates sliding window inference in 3D medical image segmentation, significantly reducing computation and inference time while maintaining accuracy.

Contribution

The paper proposes a differentiable Top-k patch sampling method to replace sliding window inference, enabling faster and more efficient 3D segmentation without sacrificing accuracy.

Findings

01. Achieves a 91% reduction in computational complexity

02. Delivers up to 11.1x faster inference on CPU

03. Maintains competitive segmentation accuracy

Abstract

3D models surpass 2D models in CT/MRI segmentation by effectively capturing inter-slice relationships. However, the added depth dimension substantially increases memory consumption. While patch-based training alleviates memory constraints, it significantly slows inference because of the sliding window (SW) approach. We propose No-More-Sliding-Window (NMSW), a novel end-to-end trainable framework that improves the inference efficiency of a generic 3D segmentation backbone by eliminating the need for SW. NMSW employs a differentiable Top-k module to sample only the most relevant patches, thereby minimizing redundant computation. When patch-level predictions are insufficient, the framework leverages coarse global predictions to refine the results. Evaluated across 3 tasks using 3 segmentation backbones, NMSW achieves competitive accuracy…
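To make the contrast in the abstract concrete, here is a minimal sketch of the difference between visiting every sliding-window patch and running the backbone on only the k highest-scoring patches. Everything in it is an illustrative assumption rather than the authors' implementation: the function names, the volume and patch sizes, the 50% overlap, the value of k, and the synthetic per-patch scores are all hypothetical. The selection shown is a plain hard Top-k, which is not differentiable; the differentiable Top-k module that NMSW trains end to end is not reproduced here.

```python
import math
from itertools import product


def sliding_window_starts(volume_shape, patch_shape, overlap=0.5):
    """Start coordinates of every patch a sliding window would visit."""
    axes = []
    for size, patch in zip(volume_shape, patch_shape):
        stride = max(1, int(patch * (1 - overlap)))
        count = max(1, math.ceil((size - patch) / stride) + 1)
        # Clamp each start so the patch stays inside the volume.
        axes.append([min(i * stride, max(size - patch, 0)) for i in range(count)])
    return list(product(*axes))


def top_k_patches(scores, k):
    """Indices of the k highest-scoring patches (hard, non-differentiable)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical sizes: a 256x256x128 volume cut into 128x128x64 patches.
starts = sliding_window_starts((256, 256, 128), (128, 128, 64), overlap=0.5)

# Synthetic relevance scores standing in for whatever a coarse global
# prediction would assign to each candidate patch (an assumption).
scores = [((7 * i) % 11) / 10 for i in range(len(starts))]

k = 4
selected = top_k_patches(scores, k)

print(f"sliding window visits {len(starts)} patches")
print(f"top-{k} sampling visits {len(selected)} patches: "
      f"{[starts[i] for i in selected]}")
```

In this toy setting the backbone would run on 4 patches instead of 27. The ratio depends entirely on the assumed sizes and k, so it should not be read as a reproduction of the paper's reported savings.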

Taxonomy

Topics: Medical Image Segmentation Techniques · Medical Imaging Techniques and Applications · Computer Graphics and Visualization Techniques

Methods: SPEED (Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings)