Strip Pooling: Rethinking Spatial Pooling for Scene Parsing
Qibin Hou, Li Zhang, Ming-Ming Cheng, Jiashi Feng

TL;DR
This paper introduces strip pooling, a spatial pooling method that uses long, narrow kernels (1xN or Nx1) instead of square ones. It captures long-range dependencies more efficiently for scene parsing and achieves state-of-the-art results on the ADE20K and Cityscapes benchmarks.
Contribution
The paper proposes a strip pooling strategy and two lightweight architecture designs built on it. Both improve long-range dependency modeling in scene parsing networks and can be added to existing networks as plug-and-play modules.
Findings
Achieves new state-of-the-art results on ADE20K and Cityscapes benchmarks.
The proposed modules are lightweight and easily integrated into existing networks.
Extensive experiments validate the effectiveness of strip pooling over conventional methods.
Abstract
Spatial pooling has been proven highly effective in capturing long-range contextual information for pixel-wise prediction tasks, such as scene parsing. In this paper, beyond conventional spatial pooling that usually has a regular shape of NxN, we rethink the formulation of spatial pooling by introducing a new pooling strategy, called strip pooling, which considers a long but narrow kernel, i.e., 1xN or Nx1. Based on strip pooling, we further investigate spatial pooling architecture design by 1) introducing a new strip pooling module that enables backbone networks to efficiently model long-range dependencies, 2) presenting a novel building block with diverse spatial pooling as a core, and 3) systematically comparing the performance of the proposed strip pooling and conventional spatial pooling techniques. Both novel pooling-based designs are lightweight and can serve as an efficient…
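The core idea in the abstract, pooling along a 1xN or Nx1 strip rather than an NxN window, can be illustrated with a minimal NumPy sketch. It assumes average pooling over a full row or full column of a single-channel feature map. The function names strip_pool and strip_context are hypothetical, and the learned layers of the paper's strip pooling module are not reproduced here. It shows only the pooling step and one simple way to combine the two strips.

```python
import numpy as np

def strip_pool(x):
    """Average-pool a (H, W) feature map along full-length strips.

    Assumption for illustration: each strip spans the whole row or column.
    """
    # Horizontal strips (1 x W kernel): one value per row, shape (H, 1).
    h_strip = x.mean(axis=1, keepdims=True)
    # Vertical strips (H x 1 kernel): one value per column, shape (1, W).
    v_strip = x.mean(axis=0, keepdims=True)
    return h_strip, v_strip

def strip_context(x):
    """Broadcast both strip outputs back to (H, W) and sum them.

    Hypothetical combination step: every position receives the average of
    its entire row plus the average of its entire column.
    """
    h_strip, v_strip = strip_pool(x)
    return h_strip + v_strip  # NumPy broadcasting yields shape (H, W)

# Small 3 x 4 feature map for demonstration.
x = np.arange(12, dtype=float).reshape(3, 4)
h, v = strip_pool(x)
ctx = strip_context(x)
print(h.shape, v.shape, ctx.shape)
```

Each output position in ctx depends on every pixel in its row and column, which is how a narrow kernel can reach distant locations along one axis while ignoring unrelated regions along the other.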
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Strip Pooling Network · Strip Pooling
