TL;DR
This paper introduces Tiled Squeeze-and-Excite (TSE), a channel attention mechanism that uses local rather than global spatial context. It matches the accuracy of global-context SE blocks while substantially reducing activation buffering, which makes SE networks more efficient to deploy.
Contribution
The paper proposes TSE, a local-context extension of SE blocks that replaces the single globally pooled descriptor per channel with several local descriptors. It can be swapped into existing SE networks without retraining and reduces their buffering requirements.
Findings
TSE achieves comparable accuracy to global SE blocks.
TSE reduces activation buffering by up to 90%.
TSE is compatible with existing SE networks without retraining.
Abstract
In this paper we investigate the amount of spatial context required for channel attention. To this end we study the popular squeeze-and-excite (SE) block, which is a simple and lightweight channel attention mechanism. SE blocks and their numerous variants commonly use global average pooling (GAP) to create a single descriptor for each channel. Here, we empirically analyze the amount of spatial context needed for effective channel attention and find that limited local context, on the order of seven rows or columns of the original image, is sufficient to match the performance of global context. We propose tiled squeeze-and-excite (TSE), which is a framework for building SE-like blocks that employ several descriptors per channel, with each descriptor based on local context only. We further show that TSE is a drop-in replacement for the SE block and can be used in existing SE networks without…
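As a rough illustration of the idea in the abstract, the sketch below contrasts a standard SE block (one global-average-pooled descriptor per channel) with a tiled variant that computes one descriptor per horizontal strip and reuses the same excitation weights. It is a minimal NumPy sketch, not the authors' implementation: the function names, the `tile_rows` parameter, the bias-free two-layer excitation, and the choice of non-overlapping full-width strips measured in feature-map rows are all assumptions made for illustration.

```python
import numpy as np


def excite(desc, w1, w2):
    """Excitation step: bottleneck MLP over channels, then a sigmoid gate.

    desc: (C, N) array holding N descriptors per channel.
    w1:   (C // r, C) reduction weights; w2: (C, C // r) expansion weights.
    Biases are omitted here for brevity (an assumption of this sketch).
    """
    hidden = np.maximum(w1 @ desc, 0.0)            # ReLU
    return 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid, shape (C, N)


def se(x, w1, w2):
    """Standard SE: one global descriptor per channel via GAP.

    x: (C, H, W) feature map.
    """
    c = x.shape[0]
    desc = x.mean(axis=(1, 2)).reshape(c, 1)       # global average pooling
    scale = excite(desc, w1, w2)                   # (C, 1)
    return x * scale[:, :, None]                   # broadcast over H and W


def tiled_se(x, w1, w2, tile_rows=7):
    """Tiled SE sketch: one local descriptor per strip of `tile_rows` rows.

    The same w1/w2 are reused for every strip, mirroring the drop-in use
    described in the text. `tile_rows` is a hypothetical parameter name.
    """
    c, h, _ = x.shape
    out = np.empty(x.shape, dtype=float)
    for top in range(0, h, tile_rows):
        strip = x[:, top:top + tile_rows, :]
        desc = strip.mean(axis=(1, 2)).reshape(c, 1)   # local pooling only
        scale = excite(desc, w1, w2)
        out[:, top:top + tile_rows, :] = strip * scale[:, :, None]
    return out


# Example: 8 channels, a 14x10 feature map, reduction ratio 4.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 14, 10))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))
y_global = se(x, w1, w2)
y_tiled = tiled_se(x, w1, w2, tile_rows=7)
```

In this sketch, setting `tile_rows` to at least the feature-map height reduces the tiled variant to the global SE block. With smaller values, each strip can be scaled as soon as its own rows are available rather than after the whole feature map has been pooled.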
Taxonomy
Methods: Average Pooling · Global Average Pooling
