AttZoom: Attention Zoom for Better Visual Features
Daniel DeAlcala, Aythami Morales, Julian Fierrez, Ruben Tolosana

TL;DR
Attention Zoom is a versatile spatial attention layer that enhances feature extraction in CNNs, leading to improved classification accuracy and more detailed attention patterns across various models and datasets.
Contribution
We introduce a modular, architecture-agnostic spatial attention layer called Attention Zoom that improves CNN feature extraction without significant overhead.
Findings
Consistent accuracy improvements on CIFAR-100 and TinyImageNet
Encourages fine-grained, diverse attention patterns
Effective across multiple CNN architectures
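Accuracy on benchmarks such as CIFAR-100 and TinyImageNet is conventionally reported as Top-k accuracy: a prediction counts as correct when the true label is among the k highest-scoring classes. The following is a minimal sketch of that metric in plain Python. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not taken from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example (hypothetical values): 4 samples, 3 classes.
scores = [
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.20],
    [0.20, 0.30, 0.50],
    [0.40, 0.35, 0.25],
]
labels = [1, 1, 0, 0]

top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
print(top1, top2)
```

With k = 1 this reduces to ordinary classification accuracy; larger k gives credit for near misses, which is why Top-5 is always at least as high as Top-1.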
Abstract
We present Attention Zoom, a modular and model-agnostic spatial attention mechanism designed to improve feature extraction in convolutional neural networks (CNNs). Unlike traditional attention approaches that require architecture-specific integration, our method introduces a standalone layer that spatially emphasizes high-importance regions in the input. We evaluate Attention Zoom on multiple CNN backbones using CIFAR-100 and TinyImageNet, showing consistent improvements in Top-1 and Top-5 classification accuracy. Visual analyses using Grad-CAM and spatial warping reveal that our method encourages fine-grained and diverse attention patterns. Our results confirm the effectiveness and generality of the proposed layer for improving CNNs with minimal architectural overhead.
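The abstract does not describe the internals of the layer, so the following is only a generic illustration of what "spatially emphasizing high-importance regions" can mean. It is not the authors' Attention Zoom layer. It assumes a saliency score equal to the mean absolute activation across channels, a softmax over spatial positions, and a rescaling so the average weight is 1. The function name `spatial_attention` and the `temperature` parameter are hypothetical.

```python
import math


def spatial_attention(feature_map, temperature=1.0):
    """Reweight a C x H x W feature map (nested lists) by a softmax saliency map.

    Generic sketch under stated assumptions; NOT the Attention Zoom layer itself.
    Returns (reweighted_map, attention), where attention sums to 1 over H x W.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    # Assumed saliency: mean absolute activation across channels per location.
    saliency = [
        [sum(abs(feature_map[c][i][j]) for c in range(channels)) / channels
         for j in range(width)]
        for i in range(height)
    ]

    # Softmax over all spatial positions, shifted by the max for stability.
    peak = max(max(row) for row in saliency)
    exps = [[math.exp((s - peak) / temperature) for s in row] for row in saliency]
    total = sum(sum(row) for row in exps)
    attention = [[e / total for e in row] for row in exps]

    # Rescale so the mean weight is 1, then apply the same map to every channel.
    scale = height * width
    output = [
        [[feature_map[c][i][j] * attention[i][j] * scale for j in range(width)]
         for i in range(height)]
        for c in range(channels)
    ]
    return output, attention


# Toy input (hypothetical): one channel, 2 x 2, with a single strong activation.
x = [[[0.0, 0.0], [0.0, 4.0]]]
out, att = spatial_attention(x)
print(att)
```

Because the sketch takes a feature map in and returns one of the same shape, it can sit in front of any backbone without changing that backbone, which is the sense in which a standalone attention layer is architecture-agnostic.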
