Is Bottom-Up Attention Useful for Scene Recognition?
Samuel F. Dodge, Lina J. Karam

TL;DR
This paper explores how bottom-up attention mechanisms, like saliency, can enhance scene recognition by improving accuracy and computational efficiency, especially with limited training data.
Contribution
It introduces a novel method combining salient and non-salient regions via Multiple Kernel Learning for better scene classification.
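One way to picture the combination step is as a weighted sum of two kernels, one computed on features from salient regions and one on features from non-salient regions. The sketch below is only an illustration under stated assumptions: the histogram intersection kernel, the function names, and the fixed weight `beta` are all hypothetical choices, not taken from the paper. In particular, Multiple Kernel Learning would learn the kernel weights from training data, whereas this sketch fixes them by hand.

```python
import numpy as np

def intersection_kernel(A, B):
    """Histogram intersection kernel between the rows of A and the rows of B."""
    # K[i, j] = sum_k min(A[i, k], B[j, k])
    return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

def combined_kernel(sal_a, non_a, sal_b, non_b, beta=0.5):
    """Convex combination of a salient-region kernel and a non-salient-region kernel.

    beta is fixed by hand here (an assumption for illustration);
    an MKL solver would learn this weight instead.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    k_salient = intersection_kernel(sal_a, sal_b)
    k_non_salient = intersection_kernel(non_a, non_b)
    return beta * k_salient + (1.0 - beta) * k_non_salient

# Toy data: two images, each with a 2-bin salient and non-salient histogram.
salient = np.array([[0.5, 0.5], [1.0, 0.0]])
non_salient = np.array([[0.2, 0.8], [0.5, 0.5]])
K = combined_kernel(salient, non_salient, salient, non_salient, beta=0.5)
```

The resulting matrix `K` could then be passed to any classifier that accepts a precomputed kernel.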
Findings
Pruning achieves high computational savings with minimal accuracy loss.
Saliency weighting alone does not improve classification performance.
The proposed method outperforms baseline approaches on the UIUC sports dataset with small training sets.
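To make the difference between saliency weighting and saliency pruning concrete, here is a minimal sketch on a bag-of-visual-words histogram. It assumes each local feature has already been assigned a visual-word index and a saliency value. The function names, the `keep_fraction` parameter, and the exact weighting rule are hypothetical and are not the paper's formulation.

```python
import numpy as np

def weighted_histogram(words, saliency, vocab_size):
    """Saliency weighting: each feature votes for its word with its saliency value."""
    hist = np.bincount(words, weights=saliency, minlength=vocab_size)
    total = hist.sum()
    return hist / total if total > 0 else hist

def pruned_histogram(words, saliency, vocab_size, keep_fraction=0.5):
    """Saliency pruning: keep only the most salient features, then count as usual."""
    if len(words) == 0:
        return np.zeros(vocab_size)
    # Keep at least one feature so the histogram can be normalized.
    n_keep = max(1, int(np.ceil(keep_fraction * len(words))))
    keep = np.argsort(saliency)[::-1][:n_keep]  # indices of the most salient features
    hist = np.bincount(words[keep], minlength=vocab_size).astype(float)
    return hist / hist.sum()

# Toy example: four features, a vocabulary of three visual words.
words = np.array([0, 1, 1, 2])
saliency = np.array([0.1, 0.9, 0.8, 0.2])
h_weighted = weighted_histogram(words, saliency, vocab_size=3)
h_pruned = pruned_histogram(words, saliency, vocab_size=3, keep_fraction=0.5)
```

Pruning discards features before any further processing, which is where the computational savings would come from. Weighting still processes every feature.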
Abstract
The human visual system employs a selective attention mechanism to understand the visual world in an efficient manner. In this paper, we show how computational models of this mechanism can be exploited for the computer vision application of scene recognition. First, we consider saliency weighting and saliency pruning, and provide a comparison of the performance of different attention models in these approaches in terms of classification accuracy. Pruning can achieve a high degree of computational savings without significantly sacrificing classification accuracy. In saliency weighting, however, we found that classification performance does not improve. In addition, we present a new method to incorporate salient and non-salient regions for improved classification accuracy. We treat the salient and non-salient regions separately and combine them using Multiple Kernel Learning. We evaluate…
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
