Top-Down Saliency Detection Driven by Visual Classification
Francesca Murabito, Concetto Spampinato, Simone Palazzo, Konstantin Pogorelov, Michael Riegler

TL;DR
This paper introduces SalClassNet, a CNN framework that jointly learns top-down saliency maps and visual classification, demonstrating improved performance over existing methods and better generalization in fine-grained recognition tasks.
Contribution
The paper proposes a novel CNN framework, SalClassNet, that jointly learns top-down saliency detection and visual classification, outperforming existing saliency detectors and enhancing classification accuracy.
Findings
SalClassNet outperforms state-of-the-art saliency detectors.
Conditioning saliency detection on object classes improves performance.
Explicit top-down saliency maps enhance visual classification accuracy.
Abstract
This paper presents an approach for top-down saliency detection guided by visual classification tasks. We first learn how to compute visual saliency when a specific visual task has to be accomplished, as opposed to most state-of-the-art methods which assess saliency merely through bottom-up principles. Afterwards, we investigate if and to what extent visual saliency can support visual classification in nontrivial cases. To achieve this, we propose SalClassNet, a CNN framework consisting of two networks jointly trained: a) the first one computing top-down saliency maps from input images, and b) the second one exploiting the computed saliency maps for visual classification. To test our approach, we collected a dataset of eye-gaze maps, using a Tobii T60 eye tracker, by asking several subjects to look at images from the Stanford Dogs dataset, with the objective of distinguishing dog…
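The abstract describes two jointly trained networks but does not state the training objective. As a minimal sketch, assuming the joint training is a weighted sum of a saliency term and a classification term, the objective could look like the following. The function names, the choice of mean squared error and softmax cross-entropy, and the weights `alpha` and `beta` are all illustrative assumptions, not details taken from the paper.

```python
import math


def saliency_loss(pred_map, gaze_map):
    """Mean squared error between a predicted saliency map and an
    eye-gaze map, both given as 2-D lists of equal shape.
    (MSE is an assumed choice of saliency loss.)"""
    total, count = 0.0, 0
    for pred_row, gaze_row in zip(pred_map, gaze_map):
        for p, g in zip(pred_row, gaze_row):
            total += (p - g) ** 2
            count += 1
    return total / count


def classification_loss(logits, target_index):
    """Softmax cross-entropy for a single example, using the
    log-sum-exp trick for numerical stability."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def joint_loss(pred_map, gaze_map, logits, target_index, alpha=1.0, beta=1.0):
    """Hypothetical joint objective: a weighted sum of the classification
    and saliency terms, so gradients from both tasks reach the saliency
    network. alpha and beta are assumed hyperparameters."""
    return (alpha * classification_loss(logits, target_index)
            + beta * saliency_loss(pred_map, gaze_map))
```

Under this reading, the saliency network is pushed both to match the recorded eye-gaze maps and to produce maps that help the classifier, which is one way to interpret "top-down" saliency driven by a classification task.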
