Cost-Efficient Multi-Scale Fovea for Semantic-Based Visual Search Attention
João Luzio, Alexandre Bernardino, and Plinio Moreno

TL;DR
This paper introduces a multi-scale fovea module, inspired by biological vision and classical exponential density roll-off topologies, that reduces the computational cost of semantic-based visual search and improves scanpath prediction accuracy.
Contribution
It presents a novel multi-scale fovea integrated into the Semantic-based Bayesian Attention (SemBA) framework, improving efficiency and biological plausibility in visual attention prediction.
Findings
Reduces detection-related computational costs.
Improves scanpath prediction accuracy.
Closely approximates human visual consistency.
Abstract
Semantics are one of the primary sources of top-down preattentive information. Modern deep object detectors excel at extracting such valuable semantic cues from complex visual scenes. However, the size of the visual input to be processed by these detectors can become a bottleneck, particularly in terms of time costs, affecting an artificial attention system's biological plausibility and real-time deployability. Inspired by classical exponential density roll-off topologies, we apply a new artificial foveation module to our novel attention prediction pipeline: the Semantic-based Bayesian Attention (SemBA) framework. We aim at reducing detection-related computational costs without compromising visual task accuracy, thereby making SemBA more biologically plausible. The proposed multi-scale pyramidal field-of-view retains maximum acuity at an innermost level, around a focal point, while…
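As a rough illustration of the multi-scale pyramidal field-of-view idea, the sketch below crops nested square patches around a focal point and resamples each one to the same pixel size. The abstract is cut off before it describes the outer levels, so the doubling of field of view per level, the block-averaging downsampler, the edge padding, and the names `multiscale_fovea`, `base_size`, and `levels` are all assumptions made for illustration. This is not the authors' implementation.

```python
import numpy as np


def multiscale_fovea(image, focus, base_size=32, levels=3):
    """Return `levels` square patches centred on `focus` = (row, col).

    Hypothetical design: level 0 is a full-resolution crop of side
    `base_size`; level k covers base_size * 2**k pixels per side and is
    block-averaged back to `base_size`, so every level has the same
    number of pixels.
    """
    if base_size % 2 != 0:
        raise ValueError("base_size must be even")
    image = np.asarray(image, dtype=float)
    max_half = (base_size * 2 ** (levels - 1)) // 2

    # Pad with edge values so crops near the border stay inside the array.
    pad = [(max_half, max_half), (max_half, max_half)]
    pad += [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad, mode="edge")
    r, c = focus[0] + max_half, focus[1] + max_half

    patches = []
    for k in range(levels):
        scale = 2 ** k
        half = (base_size * scale) // 2
        crop = padded[r - half:r + half, c - half:c + half]
        # Average each scale x scale cell to lower the resolution.
        shape = (base_size, scale, base_size, scale) + crop.shape[2:]
        patches.append(crop.reshape(shape).mean(axis=(1, 3)))
    return patches


if __name__ == "__main__":
    img = np.arange(64 * 64).reshape(64, 64)
    pyramid = multiscale_fovea(img, focus=(32, 32), base_size=8, levels=3)
    total = sum(p.size for p in pyramid)
    print(f"{len(pyramid)} levels, {total} pixels instead of {32 * 32}")
```

Under these assumptions, a detector would process three 8 x 8 patches instead of one full-resolution 32 x 32 window, which is the kind of input-size reduction the abstract motivates.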
