Where to Look Next: Unsupervised Active Visual Exploration on 360° Input
Soroush Seifi, Tinne Tuytelaars

TL;DR
This paper introduces an unsupervised active visual exploration method for 360° inputs that uses spatial memory maps and retina-like glimpses and requires no deep reinforcement learning. The trained model can also be used to classify the whole scene.
Contribution
It presents an approach that outperforms previous methods by a significant margin without deep reinforcement learning or separately trained sidekick networks, relying on spatial memory maps and retina-like glimpses.
Findings
Significant performance improvement over prior methods.
Effective scene classification from limited glimpses.
Advantages of retina-like glimpses under bandwidth constraints.
Abstract
We address the problem of active visual exploration of large 360° inputs. In our setting an active agent with a limited camera bandwidth explores its 360° environment by changing its viewing direction at limited discrete time steps. As such, it observes the world as a sequence of narrow field-of-view 'glimpses', deciding for itself where to look next. Our proposed method exceeds previous works' performance by a significant margin without the need for deep reinforcement learning or training separate networks as sidekicks. A key component of our system is the set of spatial memory maps that make the system aware of the glimpses' orientations (locations in the 360° image). Further, we stress the advantages of retina-like glimpses when the agent's sensor bandwidth and time-steps are limited. Finally, we use our trained model to do classification of the whole scene using only the…
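The retina-like glimpses and spatial memory maps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an equirectangular panorama stored as a NumPy array, a glimpse built from concentric square crops that are average-pooled to one common size (so the periphery is seen at lower resolution), and a memory canvas that records the finest crop at its location. The function names `retina_glimpse` and `write_to_memory` and the parameters `size` and `scales` are hypothetical.

```python
import numpy as np

def _window(center_y, center_x, extent, height, width):
    """Row/column indices of a square window; columns wrap around (360 degrees)."""
    half = extent // 2
    rows = np.clip(np.arange(center_y - half, center_y + half), 0, height - 1)
    cols = np.arange(center_x - half, center_x + half) % width
    return np.ix_(rows, cols)

def retina_glimpse(pano, cy, cx, size=8, scales=3):
    """Return `scales` crops centered at (cy, cx), each pooled to size x size.

    Level k covers size * 2**k pixels per side, so the total pixel budget is
    scales * size**2 while the covered area grows with each level.
    """
    h, w, c = pano.shape
    levels = []
    for k in range(scales):
        factor = 2 ** k
        patch = pano[_window(cy, cx, size * factor, h, w)]
        # Average-pool by `factor` so every level has the same shape.
        pooled = patch.reshape(size, factor, size, factor, c).mean(axis=(1, 3))
        levels.append(pooled)
    return np.stack(levels)  # shape: (scales, size, size, channels)

def write_to_memory(memory, seen, glimpse, cy, cx):
    """Store the finest glimpse level at its location and mark it as observed."""
    size = glimpse.shape[1]
    h, w, _ = memory.shape
    idx = _window(cy, cx, size, h, w)
    memory[idx] = glimpse[0]
    seen[idx] = True

# Usage: one glimpse near the left edge of a synthetic 32 x 64 panorama.
pano = np.random.default_rng(0).random((32, 64, 3))
memory = np.zeros_like(pano)
seen = np.zeros(pano.shape[:2], dtype=bool)
glimpse = retina_glimpse(pano, cy=16, cx=2)
write_to_memory(memory, seen, glimpse, cy=16, cx=2)
```

A policy that decides where to look next would read `memory` and `seen` to choose the next `(cy, cx)`; that policy is left out here because the abstract does not describe it.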
Taxonomy
Topics: Advanced Vision and Imaging · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
