Attend and Segment: Attention Guided Active Semantic Segmentation
Soroush Seifi, Tinne Tuytelaars

TL;DR
This paper introduces an attention-guided active segmentation method that incrementally refines scene understanding from limited partial observations, achieving high accuracy with minimal visual input in dynamic environments.
Contribution
It proposes a novel self-supervised attention mechanism and spatial memory architecture for active, incremental semantic segmentation from partial scene observations.
Findings
Achieves 78.1% mean pixel-wise accuracy on CityScapes while processing only 18% of the pixels
Fills in unseen areas using spatial memory maps and cues from the attended areas
Outperforms baseline methods with low-resolution initial views
Abstract
In a dynamic environment, an agent with a limited field of view or limited resources cannot fully observe the scene before attempting to parse it. The deployment of common semantic segmentation architectures is not feasible in such settings. In this paper we propose a method to gradually segment a scene given a sequence of partial observations. The main idea is to refine an agent's understanding of the environment by attending to the areas it is most uncertain about. Our method includes a self-supervised attention mechanism and a specialized architecture to maintain and exploit spatial memory maps for filling in the unseen areas in the environment. The agent can select and attend to an area while relying on the cues coming from the visited areas to hallucinate the other parts. We reach a mean pixel-wise accuracy of 78.1%, 80.9% and 76.5% on the CityScapes, CamVid, and KITTI datasets by processing only 18% of…
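The idea of attending to the area the agent is most uncertain about can be illustrated with a minimal sketch. It is not the paper's method: the paper describes a self-supervised attention mechanism, whereas this sketch swaps in the mean per-pixel entropy of a predicted class distribution as a stand-in uncertainty score. The function names (patch_entropy, select_next_patch, visited_fraction), the non-overlapping patch grid, and the entropy criterion are all assumptions made for illustration.

```python
import numpy as np


def patch_entropy(probs, patch):
    """Mean per-pixel entropy over each non-overlapping patch.

    probs: array of shape (H, W, C) holding per-pixel class probabilities.
    patch: patch side length; H and W are assumed to be divisible by it.
    Returns an array of shape (H // patch, W // patch).
    """
    h, w, _ = probs.shape
    # Shannon entropy per pixel; the small constant avoids log(0).
    ent = -np.sum(probs * np.log(probs + 1e-12), axis=-1)
    # Group pixels into patches and average within each patch.
    return ent.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))


def select_next_patch(probs, visited, patch):
    """Return (row, col) of the unvisited patch with the highest mean entropy.

    Entropy is an assumed proxy for uncertainty, not the paper's
    learned attention signal.
    """
    scores = patch_entropy(probs, patch)
    # Already-visited patches can never be selected again.
    scores = np.where(visited, -np.inf, scores)
    r, c = np.unravel_index(np.argmax(scores), scores.shape)
    return int(r), int(c)


def visited_fraction(visited):
    """Fraction of the image observed so far (patches are equal-sized)."""
    return float(visited.mean())
```

In use, an agent would call select_next_patch after each new observation, mark the returned patch in the visited mask, update its segmentation estimate, and stop once visited_fraction reaches its pixel budget.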
