Attention-SLAM: A Visual Monocular SLAM Learning from Human Gaze
Jinquan Li, Ling Pei, Danping Zou, Songpengcheng Xia, Qi Wu, Tao Li, Zhen Sun, Wenxian Yu

TL;DR
Attention-SLAM integrates human-like visual saliency modeling with monocular SLAM to prioritize salient features, improving localization accuracy and robustness by mimicking human navigation behavior.
Contribution
The paper introduces SalNavNet with correlation and EMA modules for improved saliency detection and incorporates semantic saliency into SLAM, enhancing performance over existing methods.
Findings
Outperforms DSO, ORB-SLAM, and Salient DSO in accuracy and robustness
Uses saliency maps to prioritize features during SLAM optimization
Provides an open-source saliency SLAM dataset
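The idea of prioritizing features by saliency during optimization can be illustrated with a weighted reprojection cost. This is a minimal sketch, not the authors' implementation: the function names, the nearest-pixel lookup, and the `floor` parameter are assumptions made for illustration, and the paper's actual weighting scheme is not given in this summary.

```python
from typing import Sequence, Tuple

Point2D = Tuple[float, float]  # (x, y) in pixel coordinates


def sample_saliency(saliency_map: Sequence[Sequence[float]], point: Point2D) -> float:
    """Nearest-pixel lookup of a saliency value, clamped to the map bounds."""
    height, width = len(saliency_map), len(saliency_map[0])
    col = min(max(int(round(point[0])), 0), width - 1)
    row = min(max(int(round(point[1])), 0), height - 1)
    return saliency_map[row][col]


def weighted_reprojection_cost(
    observed: Sequence[Point2D],
    projected: Sequence[Point2D],
    saliency_map: Sequence[Sequence[float]],
    floor: float = 0.1,
) -> float:
    """Sum of squared reprojection errors, each scaled by the saliency at the
    observed keypoint. `floor` (hypothetical) keeps non-salient features from
    being dropped from the cost entirely."""
    cost = 0.0
    for obs, proj in zip(observed, projected):
        weight = max(sample_saliency(saliency_map, obs), floor)
        dx, dy = obs[0] - proj[0], obs[1] - proj[1]
        cost += weight * (dx * dx + dy * dy)
    return cost


# Toy 2x2 saliency map with values in [0, 1], indexed as [row][col].
toy_map = [[0.0, 1.0], [0.5, 0.2]]
observed = [(1.0, 0.0), (0.0, 0.0)]
projected = [(1.0, 2.0), (0.0, 1.0)]
cost = weighted_reprojection_cost(observed, projected, toy_map)
```

In this toy case the feature on the salient pixel contributes its full squared error, while the feature on the non-salient pixel is scaled down to the floor weight, so the optimizer would be pulled mainly by the salient feature.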
Abstract
This paper proposes a novel simultaneous localization and mapping (SLAM) approach, namely Attention-SLAM, which simulates human navigation mode by combining a visual saliency model (SalNavNet) with traditional monocular visual SLAM. Most SLAM methods treat all features extracted from the images as equally important during the optimization process. However, salient feature points in a scene have a more significant influence during human navigation. Therefore, we first propose a visual saliency model called SalNavNet, in which we introduce a correlation module and propose an adaptive Exponential Moving Average (EMA) module. These modules mitigate the center bias so that the saliency maps generated by SalNavNet pay more attention to the same salient object. Moreover, the saliency maps simulate human behavior for the refinement of SLAM results. The feature points…
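For readers unfamiliar with exponential moving averages, the sketch below shows the general idea applied directly to a sequence of per-frame saliency maps. It is an illustration under stated assumptions only: it uses a fixed smoothing coefficient `alpha`, whereas the abstract describes an adaptive EMA module whose update rule and position inside SalNavNet are not specified in this summary. The function name `ema_saliency` is hypothetical.

```python
from typing import List, Sequence

SaliencyMap = List[List[float]]


def ema_saliency(maps: Sequence[SaliencyMap], alpha: float = 0.5) -> List[SaliencyMap]:
    """Smooth a sequence of saliency maps with a fixed-coefficient EMA:
    state_t = alpha * map_t + (1 - alpha) * state_{t-1}.
    NOTE: fixed alpha is an assumption; the paper's module is adaptive."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[SaliencyMap] = []
    state = None
    for frame in maps:
        if state is None:
            # The first frame initializes the running average.
            state = [list(row) for row in frame]
        else:
            state = [
                [alpha * new + (1.0 - alpha) * old for new, old in zip(new_row, old_row)]
                for new_row, old_row in zip(frame, state)
            ]
        smoothed.append([list(row) for row in state])
    return smoothed


# Three 1x2 toy maps: the left pixel becomes salient, the right one fades.
frames = [[[0.0, 1.0]], [[1.0, 1.0]], [[1.0, 0.0]]]
result = ema_saliency(frames, alpha=0.5)
```

With temporal smoothing of this kind, a region that was salient in recent frames keeps part of its weight in the current frame, which is one way attention can stay on the same object across a sequence.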
Taxonomy
Topics: Robotics and Sensor-Based Localization · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
