TL;DR
This paper introduces a novel method that combines attention prediction with object detection to identify critical objects in driving scenes efficiently, bridging the gap between pixel-level and object-level attention understanding.
Contribution
It integrates an attention prediction module into a pretrained object detection framework, predicts attention in a grid-based style, and recognizes critical objects based on the attended areas.
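One way to picture the last step, going from a grid of attention scores to a set of critical objects, is the minimal sketch below. It is an illustration under stated assumptions, not the paper's implementation: the selection rule (keep a detected box if the peak attention among the grid cells it overlaps reaches a threshold), the function names `cells_for_box` and `critical_objects`, the threshold value, and the toy grid are all hypothetical.

```python
import math


def cells_for_box(box, image_size, grid_size):
    """Return the (row, col) grid cells overlapped by a pixel-space box.

    box is (x1, y1, x2, y2), image_size is (width, height),
    grid_size is (rows, cols).
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    rows, cols = grid_size
    cell_w = width / cols
    cell_h = height / rows
    # Clamp the covered cell range to the grid bounds.
    col_start = max(0, int(x1 // cell_w))
    col_end = min(cols - 1, int(math.ceil(x2 / cell_w)) - 1)
    row_start = max(0, int(y1 // cell_h))
    row_end = min(rows - 1, int(math.ceil(y2 / cell_h)) - 1)
    return [
        (r, c)
        for r in range(row_start, row_end + 1)
        for c in range(col_start, col_end + 1)
    ]


def critical_objects(boxes, grid_attention, image_size, threshold=0.5):
    """Return indices of boxes whose peak overlapped attention >= threshold.

    Assumed rule for illustration only; the paper may use a different one.
    """
    grid_size = (len(grid_attention), len(grid_attention[0]))
    selected = []
    for index, box in enumerate(boxes):
        cells = cells_for_box(box, image_size, grid_size)
        peak = max((grid_attention[r][c] for r, c in cells), default=0.0)
        if peak >= threshold:
            selected.append(index)
    return selected


# Toy example: a 400x200 image split into a 2x4 grid of 100x100 cells.
attention = [
    [0.0, 0.1, 0.9, 0.0],
    [0.0, 0.2, 0.3, 0.0],
]
detections = [
    (10, 10, 90, 90),      # falls in a cell with no attention
    (210, 20, 290, 80),    # falls in the most attended cell
    (120, 110, 280, 190),  # spans two weakly attended cells
]
print(critical_objects(detections, attention, (400, 200)))
```

In this sketch the grid is far coarser than a pixel-level saliency map, so each box only needs to be compared against a handful of cells. Lowering the hypothetical `threshold` admits more objects as critical.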
Findings
Achieves competitive state-of-the-art performance in attention prediction on two driver attention datasets, BDD-A and DR(eye)VE.
Reduces computational cost by 75.3 GFLOPs compared to previous methods.
Effectively identifies critical objects using predicted attention areas.
Abstract
Human drivers use their attentional mechanisms to focus on critical objects and make decisions while driving. As human attention can be revealed from gaze data, capturing and analyzing gaze information has emerged in recent years to benefit autonomous driving technology. Previous works in this context have primarily aimed at predicting "where" human drivers look at and lack knowledge of "what" objects drivers focus on. Our work bridges the gap between pixel-level and object-level attention prediction. Specifically, we propose to integrate an attention prediction module into a pretrained object detection framework and predict the attention in a grid-based style. Furthermore, critical objects are recognized based on predicted attended-to areas. We evaluate our proposed method on two driver attention datasets, BDD-A and DR(eye)VE. Our framework achieves competitive state-of-the-art…
