Feature Importance in Pedestrian Intention Prediction: A Context-Aware Review
Mohsen Azarmi, Mahdi Rezaei, He Wang, Ali Arabian

TL;DR
This paper introduces CAPFI, a context-aware method for interpreting feature importance in pedestrian intention prediction models, highlighting the roles of specific features and proposing improved representations for better accuracy.
Contribution
The paper presents CAPFI, a novel context-aware permutation importance method tailored for pedestrian intention prediction, and demonstrates its effectiveness on the PIE dataset with insights into feature roles.
Findings
Pedestrian bounding boxes and ego-vehicle speed are critical features.
Contextual differences significantly affect model performance.
The proposed change in feature representation improves the robustness of intention prediction.
Abstract
Recent advancements in predicting pedestrian crossing intentions for Autonomous Vehicles using Computer Vision and Deep Neural Networks (DNNs) are promising. However, the black-box nature of DNNs makes it difficult to understand how the model works and how input features contribute to final predictions. This lack of interpretability limits trust in model performance and hinders informed decisions on feature selection, representation, and model optimisation, thereby affecting the efficacy of future research in the field. To address this, we introduce Context-aware Permutation Feature Importance (CAPFI), a novel approach tailored for pedestrian intention prediction. CAPFI enables more interpretable and reliable assessments of feature importance by leveraging subdivided scenario contexts, mitigating the randomness of feature values through targeted shuffling. This aims to reduce…
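The idea of shuffling a feature only within a scenario context, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `context_permutation_importance`, the accuracy-drop score, the integer context labels, and the toy model are all assumptions made for illustration.

```python
import numpy as np

def context_permutation_importance(predict, X, y, contexts, n_repeats=5, seed=0):
    """Hypothetical sketch of per-context permutation importance.

    predict  : callable mapping an (n, d) array to (n,) predicted labels
    contexts : (n,) array of context ids (one scenario bucket per sample)
    Returns {context_id: (d,) array of mean accuracy drops}.
    """
    rng = np.random.default_rng(seed)
    importances = {}
    for c in np.unique(contexts):
        idx = np.flatnonzero(contexts == c)
        Xc, yc = X[idx], y[idx]
        baseline = np.mean(predict(Xc) == yc)  # accuracy inside this context
        drops = np.zeros(X.shape[1])
        for j in range(X.shape[1]):
            for _ in range(n_repeats):
                Xp = Xc.copy()
                # Shuffle feature j among samples of the same context only.
                Xp[:, j] = rng.permutation(Xc[:, j])
                drops[j] += baseline - np.mean(predict(Xp) == yc)
        importances[c] = drops / n_repeats
    return importances

# Toy data (assumed): the label depends on feature 0 only; feature 1 is noise.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(int)
contexts = np.repeat([0, 1], 100)  # two hypothetical scenario contexts

def toy_predict(A):
    # Stand-in "model" that reads feature 0 and ignores feature 1.
    return (A[:, 0] > 0).astype(int)

imp = context_permutation_importance(toy_predict, X, y, contexts)
for c, scores in imp.items():
    print(c, scores)
```

In this toy setup the accuracy drop for feature 1 is zero in both contexts because the stand-in model never reads it, while shuffling feature 0 degrades accuracy. Restricting each shuffle to one context keeps the permuted values within the range observed for that scenario, which is the intuition behind the targeted shuffling mentioned above.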
Taxonomy
Topics: Video Surveillance and Tracking Methods · Traffic and Road Safety · Evacuation and Crowd Dynamics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
