Can Reasons Help Improve Pedestrian Intent Estimation? A Cross-Modal Approach
Vaishnavi Khindkar, Vineeth Balasubramanian, Chetan Arora, Anbumani Subramanian, C.V. Jawahar

TL;DR
This paper introduces a cross-modal approach that predicts pedestrian intent and the reasons behind it, enhancing safety in autonomous navigation by making predictions more interpretable and accurate.
Contribution
The paper presents PIE++, a reason-enriched dataset with multi-label textual explanations for pedestrian intent, and MINDREAD, a multi-task learning framework that jointly predicts intent and its reasons, improving prediction accuracy.
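To make the idea of joint prediction concrete, here is a minimal sketch of a multi-task objective: a binary cross-entropy term for the cross/no-cross intent plus a mean binary cross-entropy over the multi-label reasons. This is an illustration under stated assumptions, not the paper's actual loss; the function names `bce` and `joint_loss` and the `reason_weight` parameter are hypothetical.

```python
import math

def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for a single probability/target pair."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def joint_loss(intent_prob, intent_label, reason_probs, reason_labels,
               reason_weight=1.0):
    """Intent loss plus a weighted mean multi-label reason loss.

    reason_weight is an assumed hyperparameter, not taken from the paper.
    """
    intent_term = bce(intent_prob, intent_label)
    reason_term = sum(
        bce(p, t) for p, t in zip(reason_probs, reason_labels)
    ) / len(reason_labels)
    return intent_term + reason_weight * reason_term
```

Because both terms are computed from one shared model in a setup like this, errors on the reasons also shape the features used for intent, which is one plausible way that predicting the "why" could help the "what".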
Findings
5.6% accuracy and 7% F1-score improvements on the PIE++ dataset
4.4% accuracy improvement on the JAAD dataset
Effective in providing human-understandable explanations for pedestrian intent
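For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and F1-score are computed for binary cross/no-cross labels. The helper name `accuracy_and_f1` and the label encoding (1 for cross, 0 for no-cross) are assumptions for illustration; the summary does not state whether the reported gains are absolute percentage points or relative improvements.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, treating 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # define F1 as 0 when no positives
    return accuracy, f1
```

F1 is reported alongside accuracy because it penalizes missed and spurious positive predictions, which accuracy alone can hide when one class dominates.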
Abstract
With the increased importance of autonomous navigation systems has come an increasing need to protect the safety of Vulnerable Road Users (VRUs) such as pedestrians. Predicting pedestrian intent is one such challenging task, where prior work predicts the binary cross/no-cross intention with a fusion of visual and motion features. However, there has been no effort so far to hedge such predictions with human-understandable reasons. We address this issue by introducing a novel problem setting of exploring the intuitive reasoning behind a pedestrian's intent. In particular, we show that predicting the 'WHY' can be very useful in understanding the 'WHAT'. To this end, we propose a novel, reason-enriched PIE++ dataset consisting of multi-label textual explanations/reasons for pedestrian intent. We also introduce a novel multi-task learning framework called MINDREAD, which leverages a…
Taxonomy
Topics: Traffic and Road Safety · Evacuation and Crowd Dynamics
