Can Reasons Help Improve Pedestrian Intent Estimation? A Cross-Modal Approach

Vaishnavi Khindkar; Vineeth Balasubramanian; Chetan Arora; Anbumani Subramanian; C.V. Jawahar

arXiv:2411.13302·cs.CV·November 21, 2024

Open Access

TL;DR

This paper introduces a cross-modal approach that predicts pedestrian intent and the reasons behind it, enhancing safety in autonomous navigation by making predictions more interpretable and accurate.

Contribution

It presents a novel dataset with textual explanations for pedestrian intent and a multi-task framework that jointly predicts intent and reasons, improving prediction accuracy.

Findings

01

Accuracy and F1-score improvements of 5.6% and 7%, respectively, on the PIE++ dataset

02

4.4% accuracy improvement on the JAAD dataset

03

Effective in providing human-understandable explanations for pedestrian intent

Abstract

With the increased importance of autonomous navigation systems has come an increasing need to protect the safety of Vulnerable Road Users (VRUs) such as pedestrians. Predicting pedestrian intent is one such challenging task, where prior work predicts the binary cross/no-cross intention with a fusion of visual and motion features. However, there has been no effort so far to hedge such predictions with human-understandable reasons. We address this issue by introducing a novel problem setting of exploring the intuitive reasoning behind a pedestrian's intent. In particular, we show that predicting the 'WHY' can be very useful in understanding the 'WHAT'. To this end, we propose a novel, reason-enriched PIE++ dataset consisting of multi-label textual explanations/reasons for pedestrian intent. We also introduce a novel multi-task learning framework called MINDREAD, which leverages a…

Taxonomy

Topics: Traffic and Road Safety · Evacuation and Crowd Dynamics