Bifold and Semantic Reasoning for Pedestrian Behavior Prediction
Amir Rasouli, Mohsen Rohani, Jun Luo

TL;DR
This paper introduces BiPed, a multitask learning framework that enhances pedestrian behavior prediction by combining bifold encoding, semantic interaction modeling, and dual decoding, achieving state-of-the-art results on benchmark datasets.
Contribution
The paper presents a novel bifold encoding and reasoning approach that improves multimodal pedestrian behavior prediction in driving scenarios.
Findings
Achieves up to 22% improvement in trajectory prediction
Achieves up to 9% improvement in action prediction
Outperforms existing methods on PIE and JAAD datasets
Abstract
Pedestrian behavior prediction is one of the major challenges for intelligent driving systems. Pedestrians often exhibit complex behaviors influenced by various contextual elements. To address this problem, we propose BiPed, a multitask learning framework that simultaneously predicts trajectories and actions of pedestrians by relying on multimodal data. Our method benefits from 1) a bifold encoding approach where different data modalities are processed independently allowing them to develop their own representations, and jointly to produce a representation for all modalities using shared parameters; 2) a novel interaction modeling technique that relies on categorical semantic parsing of the scenes to capture interactions between target pedestrians and their surroundings; and 3) a bifold prediction mechanism that uses both independent and shared decoding of multimodal representations.…
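The bifold encoding idea in the abstract (each modality encoded on its own, plus a joint representation built with shared parameters) can be illustrated with a small NumPy sketch. This is not the authors' architecture: the plain tanh RNN, the per-modality projection to a common width, the averaging step, and the modality names `trajectory` and `speed` are all assumptions chosen only to make the two encoding paths concrete.

```python
import numpy as np

def rnn_encode(x, w_in, w_rec):
    """Run a plain tanh RNN over x of shape (T, d_in) and return the last hidden state."""
    h = np.zeros(w_rec.shape[0])
    for x_t in x:
        h = np.tanh(x_t @ w_in + h @ w_rec)
    return h

def bifold_encode(modalities, own, proj, shared):
    """Hypothetical bifold encoder.

    modalities: dict name -> (T, d_m) array of observations
    own:        dict name -> (w_in, w_rec), used only by that modality
    proj:       dict name -> (d_m, d_common) projection to a common width
    shared:     one (w_in, w_rec) pair reused for every modality
    """
    # Independent path: every modality has its own encoder weights.
    independent = {name: rnn_encode(x, *own[name]) for name, x in modalities.items()}
    # Joint path: the same weights encode every (projected) modality;
    # averaging the results is an assumption, not taken from the paper.
    shared_states = [rnn_encode(x @ proj[name], *shared) for name, x in modalities.items()]
    joint = np.mean(shared_states, axis=0)
    return independent, joint

# Toy inputs: 15 observed time steps, two made-up modalities.
rng = np.random.default_rng(0)
hidden, common, steps = 8, 6, 15
dims = {"trajectory": 4, "speed": 1}
modalities = {n: rng.normal(size=(steps, d)) for n, d in dims.items()}
own = {n: (0.1 * rng.normal(size=(d, hidden)), 0.1 * rng.normal(size=(hidden, hidden)))
       for n, d in dims.items()}
proj = {n: 0.1 * rng.normal(size=(d, common)) for n, d in dims.items()}
shared = (0.1 * rng.normal(size=(common, hidden)), 0.1 * rng.normal(size=(hidden, hidden)))

independent, joint = bifold_encode(modalities, own, proj, shared)
```

Here `independent` holds one hidden vector per modality and `joint` is a single vector produced by the shared weights; a downstream decoder could consume both, which is roughly what "bifold" suggests.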
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Human Pose and Action Recognition · Traffic and Road Safety
