Recognition and 3D Localization of Pedestrian Actions from Monocular Video

Jun Hayakawa; Behzad Dariush

arXiv:2008.01162·cs.CV·January 25, 2021

TL;DR

This paper presents a novel framework for recognizing pedestrian actions and estimating their 3D location from monocular video, enhancing prediction of pedestrian behavior in urban traffic scenarios.

Contribution

It introduces a two-stream temporal relation network leveraging pose and RGB data for improved action recognition and a new loss-based network for 3D localization from monocular views.
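
To make the two-stream idea concrete, the sketch below combines per-class scores from a pose stream and an RGB stream by weighted averaging of their softmax probabilities (late fusion). This is an illustrative assumption only: the summary above does not say how the paper fuses its streams, and the function names, the `pose_weight` parameter, and the action labels are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(pose_logits, rgb_logits, pose_weight=0.5):
    """Late fusion: weighted average of the two streams' class probabilities.

    Assumption: the paper's actual fusion scheme may differ; this only
    illustrates how two streams can jointly produce one prediction.
    """
    pose_probs = softmax(pose_logits)
    rgb_probs = softmax(rgb_logits)
    return [
        pose_weight * p + (1.0 - pose_weight) * r
        for p, r in zip(pose_probs, rgb_probs)
    ]


# Hypothetical labels and scores, for illustration only.
ACTIONS = ["standing", "walking"]
fused = fuse_streams(pose_logits=[0.2, 1.5], rgb_logits=[1.0, 1.1])
predicted = ACTIONS[fused.index(max(fused))]
```

In this toy input the RGB stream is nearly undecided while the pose stream clearly favors the second class, so the fused prediction follows the pose stream.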

Findings

01

Outperforms single-stream methods in action recognition on the JAAD dataset.

02

Reduces the average localization error on the KITTI dataset.

03

Demonstrates effective qualitative results on the H3D driving dataset.
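
The localization finding above is stated as an average error. A minimal sketch of one way such a metric can be computed follows, assuming the per-pedestrian error is the Euclidean distance between predicted and ground-truth 3D positions in meters. The paper's exact definition is not given in this summary, and the function name and sample coordinates are hypothetical.

```python
import math


def average_localization_error(predicted, ground_truth):
    """Mean Euclidean distance between paired 3D positions.

    Assumption: each position is an (x, y, z) tuple in meters and the two
    lists are aligned so that index i refers to the same pedestrian.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(errors) / len(errors)


# Made-up positions for illustration; not taken from KITTI.
pred = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0)]
truth = [(0.0, 0.0, 11.0), (1.0, 0.0, 18.0)]
ale = average_localization_error(pred, truth)  # errors of 1.0 m and 2.0 m
```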

Abstract

Understanding and predicting pedestrian behavior is an important and challenging area of research for realizing safe and effective navigation strategies in automated and advanced driver assistance technologies in urban scenes. This paper focuses on monocular pedestrian action recognition and 3D localization from an egocentric view for the purpose of predicting intention and forecasting future trajectory. A challenge in addressing this problem in urban traffic scenes is attributed to the unpredictable behavior of pedestrians, whereby actions and intentions are constantly in flux and depend on the pedestrian's pose, their 3D spatial relations, and their interaction with other agents as well as with the environment. To partially address these challenges, we consider the importance of pose toward recognition and 3D localization of pedestrian actions. In particular, we propose an action…
