Learning Road Scene-level Representations via Semantic Region Prediction
Zihao Xiao, Alan Yuille, Yi-Ting Chen

TL;DR
This paper introduces a novel semantic region prediction task to learn scene-level representations that improve driver intent prediction and risk object identification in automated driving systems.
Contribution
It proposes a new semantic region prediction task and an automatic semantic region labeling algorithm to enhance scene understanding for autonomous driving.
Findings
Achieves state-of-the-art results on HDD and nuScenes datasets.
Semantic region representations improve task performance.
Demonstrates the importance of high-level scene semantics in driving tasks.
Abstract
In this work, we tackle two vital tasks in automated driving systems, i.e., driver intent prediction and risk object identification from egocentric images. Mainly, we investigate the question: what would be good road scene-level representations for these two tasks? We contend that a scene-level representation must capture higher-level semantic and geometric representations of traffic scenes around the ego-vehicle as it performs actions toward its destination. To this end, we introduce the representation of semantic regions, which are areas that ego-vehicles visit while taking an afforded action (e.g., left-turn at 4-way intersections). We propose to learn scene-level representations via a novel semantic region prediction task and an automatic semantic region labeling algorithm. Extensive evaluations are conducted on the HDD and nuScenes datasets, and the learned representations lead to…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Advanced Neural Network Applications · Human Pose and Action Recognition
