An Image-based Approach of Task-driven Driving Scene Categorization
Shaochi Hu, Hanwei Fan, Biao Gao, Xijun Zhao, Huijing Zhao

TL;DR
This paper introduces a weakly supervised, task-driven approach for categorizing complex driving scenes from video. It emphasizes the semantic attributes of a scene over object detection, and reports 97.17% accuracy on the training video and 85.44% on new scene videos.
Contribution
It proposes a contrastive learning-based method for scene categorization that relies on human decision points rather than detailed semantic labels, improving scene understanding for autonomous driving.
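As a rough illustration of how contrastive learning can work from decision points alone, the sketch below pairs a generic pairwise contrastive loss with a helper that turns anchor frames into segment ids. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the margin value, and the idea that frames within one segment count as "similar" are all hypothetical choices made for this example.

```python
import math

def segment_ids(num_frames, anchor_frames):
    """Give each frame the index of the segment it falls in.

    Assumption (not from the paper): each anchor frame starts a new segment.
    """
    anchors = sorted(anchor_frames)
    ids, seg, k = [], 0, 0
    for t in range(num_frames):
        # Advance to the next segment each time an anchor frame is reached.
        while k < len(anchors) and t >= anchors[k]:
            seg += 1
            k += 1
        ids.append(seg)
    return ids

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, similar, margin=1.0):
    """Generic pairwise contrastive loss (margin-based form).

    Similar pairs are pulled together; dissimilar pairs are pushed
    apart until they are at least `margin` away from each other.
    """
    d = euclidean(emb_a, emb_b)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Hypothetical usage: 6 frames, anchors marked at frames 2 and 4.
ids = segment_ids(6, [2, 4])
pair_is_similar = ids[0] == ids[1]   # frames 0 and 1 share a segment
loss = contrastive_loss([0.1, 0.2], [0.1, 0.3], pair_is_similar)
```

In this sketch the only supervision is the list of anchor frames, so no frame ever needs a semantic category label; the embeddings would come from whatever image encoder is being trained.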
Findings
Achieved 97.17% accuracy on the training video
Achieved 85.44% accuracy on videos of new scenes
Effectively discriminates scene attributes in cluttered environments
Abstract
Categorizing driving scenes via visual perception is a key technology for safe driving and for the downstream tasks of autonomous vehicles. Traditional methods infer the scene category by detecting scene-related objects or by using a classifier trained on large datasets of finely labeled scene images. However, in cluttered, dynamic scenes such as a campus or park, human activities are not strongly confined by rules, and the functional attributes of places are not strongly correlated with objects. How to define, model, and infer scene categories is therefore crucial to making the technique genuinely helpful in assisting a robot to pass through the scene. This paper proposes a method of task-driven driving scene categorization using weakly supervised data. Given a front-view video of a driving scene, a set of anchor points is marked by following the decision making of a human driver, where an anchor…
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
