CAR-Scenes: Semantic VLM Dataset for Safe Autonomous Driving
Yuankai He, Weisong Shi

TL;DR
CAR-Scenes is a frame-level annotated dataset for autonomous driving, built for training and evaluating vision-language models on interpretable scene understanding. Its annotations cover scene attributes, attribute co-occurrences, and a discrete severity scale for risk assessment.
Contribution
The paper introduces CAR-Scenes, a dataset of 5,192 images annotated through a GPT-4o-assisted pipeline with human-in-the-loop verification. It supports semantic retrieval and scenario analysis for autonomous driving.
Findings
Annotations for 5,192 images drawn from Argoverse 1, Cityscapes, KITTI, and nuScenes
Baseline models evaluated per field with scalar accuracy and F1 metrics
Prompts, post-processing rules, and per-field baseline results released for future research in explainable autonomous systems
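As a rough illustration of the two metric families named above, the sketch below computes exact-match accuracy for single-valued fields and a micro-averaged F1 for multi-label fields. The function names and the choice of micro averaging are assumptions made for this example; the summary does not say how the paper averages F1 or how it defines a "scalar" field.

```python
def scalar_accuracy(preds, golds):
    """Fraction of records whose predicted single value equals the gold value."""
    if not golds:
        return 0.0
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)


def micro_f1(pred_sets, gold_sets):
    """Micro-averaged F1 over multi-label fields (averaging scheme is an assumption)."""
    tp = fp = fn = 0
    for pred, gold in zip(pred_sets, gold_sets):
        pred, gold = set(pred), set(gold)
        tp += len(pred & gold)   # labels predicted and present in gold
        fp += len(pred - gold)   # labels predicted but absent from gold
        fn += len(gold - pred)   # gold labels the model missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical field values, for illustration only.
acc = scalar_accuracy(["rain", "clear"], ["rain", "fog"])
f1 = micro_f1([["car", "bus"], ["pedestrian"]],
              [["car"], ["pedestrian", "cyclist"]])
print(f"accuracy={acc:.2f} micro_f1={f1:.2f}")
```

Micro averaging pools true and false positives across all records before computing F1, so frequent labels dominate the score; a macro average over labels would weight rare attributes more heavily.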
Abstract
CAR-Scenes is a frame-level dataset for autonomous driving that enables training and evaluation of vision-language models (VLMs) for interpretable, scene-level understanding. We annotate 5,192 images drawn from Argoverse 1, Cityscapes, KITTI, and nuScenes using a 28-key category/sub-category knowledge base covering environment, road geometry, background-vehicle behavior, ego-vehicle behavior, vulnerable road users, sensor states, and a discrete severity scale (1-10), totaling 350+ leaf attributes. Labels are produced by a GPT-4o-assisted vision-language pipeline with human-in-the-loop verification; we release the exact prompts, post-processing rules, and per-field baseline model performance. CAR-Scenes also provides attribute co-occurrence graphs and JSONL records that support semantic retrieval, dataset triage, and risk-aware scenario mining across sources. To calibrate task…
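To make the co-occurrence and retrieval idea concrete, here is a minimal sketch that counts attribute pairs across JSONL records and filters records by severity. The record layout (`image`, `attributes`, `severity`) and both function names are hypothetical; the released schema uses a 28-key knowledge base and may look quite different.

```python
import json
from collections import Counter
from itertools import combinations


def cooccurrence_counts(jsonl_lines, field="attributes"):
    """Count how often each unordered attribute pair appears in the same record."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        attrs = sorted(set(record.get(field, [])))  # sort so pairs have a stable order
        counts.update(combinations(attrs, 2))
    return counts


def high_severity(jsonl_lines, min_severity):
    """Return records whose severity is at least min_severity (scale assumed 1-10)."""
    records = (json.loads(l) for l in jsonl_lines if l.strip())
    return [r for r in records if r.get("severity", 0) >= min_severity]


# Hypothetical records, for illustration only.
sample = [
    '{"image": "a.png", "attributes": ["rain", "night", "pedestrian"], "severity": 6}',
    '{"image": "b.png", "attributes": ["rain", "night"], "severity": 4}',
    '{"image": "c.png", "attributes": ["clear", "pedestrian"], "severity": 2}',
]

pairs = cooccurrence_counts(sample)
risky = high_severity(sample, min_severity=5)
print(pairs.most_common(1), [r["image"] for r in risky])
```

The pair counts can serve as edge weights in a co-occurrence graph, with attributes as nodes; the severity filter is the simplest form of risk-aware scenario mining.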
Taxonomy
Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
