RoadSceneBench: A Lightweight Benchmark for Mid-Level Road Scene Understanding
Xiyan Liu, Han Wang, Yuhu Wang, Junjie Cai, Zhe Cao, Jianzhong Yang, Zhen Lu

TL;DR
RoadSceneBench is a lightweight benchmark designed to evaluate and improve visual reasoning about road scenes, emphasizing relational understanding and structural consistency for autonomous driving applications.
Contribution
The paper introduces RoadSceneBench, a novel benchmark focusing on mid-level road scene reasoning, and proposes HRRP-T, a training framework enhancing spatial and semantic coherence in vision-language models.
Findings
HRRP-T achieves state-of-the-art performance on diverse road configurations
HRRP-T promotes geometry-aware and temporally consistent reasoning
RoadSceneBench provides a compact, information-rich dataset for mid-level semantics
Abstract
Understanding mid-level road semantics, which capture the structural and contextual cues that link low-level perception to high-level planning, is essential for reliable autonomous driving and digital map construction. However, existing benchmarks primarily target perception tasks such as detection or segmentation, overlooking the reasoning capabilities required to infer road topology and dynamic scene structure. To address this gap, we present RoadSceneBench, a lightweight yet information-rich benchmark designed to evaluate and advance visual reasoning in complex road environments. Unlike large-scale perception datasets, RoadSceneBench emphasizes relational understanding and structural consistency, encouraging models to capture the underlying logic of real-world road scenes. Furthermore, to enhance reasoning reliability, we propose Hierarchical Relational Reward Propagation with…
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Advanced Neural Network Applications · Automated Road and Building Extraction
