SG-CADVLM: A Context-Aware Decoding Powered Vision Language Model for Safety-Critical Scenario Generation

Hongyi Zhao; Shuo Wang; Qijie He; Ziyuan Pu

arXiv:2601.18442·cs.RO·May 19, 2026

TL;DR

SG-CADVLM is a framework that combines context-aware decoding with multimodal inputs to generate high-fidelity safety-critical scenarios from crash reports for autonomous vehicle testing, producing critical scenarios far more often than baseline methods (88.1% versus 31.2%).

Contribution

The paper introduces SG-CADVLM, a framework that effectively integrates context-aware decoding with multimodal processing to improve safety-critical scenario generation from crash reports.

Findings

01. Achieves an 88.1% rate of generating critical scenarios, compared with 31.2% for baselines.

02. Mitigates hallucination in vision-language models during scenario generation.

03. Produces executable simulations for autonomous vehicle safety validation.

Abstract

Autonomous vehicles (AVs) require rigorous testing in safety-critical scenarios, yet their validation is hindered by the high cost of field testing and the lack of fidelity in current simulations of rare safety-critical events. Crash reports offer rich and authentic specifications of real-world accident dynamics, making them a promising resource for Large Language Models (LLMs) and Vision-Language Models (VLMs) to generate high-fidelity scenarios. However, existing models frequently deviate from actual accident characteristics due to context suppression. To address these limitations, this paper presents SG-CADVLM, a framework integrating Context-Aware Decoding with multimodal input processing to generate safety-critical scenarios from crash reports. The framework mitigates the hallucination of VLMs while generating road geometry and vehicle trajectories simultaneously. The…
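
The abstract names Context-Aware Decoding but the visible text does not give its formulation. As a rough illustration only, the sketch below uses the contrastive form commonly described for context-aware decoding, in which next-token logits computed with the context are amplified relative to logits computed without it. The function names, the toy vocabulary, the logit values, and the `alpha` parameter are all assumptions made for this example and may differ from what SG-CADVLM actually does.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def context_aware_logits(with_context, without_context, alpha=0.5):
    """Contrast logits conditioned on the context against context-free logits.

    Assumed form: (1 + alpha) * logit(y | context, prompt) - alpha * logit(y | prompt).
    alpha = 0 recovers ordinary decoding; larger alpha trusts the context more.
    """
    if len(with_context) != len(without_context):
        raise ValueError("logit vectors must have the same length")
    return [(1 + alpha) * c - alpha * u
            for c, u in zip(with_context, without_context)]


# Hypothetical three-token vocabulary and made-up logits (not from the paper).
vocab = ["left_turn", "straight", "right_turn"]
with_ctx = [1.4, 1.5, 0.1]     # model sees the crash report
without_ctx = [0.2, 2.5, 0.1]  # model relies on its prior only

adjusted = context_aware_logits(with_ctx, without_ctx, alpha=1.0)
probs = softmax(adjusted)
best = vocab[probs.index(max(probs))]
print(best)
```

In this toy setup, plain decoding on the context-conditioned logits would pick `straight`, because the model's prior still dominates. The contrastive adjustment subtracts that prior and shifts the choice to the token the context supports, which is the kind of context suppression the abstract describes.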

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning