CARE Drive: A Framework for Evaluating Reason-Responsiveness of Vision Language Models in Automated Driving

Lucas Elbert Suryana; Farah Bierenga; Sanne van Buuren; Pepijn Kooij; Elsefien Tulleners; Federico Scari; Simeon Calvert; Bart van Arem; Arkady Zgonnikov

arXiv:2602.15645·cs.AI·February 18, 2026

Open Access

TL;DR

This paper introduces CARE Drive, a framework for evaluating whether vision language models in automated driving actually incorporate human reasons into their decisions, rather than evaluating outcome accuracy alone.

Contribution

It presents a novel, model-agnostic evaluation framework that assesses reason-responsiveness by applying controlled contextual variations to automated driving scenarios.
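As a rough illustration of what a model-agnostic check of this kind might look like, the sketch below treats the model as a black-box function and measures how often its decision changes when explicit human reasons are supplied, across a set of varied contexts. Everything in it is an assumption made for illustration: the names `decision_shift_rate` and `toy_model`, the `gap_s` context field, the reason labels, and the shift-rate measure itself are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical types: a context is a dict of scenario factors, and a model is
# any callable that maps (context, optional reasons) to an action label.
Context = Dict[str, float]
Decide = Callable[[Context, Optional[Sequence[str]]], str]


def decision_shift_rate(
    decide: Decide,
    contexts: Sequence[Context],
    reasons: Sequence[str],
) -> float:
    """Fraction of contexts in which supplying reasons changes the decision.

    The model is only called, never modified, so the check is black-box.
    """
    if not contexts:
        raise ValueError("at least one context is required")
    changed = sum(
        decide(ctx, None) != decide(ctx, reasons) for ctx in contexts
    )
    return changed / len(contexts)


def toy_model(context: Context, reasons: Optional[Sequence[str]]) -> str:
    """Hypothetical stand-in for a vision language model (assumption).

    It overtakes only when told that efficiency matters and the gap to
    oncoming traffic (in seconds) is large enough.
    """
    if reasons and "efficiency" in reasons and context["gap_s"] >= 6.0:
        return "overtake"
    return "stay_behind"


# Controlled contextual variation: only the (hypothetical) gap changes.
contexts = [{"gap_s": g} for g in (2.0, 4.0, 6.0, 8.0)]
rate = decision_shift_rate(toy_model, contexts, ["safety", "efficiency"])
print(rate)
```

Because the function only calls the model and compares outputs, the same check could in principle be pointed at any decision-making model; a real evaluation would also need to control for sampling randomness in the model's outputs.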

Findings

01 Explicit human reasons influence model decisions

02 Responsiveness varies across different contextual factors

03 Framework can evaluate reason responsiveness without model modification

Abstract

Foundation models, including vision language models, are increasingly used in automated driving to interpret scenes, recommend actions, and generate natural language explanations. However, existing evaluation methods primarily assess outcome-based performance, such as safety and trajectory accuracy, without determining whether model decisions reflect human-relevant considerations. As a result, it remains unclear whether explanations produced by such models correspond to genuine reason-responsive decision making or are merely post hoc rationalizations. This limitation is especially significant in safety-critical domains because it can create false confidence. To address this gap, we propose CARE Drive (Context-Aware Reasons Evaluation for Driving), a model-agnostic framework for evaluating reason-responsiveness in vision language models applied to automated driving. CARE Drive compares…

Taxonomy

Topics: Human-Automation Interaction and Safety · Multimodal Machine Learning Applications · Autonomous Vehicle Technology and Safety