DRIV-EX: Counterfactual Explanations for Driving LLMs

Amaia Cardiel; Eloi Zablocki; Elias Ramzi; Eric Gaussier

arXiv:2603.00696 · cs.CL · April 23, 2026

TL;DR

DRIV-EX introduces a gradient-based method for generating fluent, valid counterfactual explanations of LLM decisions in autonomous driving, revealing biases and enhancing interpretability.

Contribution

DRIV-EX combines gradient-based optimization with controlled decoding to produce coherent counterfactual scene descriptions for driving LLMs.

Findings

01 DRIV-EX generates more reliable counterfactuals than existing methods.

02 It exposes latent biases in LLM-based driving models.

03 The approach improves interpretability and robustness of autonomous driving decisions.

Abstract

Large language models (LLMs) are increasingly used as reasoning engines in autonomous driving, yet their decision-making remains opaque. We propose to study their decision process through counterfactual explanations, which identify the minimal semantic changes to a scene description required to alter a driving plan. We introduce DRIV-EX, a method that leverages gradient-based optimization on continuous embeddings to identify the input shifts required to flip the model's decision. Crucially, to avoid the incoherent text typical of unconstrained continuous optimization, DRIV-EX uses these optimized embeddings solely as a semantic guide: they are used to bias a controlled decoding process that re-generates the original scene description. This approach effectively steers the generation toward the counterfactual target while guaranteeing the linguistic fluency, domain validity, and…
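The two stages described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the three-word vocabulary, the 2-D embeddings, the linear planner weights `W`, and the helper names `optimize_embedding` and `guided_decode` are all hypothetical, chosen only to show the idea of optimizing a continuous embedding toward a flipped decision and then using it to bias a decoder that can only emit real tokens.

```python
import math

# Hypothetical toy vocabulary with hand-picked 2-D embeddings.
VOCAB = {
    "green": (1.0, 0.2),
    "yellow": (0.1, 0.3),
    "red": (-1.0, 0.1),
}
# Hypothetical linear "planner": score > 0 means "go", otherwise "stop".
W = (1.5, 0.0)


def plan_score(emb):
    """Linear decision score of the toy planner for one embedding."""
    return W[0] * emb[0] + W[1] * emb[1]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def optimize_embedding(start, target_go, steps=200, lr=0.1):
    """Stage 1: gradient descent on the continuous embedding toward the target decision."""
    e = list(start)
    y = 1.0 if target_go else 0.0
    for _ in range(steps):
        p = sigmoid(plan_score(e))
        # Gradient of binary cross-entropy w.r.t. the embedding is (p - y) * W.
        e = [e[i] - lr * (p - y) * W[i] for i in range(2)]
    return tuple(e)


def guided_decode(base_logprobs, guide, strength=1.0):
    """Stage 2: pick the real token maximizing LM log-prob plus a bias toward the guide."""
    def score(tok):
        v = VOCAB[tok]
        dist2 = sum((v[i] - guide[i]) ** 2 for i in range(2))
        return base_logprobs[tok] - strength * dist2
    return max(base_logprobs, key=score)


# Assumed language-model preferences for the traffic-light slot of the scene description.
base = {"green": math.log(0.6), "yellow": math.log(0.3), "red": math.log(0.1)}

# Optimize away from the original "green" embedding until the planner would stop.
guide = optimize_embedding(VOCAB["green"], target_go=False)

# The optimized embedding is off-vocabulary; decoding maps it back to a valid word.
counterfactual_token = guided_decode(base, guide)
```

In this sketch the optimized embedding never appears in the output text. It only shifts the decoder's choice among valid vocabulary entries, which is the property the abstract attributes to using the embeddings "solely as a semantic guide".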

Code & Models

Repositories

Amaia-CARDIEL/DRIV_EX (GitHub)
