Operating Within the Operational Design Domain: Zero-Shot Perception with Vision-Language Models
Berkehan Ünal, Hauke Dierend, Dren Fazlija, Christopher Plachetka

TL;DR
This paper evaluates vision-language models (VLMs) as zero-shot perception tools for defining and detecting Operational Design Domain (ODD) elements in autonomous driving, with an emphasis on safety and adaptability.
Contribution
It provides an empirical evaluation of VLMs for zero-shot ODD classification, introduces optimization strategies, and offers prompting templates for adaptable perception.
Findings
Chain-of-thought prompting combined with personas yields the best performance.
Other prompting methods may reduce recall.
Results support transparent ODD perception in safety-critical systems.
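To make the findings above concrete, the following is a minimal sketch of how a persona plus chain-of-thought prompt for a single ODD element could be assembled, and how recall could be computed from a model's yes/no answers. Everything in it is an assumption for illustration: the persona wording, the element names, and the helpers `build_prompt`, `parse_answer`, and `recall` are hypothetical and are not the templates or code from the paper. No VLM is called; the sketch only covers prompt construction and scoring.

```python
# Hypothetical ODD elements for illustration; not the taxonomy used in the paper.
ODD_ELEMENTS = ["rain", "night", "tunnel"]


def build_prompt(element: str, persona: bool = True, chain_of_thought: bool = True) -> str:
    """Assemble a text prompt asking whether one ODD element is visible in an image."""
    parts = []
    if persona:
        # Assumed persona wording, not taken from the paper.
        parts.append("You are a safety engineer auditing an automated driving system.")
    parts.append(f"Question: Is the ODD element '{element}' present in the image?")
    if chain_of_thought:
        # Ask for visible evidence first, so the reasoning can be inspected later.
        parts.append("First describe the relevant visual evidence step by step.")
    parts.append("Finish with a final line 'Answer: yes' or 'Answer: no'.")
    return "\n".join(parts)


def parse_answer(response: str) -> bool:
    """Read the last 'Answer:' line; a response without one is treated as 'no'."""
    for line in reversed(response.strip().splitlines()):
        lowered = line.strip().lower()
        if lowered.startswith("answer:"):
            return lowered.split(":", 1)[1].strip().startswith("yes")
    return False


def recall(predictions: list[bool], labels: list[bool]) -> float:
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for p, l in zip(predictions, labels) if p and l)
    fn = sum(1 for p, l in zip(predictions, labels) if not p and l)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

Under these assumptions, `build_prompt("rain")` returns a four-line prompt, and setting `persona=False` or `chain_of_thought=False` drops the matching line, which is one simple way to compare prompting variants on the same images. Keeping the free-text reasoning next to the parsed answer is what makes such a setup auditable.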
Abstract
Over the last few years, research on autonomous systems has matured to such a degree that the field is increasingly well-positioned to translate research into practical, stakeholder-driven use cases across well-defined domains. However, for a wide-scale practical adoption of autonomous systems, adherence to safety regulations is crucial. Many regulations are influenced by the Operational Design Domain (ODD), which defines the specific conditions in which an autonomous agent can function. This is especially relevant for Automated Driving Systems (ADS), as a dependable perception of ODD elements is essential for safe implementation and auditing. Vision-language models (VLMs) integrate visual recognition and language reasoning, functioning without task-specific training data, which makes them suitable for adaptable ODD perception. To assess whether VLMs can function as zero-shot "ODD…
