Open-Vocabulary Domain Generalization in Urban-Scene Segmentation

Dong Zhao; Qi Zang; Nan Pu; Wenjing Li; Nicu Sebe; Zhun Zhong

arXiv:2602.18853 · cs.CV · March 10, 2026

TL;DR

This paper introduces a new benchmark and method for open-vocabulary domain generalization in urban-scene segmentation, addressing the challenge of recognizing unseen categories across diverse unseen environments.

Contribution

The paper proposes the first benchmark for Open-Vocabulary Domain Generalization in Semantic Segmentation (OVDG-SS) in autonomous driving and introduces S2-Corr, a correlation refinement mechanism designed to improve robustness across domains.

Findings

01

S2-Corr improves cross-domain segmentation accuracy.

02

The benchmark covers synthetic-to-real and real-to-real generalization.

03

The proposed method outperforms existing approaches in efficiency and accuracy.

Abstract

Domain Generalization in Semantic Segmentation (DG-SS) aims to enable segmentation models to perform robustly in unseen environments. However, conventional DG-SS methods are restricted to a fixed set of known categories, limiting their applicability in open-world scenarios. Recent progress in Vision-Language Models (VLMs) has advanced Open-Vocabulary Semantic Segmentation (OV-SS) by enabling models to recognize a broader range of concepts. Yet, these models remain sensitive to domain shifts and struggle to maintain robustness when deployed in unseen environments, a challenge that is particularly severe in urban-driving scenarios. To bridge this gap, we introduce Open-Vocabulary Domain Generalization in Semantic Segmentation (OVDG-SS), a new setting that jointly addresses unseen domains and unseen categories. We introduce the first benchmark for OVDG-SS in autonomous driving, addressing…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications