Causal-Tune: Mining Causal Factors from Vision Foundation Models for Domain Generalized Semantic Segmentation

Yin Zhang; Yongqiang Zhang; Yaoyue Zheng; Bogdan Raducanu; Dan Liu

arXiv:2512.16567 · cs.CV · December 19, 2025

TL;DR

Causal-Tune is a fine-tuning approach for vision foundation models that disentangles causal from non-causal factors in their features, improving domain generalization in semantic segmentation, especially under adverse weather conditions.

Contribution

The paper introduces a frequency-domain disentanglement method that uses the discrete cosine transform (DCT) and causal-aware tokens to improve the domain robustness of vision foundation models.

Findings

01. Achieves a +4.8% mIoU improvement in snow conditions

02. Effectively separates causal and non-causal spectral components

03. Enhances robustness of semantic segmentation across domains

Abstract

Fine-tuning Vision Foundation Models (VFMs) with a small number of parameters has shown remarkable performance in Domain Generalized Semantic Segmentation (DGSS). Most existing works either train lightweight adapters or refine intermediate features to achieve better generalization on unseen domains. However, they both overlook the fact that long-term pre-trained VFMs often exhibit artifacts, which hinder the utilization of valuable representations and ultimately degrade DGSS performance. Inspired by causal mechanisms, we observe that these artifacts are associated with non-causal factors, which usually reside in the low- and high-frequency components of the VFM spectrum. In this paper, we explicitly examine the causal and non-causal factors of features within VFMs for DGSS, and propose a simple yet effective method to identify and disentangle them, enabling more robust domain…
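The frequency-domain idea in the abstract can be illustrated with a minimal sketch. It assumes a single 1-D feature vector, an orthonormal DCT-II, and hand-picked cutoff indices (`low_cut`, `high_cut`); all three are illustrative assumptions and not taken from the paper. Treating the mid band as the "causal" part follows only from the abstract's statement that non-causal factors usually reside in the low- and high-frequency components. The paper's actual axis, thresholds, and causal-aware tokens are not shown here.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(c):
    """Inverse of dct2 (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def split_bands(x, low_cut, high_cut):
    """Split x into a mid-band part and the remaining low+high part.

    Coefficients with index in [low_cut, high_cut) form the mid band
    (assumed "causal" in this sketch); all other coefficients form the rest.
    Both parts are returned in the original (non-frequency) domain.
    """
    coeffs = dct2(x)
    mid = [c if low_cut <= k < high_cut else 0.0 for k, c in enumerate(coeffs)]
    rest = [c - m for c, m in zip(coeffs, mid)]
    return idct2(mid), idct2(rest)


# Hypothetical 8-dimensional feature vector and cutoffs.
feature = [0.5, 1.0, -0.3, 0.8, 0.1, -0.6, 0.9, 0.2]
mid_part, rest_part = split_bands(feature, low_cut=1, high_cut=6)

# The two parts add back up to the original feature.
recon = [m + r for m, r in zip(mid_part, rest_part)]
```

Because the transform is linear and orthonormal, the split is lossless: the two parts sum to the input, so a method built on this kind of decomposition can suppress or reweight one band without touching the other.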

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications