Learning domain-invariant features through channel-level sparsification for Out-Of Distribution Generalization
Haoran Pei, Yuguang Yang, Kexin Liu, Juan Zhang, Baochang Zhang

TL;DR
This paper introduces Hierarchical Causal Dropout (HCD), a novel method that enforces feature sparsity at the channel level to improve out-of-distribution generalization by isolating causal features from spurious correlations.
Contribution
The paper proposes HCD, which uses channel-level causal masks and a mutual information objective to better separate causal from non-causal features for OOD robustness.
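As a rough illustration of what a channel-level mask does, the sketch below zeroes out entire feature channels based on a per-channel score. It is not the authors' implementation: the names `channel_mask`, `apply_channel_mask`, and `keep_ratio` are hypothetical, the hard top-k selection rule is an assumption made for this example, and the mutual information objective that guides training in the paper is not modeled here.

```python
def channel_mask(scores, keep_ratio=0.5):
    """Build a binary mask over channels (1.0 = keep, 0.0 = drop).

    Assumption for this sketch: the channels with the highest scores are
    kept. How HCD actually learns or selects its masks is not shown here.
    """
    k = max(1, int(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    keep = set(ranked[:k])
    return [1.0 if c in keep else 0.0 for c in range(len(scores))]


def apply_channel_mask(features, mask):
    """Zero every activation of a dropped channel.

    `features` is a list of C channels, each a flat list of activations.
    """
    return [
        [v if m else 0.0 for v in channel]
        for channel, m in zip(features, mask)
    ]


# Toy feature map with 4 channels and 2 activations per channel.
features = [[0.2, 0.4], [1.0, -1.0], [0.5, 0.5], [3.0, 2.0]]
# Hypothetical per-channel scores; in practice these would be learned.
scores = [2.0, -1.0, 0.5, -3.0]

mask = channel_mask(scores, keep_ratio=0.5)
masked = apply_channel_mask(features, mask)
```

With these toy scores, channels 0 and 2 are kept and channels 1 and 3 are zeroed in full, so sparsity is imposed on whole channels rather than on individual activations.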
Findings
HCD outperforms existing methods on OOD benchmarks.
The approach effectively isolates causal features from spurious ones.
Experimental results demonstrate improved generalization across diverse datasets.
Abstract
Out-of-Distribution (OOD) generalization has become a primary metric for evaluating image analysis systems. Since deep learning models tend to capture domain-specific context, they often develop shortcut dependencies on these non-causal features, leading to inconsistent performance across different data sources. Current techniques, such as invariance learning, attempt to mitigate this. However, they struggle to isolate highly mixed features within deep latent spaces. This limitation prevents them from fully resolving the shortcut learning problem. In this paper, we propose Hierarchical Causal Dropout (HCD), a method that uses channel-level causal masks to enforce feature sparsity. This approach allows the model to separate causal features from spurious ones, effectively performing a causal intervention at the representation level. The training is guided by a Matrix-based Mutual…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications
