On Causality in Domain Adaptation and Semi-Supervised Learning: an Information-Theoretic Analysis for Parametric Models
Xuetong Wu, Mingming Gong, Jonathan H. Manton, Uwe Aickelin, and Jingge Zhu

TL;DR
This paper provides an information-theoretic analysis of how causality affects generalization performance in unsupervised domain adaptation (UDA) and semi-supervised learning (SSL), showing that the sample complexity rate depends on the causal direction.
Contribution
It introduces a formal theoretical framework distinguishing causal and anti-causal learning, explaining their different sample complexity dependencies in UDA and SSL.
Findings
In causal learning, the excess risk depends on the source sample size m at rate O(1/m).
In anti-causal learning, the unlabelled data dominate performance at rate O(1/n), where n is the number of unlabelled samples.
The results clarify the relationship between data size, causality, and learning difficulty.
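As a rough illustration of what an O(1/m) excess-risk rate looks like for a parametric model, the sketch below fits the mean of a unit-variance Gaussian by maximum likelihood and measures the excess log-loss. This toy model is an assumption chosen for simplicity; it is not the paper's UDA/SSL setting, and the function name `mean_excess_risk` is hypothetical. Here the excess risk equals (mu_hat - mu)^2 / 2, whose expectation is 1/(2m), so m times the average excess risk should stay close to 0.5.

```python
import random


def mean_excess_risk(m, trials=1000, mu=0.0, seed=0):
    """Average excess log-loss of the Gaussian-mean MLE fitted on m samples.

    Toy assumption: data are N(mu, 1) and the learner predicts with
    N(mu_hat, 1), where mu_hat is the sample mean. The excess expected
    log-loss is then the KL divergence (mu_hat - mu)**2 / 2.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Maximum-likelihood estimate of the mean from m i.i.d. samples.
        mu_hat = sum(rng.gauss(mu, 1.0) for _ in range(m)) / m
        total += 0.5 * (mu_hat - mu) ** 2
    return total / trials


if __name__ == "__main__":
    for m in (10, 100, 1000):
        risk = mean_excess_risk(m)
        # If the risk decays as 1/m, the last column stays roughly constant.
        print(m, risk, m * risk)
```

The point of the last column is only that the product m * risk does not drift as m grows, which is the signature of a 1/m rate. The paper's bounds concern a richer setting in which the roles of m and n depend on the causal direction.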
Abstract
Recent advancements in unsupervised domain adaptation (UDA) and semi-supervised learning (SSL), particularly incorporating causality, have led to significant methodological improvements in these learning problems. However, a formal theory that explains the role of causality in the generalization performance of UDA/SSL is still lacking. In this paper, we consider the UDA/SSL scenarios where we access labelled source data and unlabelled target data as training instances under different causal settings with a parametric probabilistic model. We study the learning performance (e.g., excess risk) of prediction in the target domain from an information-theoretic perspective. Specifically, we distinguish two scenarios: the learning problem is called causal learning if the feature is the cause and the label is the effect, and is called anti-causal learning otherwise. We show that in…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
