Continuous Invariance Learning
Yong Lin, Fan Zhou, Lu Tan, Lintao Ma, Jiameng Liu, Yansu He, Yuan Yuan, Yu Liu, James Zhang, Yujiu Yang, Hao Wang

TL;DR
This paper introduces Continuous Invariance Learning (CIL), a novel adversarial approach that effectively learns invariant features across continuous domains, addressing limitations of existing methods that discretize such domains.
Contribution
The paper proposes CIL, a new adversarial method for continuous invariance learning, with theoretical analysis and empirical validation showing its advantages over existing techniques.
Findings
CIL outperforms baseline methods on synthetic datasets.
CIL achieves better generalization on real-world production data.
Theoretical analysis confirms CIL's superiority over existing invariance methods.
Abstract
Invariance learning methods aim to learn invariant features in the hope that they generalize under distributional shifts. Although many tasks are naturally characterized by continuous domains, current invariance learning techniques generally assume categorically indexed domains. For example, auto-scaling in cloud computing often needs a CPU utilization prediction model that generalizes across different times (e.g., time of day and date of the year), where 'time' is a continuous domain index. In this paper, we start by theoretically showing that existing invariance learning methods can fail for continuous domain problems. Specifically, the naive solution of splitting continuous domains into discrete ones ignores the underlying relationship among domains, and therefore potentially leads to suboptimal performance. To address this challenge, we then propose Continuous Invariance Learning…
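The point about splitting a continuous domain into discrete ones can be illustrated with a small sketch. It is not taken from the paper: the helper `discretize`, the 24-hour range, and the choice of four bins are all assumptions made for illustration. It shows that binning a continuous time index can place two nearly identical times in different domains while grouping two distant times together, which is one way the relationship among domains gets lost.

```python
def discretize(t, num_bins, t_min=0.0, t_max=24.0):
    """Map a continuous domain index t to an integer bin (hypothetical helper)."""
    width = (t_max - t_min) / num_bins
    idx = int((t - t_min) // width)
    # Clamp so values at or beyond the range edges fall in the first or last bin.
    return min(max(idx, 0), num_bins - 1)


# Time of day in hours, used here as the continuous domain index.
hours = [5.9, 6.1, 11.9]

# Four bins of six hours each: [0, 6), [6, 12), [12, 18), [18, 24).
bins = [discretize(h, num_bins=4) for h in hours]

for h, b in zip(hours, bins):
    print(f"t = {h:5.1f} h -> discrete domain {b}")

# 5.9 h and 6.1 h are 0.2 h apart but land in different domains,
# while 6.1 h and 11.9 h are 5.8 h apart and share a domain.
```

Once the index is binned, a method that only sees the bin label treats 5.9 h and 6.1 h as unrelated domains, and no choice of bin boundaries avoids this for every pair of nearby points.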
Peer Reviews
Decision·ICLR 2024 poster
Investigating invariant learning for continuous domains is an interesting idea. This paper presents some innovative contributions and extends invariance learning from discrete domains to continuous domains, which has reference value for other studies. As a theoretically rooted work, it first proves through theoretical derivation that existing methods fail in the continuous domain, and then proves the effectiveness of its own method. Also, the empirical studies verify the
Although the contributions of this work are worth noting, it still has some limitations in terms of problem definition and presentation. First is the problem setting: in my view, when we talk about *continuous* in machine learning, it usually refers to time-series issues, i.e., the continual learning or lifelong learning framework. However, in this work the notion of *continuous* seems to be related to *many* domains. It’s more like we have several intermediate domains between two discrete
- The arguments in the paper flow well, and the problem formulation is interesting. This paper could be a benchmark (for datasets & settings) for future work exploring generalization over continuous domains.
- The authors provide theoretical evidence to explain the failure of existing methods. This complements the empirical results that demonstrate the same.
- The authors demonstrate the superior performance of their method across various real-world and toy datasets
- In almost all the datasets considered in the paper, the ground truth labels are independent of domains. However, it is possible to have domains where these are correlated, with different amounts of correlation in different domains. Why was such a toy environment not considered? It would be interesting to see how the proposed method performs in more correlated settings.
- The proposed approach relies on classes being discrete, whereas the prior method relies on the domain being discrete. This li
1. The motivation of extending invariant representation learning from discrete environments to continuous environments is nice and useful.
2. Applications on Alipay and Wilds-time demonstrate the practical use of the new method.
Frankly speaking, I really like the motivation of this paper. However, the main strategy (especially the way to measure conditional independence) does not seem novel to me; it has been used in previous extensions of IRM, such as InvRat (Chang et al., 2020) and IIBNet (Li et al., 2022). Moreover, the proof of Proposition 2 seems to rely largely on the results of (Li & Liu, 2021), including their assumptions.
1. The estimation on the degree of independence between y and t given $\Phi(X)$ does not
Taxonomy
Topics: Text and Document Classification Technologies
