Counterfactual Adversarial Learning with Representation Interpolation
Wei Wang, Boxin Wang, Ning Shi, Jinfeng Li, Bingyu Zhu, Xiangyu Liu, Rong Zhang

TL;DR
This paper introduces a causality-inspired adversarial training framework that generates counterfactual representations to improve model robustness and performance across various NLP tasks.
Contribution
It proposes the Counterfactual Adversarial Training (CAT) framework that uses latent space interpolation and counterfactual risk minimization to enhance causal learning in deep models.
Findings
CAT outperforms state-of-the-art methods on multiple NLP tasks
Counterfactual representations improve model robustness
Dynamic sample-wise loss weighting enhances causal effect exploration
Abstract
Deep learning models exhibit a preference for statistical fitting over logical reasoning. Spurious correlations might be memorized when there is statistical bias in the training data, which severely limits model performance, especially in small-data scenarios. In this work, we introduce the Counterfactual Adversarial Training (CAT) framework to tackle the problem from a causality perspective. In particular, for a specific sample, CAT first generates a counterfactual representation through latent space interpolation in an adversarial manner, and then performs Counterfactual Risk Minimization (CRM) on each original-counterfactual pair to adjust the sample-wise loss weight dynamically, which encourages the model to explore the true causal effect. Extensive experiments demonstrate that CAT achieves substantial performance improvement over SOTA across different downstream tasks, including sentence…
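The two steps named in the abstract (adversarial latent space interpolation, then sample-wise reweighting of each original-counterfactual pair) can be illustrated with a toy sketch. This is not the authors' implementation: the linear classifier `W`, the grid search over `lambdas`, the choice of partner sample, and the softmax-of-loss-gap weighting in `pair_weights` are all assumptions made here for illustration, and every function name is hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(h, W):
    # W holds one weight row per class (a toy linear classifier, assumed here).
    return [sum(w_k * h_k for w_k, h_k in zip(row, h)) for row in W]

def cross_entropy(h, W, label):
    return -math.log(softmax(logits(h, W))[label])

def interpolate(h, h_other, lam):
    # Latent space interpolation between two representations.
    return [(1.0 - lam) * a + lam * b for a, b in zip(h, h_other)]

def adversarial_interpolation(h, h_other, label, W, lambdas):
    # "Adversarial" is approximated by a grid search: keep the coefficient
    # that maximizes the loss with respect to the original label.
    best = max(lambdas,
               key=lambda lam: cross_entropy(interpolate(h, h_other, lam), W, label))
    h_cf = interpolate(h, h_other, best)
    return h_cf, best, cross_entropy(h_cf, W, label)

def pair_weights(orig_losses, cf_losses):
    # Hypothetical weighting (an assumption, not the paper's CRM objective):
    # pairs whose counterfactual raises the loss more get more weight.
    return softmax([c - o for o, c in zip(orig_losses, cf_losses)])

W = [[1.0, -1.0], [-1.0, 1.0]]
H = [[2.0, 0.0], [0.0, 2.0], [1.5, 0.5]]   # toy latent representations
labels = [0, 1, 0]
lambdas = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]

orig_losses, cf_losses = [], []
for i, (h, y) in enumerate(zip(H, labels)):
    partner = H[(i + 1) % len(H)]  # partner choice is arbitrary in this sketch
    _, lam, cf_loss = adversarial_interpolation(h, partner, y, W, lambdas)
    orig_losses.append(cross_entropy(h, W, y))
    cf_losses.append(cf_loss)

weights = pair_weights(orig_losses, cf_losses)
weighted_loss = sum(w * (o + c) for w, o, c in zip(weights, orig_losses, cf_losses))
print(weights, weighted_loss)
```

In a real model the representations would come from an encoder and the interpolation coefficient would more plausibly be found by gradient ascent; the grid search only keeps the sketch dependency-free.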
Taxonomy
Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
