Counterfactual Variable Control for Robust and Interpretable Question Answering
Sicheng Yu, Yulei Niu, Shuohang Wang, Jing Jiang, Qianru Sun

TL;DR
This paper introduces Counterfactual Variable Control (CVC), a method that improves the robustness and interpretability of question answering models by mitigating shortcut correlations, using causal inference and a multi-branch architecture.
Contribution
The paper proposes CVC, an approach that explicitly mitigates shortcut correlations in QA models. It trains a multi-branch architecture to disentangle robust and shortcut correlations, then applies two CVC inference methods to the trained models.
Findings
CVC significantly improves robustness against adversarial attacks.
CVC maintains high interpretability in QA models.
Experiments on multiple benchmarks validate the effectiveness of CVC.
Abstract
Deep neural network-based question answering (QA) models are neither robust nor explainable in many cases. For example, a multiple-choice QA model tested without the question as input is surprisingly "capable" of predicting most of the correct options. In this paper, we inspect this spurious "capability" of QA models using causal inference. We find that the crux is the shortcut correlation, e.g., non-robust word alignment between passage and options learned by the models. We propose a novel approach called Counterfactual Variable Control (CVC) that explicitly mitigates any shortcut correlation and preserves comprehensive reasoning for robust QA. Specifically, we leverage a multi-branch architecture that allows us to disentangle robust and shortcut correlations during QA training. We then conduct two novel CVC inference methods (on trained models) to capture the effect of…
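The abstract is cut off before it describes the two CVC inference methods, so the following is only a toy illustration of the general idea of discounting a shortcut branch at inference time. The subtraction rule, the `alpha` weight, the function names, and all logit values are assumptions made for illustration; none of them are taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def counterfactual_scores(full_logits, shortcut_logits, alpha=1.0):
    """Subtract the shortcut branch's logits from the full-input branch's logits.

    `alpha` is a hypothetical scaling weight, not a parameter from the paper.
    """
    return [f - alpha * s for f, s in zip(full_logits, shortcut_logits)]


def predict(logits):
    """Return the index of the highest-scoring option."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Made-up logits for one four-option question.
full_logits = [2.0, 2.5, 0.1, 0.3]      # branch that sees passage, question, and options
shortcut_logits = [0.2, 1.5, 0.0, 0.1]  # branch with the question muted

adjusted = counterfactual_scores(full_logits, shortcut_logits)
probs = softmax(adjusted)

print("full-input prediction:", predict(full_logits))
print("adjusted prediction:  ", predict(adjusted))
```

In this toy case the second option scores highly even when the question is muted, so discounting the shortcut branch shifts the prediction to the first option, whose score depends more on the question.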
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Linear Layer · WordPiece · Adam · Softmax · Multi-Head Attention · Layer Normalization · Dense Connections · Dropout · Linear Warmup With Linear Decay
