Counterfactual Variable Control for Robust and Interpretable Question Answering

Sicheng Yu; Yulei Niu; Shuohang Wang; Jing Jiang; Qianru Sun

arXiv:2010.05581 · cs.CL · October 13, 2020 · 6 cites

TL;DR

This paper introduces Counterfactual Variable Control (CVC), a method that makes question answering models more robust and interpretable by using causal inference and a multi-branch architecture to mitigate shortcut correlations.

Contribution

The paper proposes CVC, a new approach that explicitly reduces shortcut correlations in QA models, improving robustness and interpretability with novel inference methods.

Findings

01. CVC significantly improves robustness against adversarial attacks.

02. CVC maintains high interpretability in QA models.

03. Experiments on multiple benchmarks validate the effectiveness of CVC.

Abstract

Deep neural network based question answering (QA) models are neither robust nor explainable in many cases. For example, a multiple-choice QA model, tested without any question input, is surprisingly "capable" of predicting most of the correct options. In this paper, we inspect such spurious "capability" of QA models using causal inference. We find the crux is the shortcut correlation, e.g., unrobust word alignment between passage and options learned by the models. We propose a novel approach called Counterfactual Variable Control (CVC) that explicitly mitigates any shortcut correlation and preserves the comprehensive reasoning for robust QA. Specifically, we leverage a multi-branch architecture that allows us to disentangle robust and shortcut correlations in the training process of QA. We then conduct two novel CVC inference methods (on trained models) to capture the effect of…
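
The question-ablation test mentioned in the abstract can be illustrated with a toy probe. The following is a minimal sketch, not the authors' method or code: `overlap_score`, `predict`, `accuracy`, and the two hand-written examples are all hypothetical names and data introduced here. The stand-in "model" scores each option purely by word overlap with the passage, which is one form of the passage-option word alignment the abstract calls a shortcut.

```python
# Toy probe (illustrative assumption, not the CVC implementation):
# a scorer that relies only on passage/option word overlap, evaluated
# with and without the question.

def overlap_score(context, option):
    """Count option tokens that also appear in the context."""
    context_tokens = set(context.lower().split())
    return sum(tok in context_tokens for tok in option.lower().split())

def predict(passage, question, options, use_question=True):
    """Return the index of the option with the highest overlap score."""
    # When use_question is False the question is blanked out entirely.
    context = passage + " " + (question if use_question else "")
    scores = [overlap_score(context, opt) for opt in options]
    return max(range(len(options)), key=scores.__getitem__)

def accuracy(examples, use_question):
    """Fraction of examples where the predicted index equals the label."""
    hits = sum(
        predict(e["passage"], e["question"], e["options"], use_question) == e["label"]
        for e in examples
    )
    return hits / len(examples)

# Hypothetical hand-written examples, for illustration only.
EXAMPLES = [
    {"passage": "the cat sat on the mat",
     "question": "where did the cat sit",
     "options": ["on the mat", "in the garden"],
     "label": 0},
    {"passage": "tom bought apples at the market",
     "question": "what did tom buy",
     "options": ["pears", "apples"],
     "label": 1},
]

if __name__ == "__main__":
    print("with question:   ", accuracy(EXAMPLES, use_question=True))
    print("without question:", accuracy(EXAMPLES, use_question=False))
```

On this toy data the scorer is equally accurate with and without the question, which is the symptom the abstract describes: the prediction never depended on the question in the first place.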

Code & Models

Repositories

PluviophileYU/CVC-QA
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Linear Layer · WordPiece · Adam · Softmax · Multi-Head Attention · Layer Normalization · Dense Connections · Dropout · Linear Warmup With Linear Decay