Debiasing Stance Detection Models with Counterfactual Reasoning and Adversarial Bias Learning
Jianhua Yuan, Yanyan Zhao, Bing Qin

TL;DR
This paper combines a counterfactual inference framework with adversarial bias learning to mitigate dataset bias in stance detection models, so that they better capture the interaction between texts and targets.
Contribution
It proposes a counterfactual inference approach and an adversarial bias learning technique to disentangle bias features from genuine stance features in the text part.
Findings
Outperforms existing debiasing methods on multiple test sets
Better models the interaction between texts and targets
Reduces reliance on dataset bias in stance detection
Abstract
Stance detection models may tend to rely on dataset bias in the text part as a shortcut and thus fail to sufficiently learn the interaction between the targets and texts. Recent debiasing methods usually treated features learned by small models or big models at earlier steps as bias features and proposed to exclude the branch learning those bias features during inference. However, most of these methods fail to disentangle the "good" stance features and "bad" bias features in the text part. In this paper, we investigate how to mitigate dataset bias in stance detection. Motivated by causal effects, we leverage a novel counterfactual inference framework, which enables us to capture the dataset bias in the text part as the direct causal effect of the text on stances and reduce the dataset bias in the text part by subtracting the direct text effect from the total causal effect. We…
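The subtraction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the total causal effect is represented by logits from a branch that sees both text and target, that the direct text effect is represented by logits from a text-only branch, and that the two are combined by a plain scaled difference. The function names, the `alpha` weight, the label set, and the example numbers are all hypothetical.

```python
import math

# Hypothetical label set, for illustration only.
LABELS = ["against", "favor", "neutral"]


def softmax(scores):
    """Turn a list of logits into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def subtract_direct_effect(total_logits, text_only_logits, alpha=1.0):
    """Remove the assumed direct text effect from the assumed total effect.

    total_logits: scores from a branch that sees text and target (assumption).
    text_only_logits: scores from a branch that sees only the text (assumption).
    alpha: hypothetical weight on the subtracted term.
    """
    return [t - alpha * d for t, d in zip(total_logits, text_only_logits)]


def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up example: the text alone pushes strongly towards "against".
total_logits = [2.0, 1.5, 0.2]
text_only_logits = [1.8, 0.3, 0.1]

biased_label = LABELS[argmax(total_logits)]
adjusted = subtract_direct_effect(total_logits, text_only_logits)
debiased_label = LABELS[argmax(adjusted)]
debiased_probs = softmax(adjusted)
```

In this made-up case the unadjusted scores favor the label that the text-only branch also prefers, while the adjusted scores favor a different label once that text-only preference is subtracted. How the paper actually parameterizes and fuses the two effects is not stated in the truncated abstract.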
Taxonomy
Topics: Topic Modeling · Misinformation and Its Impacts · Adversarial Robustness in Machine Learning
