Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP

Zhijing Jin; Julius von Kügelgen; Jingwei Ni; Tejas Vaidhya; Ayush Kaushal; Mrinmaya Sachan; Bernhard Schölkopf

arXiv:2110.03618 · cs.CL · October 20, 2021

TL;DR

This paper examines how the causal direction of the data collection process affects NLP tasks, and argues that it can explain differences in semi-supervised learning (SSL) and domain adaptation (DA) performance across settings.

Contribution

The paper presents the first analysis of the independent causal mechanisms (ICM) principle in NLP, linking causal direction to empirical NLP outcomes and offering new insights for modeling.

Findings

01 Results align with causal theory predictions

02 Meta-analysis supports causal explanations for SSL and DA differences

03 Provides guidelines for future NLP modeling based on causal insights

Abstract

The principle of independent causal mechanisms (ICM) states that generative processes of real world data consist of independent modules which do not influence or inform each other. While this idea has led to fruitful developments in the field of causal inference, it is not widely-known in the NLP community. In this work, we argue that the causal direction of the data collection process bears nontrivial implications that can explain a number of published NLP findings, such as differences in semi-supervised learning (SSL) and domain adaptation (DA) performance across different settings. We categorize common NLP tasks according to their causal direction and empirically assay the validity of the ICM principle for text data using minimum description length. We conduct an extensive meta-analysis of over 100 published SSL and 30 DA studies, and find that the results are consistent with our…

Code & Models

Repositories

zhijing-jin/icm4nlp
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks