Uncovering Main Causalities for Long-tailed Information Extraction

Guoshun Nan; Jiaqi Zeng; Rui Qiao; Zhijiang Guo; Wei Lu

arXiv:2109.05213 · cs.CL · September 14, 2021 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces CFIE, a causal inference framework for information extraction that mitigates dataset bias and spurious correlations by leveraging structural causal models and counterfactual reasoning, improving robustness across multiple tasks.

Contribution

The paper proposes a novel counterfactual IE framework using a structural causal model to uncover true causal relations and reduce bias in information extraction tasks.

Findings

01. CFIE effectively reduces spurious correlations.

02. Improves robustness of IE models across datasets.

03. Outperforms baseline methods in experiments.

Abstract

Information Extraction (IE) aims to extract structural information from unstructured texts. In practice, long-tailed distributions caused by the selection bias of a dataset may lead to incorrect correlations, also known as spurious correlations, between entities and labels in conventional likelihood models. This motivates us to propose counterfactual IE (CFIE), a novel framework that aims to uncover the main causalities behind data from the perspective of causal inference. Specifically, 1) we first introduce a unified structural causal model (SCM) for various IE tasks, describing the relationships among variables; 2) with our SCM, we then generate counterfactuals based on an explicit language structure to better calculate the direct causal effect during the inference stage; 3) we further propose a novel debiasing approach to yield more robust predictions. Experiments on three IE tasks…
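The idea of computing a direct causal effect at inference time can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scaling factor `alpha`, the toy label set, and the hand-picked scores are all assumptions made for illustration. The sketch follows a generic counterfactual-debiasing recipe, in which label scores from a counterfactual input (here assumed to keep only the bias-carrying part of the evidence) are subtracted from the scores for the real input, so that a label favored mainly by the bias loses its advantage.

```python
from typing import List

# Hypothetical label set for a toy entity-typing task (not from the paper).
LABELS = ["PER", "ORG", "LOC"]


def argmax(scores: List[float]) -> int:
    """Return the index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def direct_effect(factual: List[float],
                  counterfactual: List[float],
                  alpha: float = 1.0) -> List[float]:
    """Subtract counterfactual label scores from factual ones.

    `alpha` is an assumed knob that controls how strongly the
    counterfactual (bias-only) scores are removed.
    """
    return [f - alpha * c for f, c in zip(factual, counterfactual)]


# Made-up scores: the factual input slightly prefers the head class "PER".
factual = [2.0, 1.5, 0.2]
# Made-up scores for the counterfactual input: it already favors "PER",
# which suggests that the preference is largely a dataset bias.
counterfactual = [1.8, 0.3, 0.1]

biased_label = LABELS[argmax(factual)]
debiased_label = LABELS[argmax(direct_effect(factual, counterfactual))]
print(biased_label, debiased_label)
```

With these toy numbers the plain prediction is the head class, while the prediction after subtraction moves to the class that the remaining evidence supports. How the counterfactual input is constructed and how the two score vectors are combined in CFIE is described in the paper itself and may differ from this sketch.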

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

heyyyyyyg/cfie · PyTorch · Official

Videos

No videos yet.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Counterfactuals Explanations