The Causal News Corpus: Annotating Causal Relations in Event Sentences from News
Fiona Anting Tan, Ali Hürriyetoğlu, Tommaso Caselli, Nelleke Oostdijk, Tadashi Nomoto, Hansi Hettiarachchi, Iqra Ameer, Onur Uca, Farhana Ferdousi Liza, Tiancheng Hu

TL;DR
This paper introduces the Causal News Corpus, a new annotated dataset of news sentences for causal relation detection, and demonstrates its effectiveness for training neural models and transfer learning in causal text mining.
Contribution
It proposes a new annotation schema for event causality, creates an annotated corpus of 3,559 event sentences from protest event news, and evaluates neural models and transfer learning approaches for causal relation detection.
Findings
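A neural network built on a pre-trained language model achieved 81.20% F1 on the test set and 83.46% in 5-fold cross-validation for causal relation detection.
CNC transfers to two external corpora, CausalTimeBank (CTB) and Penn Discourse Treebank (PDTB), with up to 64% F1.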
CNC serves as a valuable resource for causal text mining.
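The F1 figures above summarize precision and recall in a single number. As a minimal sketch, assuming the reported score is the F1 of the positive (causal) class in a binary sentence-level task, it could be computed as follows. The function name `binary_f1` and the toy labels are hypothetical and are not taken from the paper or its code.

```python
def binary_f1(gold, pred, positive=1):
    """F1 of the positive class for two equal-length label sequences."""
    pairs = list(zip(gold, pred))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = sentence contains a causal relation, 0 = it does not.
gold = [1, 1, 0, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
score = binary_f1(gold, pred)
print(f"F1 = {score:.4f}")
```

On this toy input there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and so is F1.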
Abstract
Despite the importance of understanding causality, corpora addressing causal relations are limited. There is a discrepancy between existing annotation guidelines of event causality and conventional causality corpora that focus more on linguistics. Many guidelines restrict themselves to include only explicit relations or clause-based arguments. Therefore, we propose an annotation schema for event causality that addresses these concerns. We annotated 3,559 event sentences from protest event news with labels indicating whether they contain causal relations. Our corpus is known as the Causal News Corpus (CNC). A neural network built upon a state-of-the-art pre-trained language model performed well, with an 81.20% F1 score on the test set and 83.46% in 5-fold cross-validation. CNC is transferable across two external corpora: CausalTimeBank (CTB) and Penn Discourse Treebank (PDTB). Leveraging each of…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
