1Cademy @ Causal News Corpus 2022: Leveraging Self-Training in Causality Classification of Socio-Political Event Data
Adam Nik, Ge Zhang, Xingran Chen, Mingyu Li, Jie Fu

TL;DR
This paper presents a self-training approach for causality classification in socio-political event data, demonstrating consistent performance improvements and robustness to data restrictions.
Contribution
It introduces a self-training pipeline using teacher-student classifiers for causality detection, showing its effectiveness in a shared task setting.
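As a rough illustration of the teacher-student idea (a minimal sketch, not the authors' implementation), the snippet below trains a teacher on gold-labeled sentences, lets it self-label extra sentences, and trains a separate student for the final prediction. The `TinyNaiveBayes` class, the `self_train` helper, and the toy sentences are all hypothetical stand-ins for the real models and task data. Training the student on the union of gold and self-labeled data is an assumption.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Toy bag-of-words classifier; a hypothetical stand-in for the real models."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def _predict_one(self, text):
        total = sum(self.class_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(self.class_counts[c] / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[c][word] + 1) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class

    def predict(self, texts):
        return [self._predict_one(t) for t in texts]


def self_train(gold_texts, gold_labels, unlabeled_texts):
    """Teacher labels the unlabeled pool; a separate student is then trained."""
    teacher = TinyNaiveBayes().fit(gold_texts, gold_labels)
    pseudo_labels = teacher.predict(unlabeled_texts)
    # Assumption: the student sees gold data plus the self-labeled data.
    student = TinyNaiveBayes().fit(
        list(gold_texts) + list(unlabeled_texts),
        list(gold_labels) + pseudo_labels,
    )
    return teacher, student, pseudo_labels


# Toy data: 1 = causal sentence, 0 = non-causal sentence.
gold_texts = [
    "protest erupted because of tax",
    "strike caused by layoffs",
    "march held on monday",
    "rally took place downtown",
]
gold_labels = [1, 1, 0, 0]
unlabeled_texts = ["riot erupted because of layoffs", "vigil held downtown"]

teacher, student, pseudo_labels = self_train(gold_texts, gold_labels, unlabeled_texts)
final_prediction = student.predict(["strike because of tax"])
```

In a realistic setting the toy classifier would be replaced by whatever text classifier is being studied; only the teacher, self-label, student ordering is the point here.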
Findings
Self-training improves causality classification performance.
Performance remains stable even when the number of positive or negative self-labeled examples is restricted.
The improvement holds across all models and self-labeled training sets tested.
Abstract
This paper details our participation in the Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) workshop @ EMNLP 2022, where we take part in Subtask 1 of Shared Task 3. We approach the given task of event causality detection by proposing a self-training pipeline that follows a teacher-student classifier method. More specifically, we initially train a teacher model on the true, original task data, and use that teacher model to self-label data to be used in the training of a separate student model for the final task prediction. We test how restricting the number of positive or negative self-labeled examples in the self-training process affects classification performance. Our final results show that using self-training produces a comprehensive performance improvement across all models and self-labeled training sets tested within the task of event…
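One way to picture the restriction experiment mentioned in the abstract is a helper that caps how many positive or negative self-labeled examples reach the student. This is a minimal sketch under stated assumptions: the function name `restrict_self_labeled`, its parameters, and the choice of random subsampling are hypothetical, since the text here does not say how the examples were selected.

```python
import random


def restrict_self_labeled(examples, max_positive=None, max_negative=None, seed=0):
    """Cap the number of positive (1) and negative (0) self-labeled examples.

    `examples` is a list of (text, label) pairs. A cap of None means no limit.
    Assumption: examples over the cap are dropped by random subsampling.
    """
    rng = random.Random(seed)
    positives = [ex for ex in examples if ex[1] == 1]
    negatives = [ex for ex in examples if ex[1] == 0]
    if max_positive is not None and len(positives) > max_positive:
        positives = rng.sample(positives, max_positive)
    if max_negative is not None and len(negatives) > max_negative:
        negatives = rng.sample(negatives, max_negative)
    kept = positives + negatives
    rng.shuffle(kept)  # avoid feeding the student all positives first
    return kept


# Hypothetical self-labeled pool: five positive and five negative examples.
self_labeled = [(f"sentence {i}", i % 2) for i in range(10)]

# Keep at most two positive examples; leave negatives unrestricted.
restricted = restrict_self_labeled(self_labeled, max_positive=2)
num_positive = sum(1 for _, label in restricted if label == 1)
num_negative = sum(1 for _, label in restricted if label == 0)
```

Sweeping `max_positive` and `max_negative` over several values and retraining the student each time would give the kind of comparison the abstract describes.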
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling
Methods: Test
