Semi-Supervised Event Extraction with Paraphrase Clusters
James Ferguson, Colin Lockard, Daniel S. Weld, Hannaneh Hajishirzi

TL;DR
This paper introduces a semi-supervised approach for event extraction that leverages paraphrase clusters across news articles to improve training data and enhance extraction accuracy.
Contribution
It proposes a self-training method using paraphrase clusters for bootstrapping additional training data in event extraction tasks.
Findings
Significant performance improvements for multiple event extractors on the ACE 2005 and TAC-KBP 2015 datasets.
Effective use of paraphrase clusters for training data augmentation.
Enhanced accuracy of event extractors with limited labeled data.
Abstract
Supervised event extraction systems are limited in their accuracy due to the lack of available training data. We present a method for self-training event extraction systems by bootstrapping additional training data. This is done by taking advantage of the occurrence of multiple mentions of the same event instances across newswire articles from multiple sources. If our system can make a high-confidence extraction of some mentions in such a cluster, it can then acquire diverse training examples by adding the other mentions as well. Our experiments show significant performance improvements on multiple event extractors over ACE 2005 and TAC-KBP 2015 datasets.
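The bootstrapping step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `bootstrap_from_clusters`, the `(label, confidence)` extractor interface, the 0.9 threshold, the keyword-based `toy_extract`, and the example sentences are all assumptions made for illustration. The sketch assumes paraphrase clusters are already given as lists of sentences and that one confident extraction is enough to label the whole cluster.

```python
from typing import Callable, List, Tuple

# Hypothetical extractor interface: sentence -> (event label, confidence).
Extractor = Callable[[str], Tuple[str, float]]


def bootstrap_from_clusters(
    clusters: List[List[str]],
    extract: Extractor,
    threshold: float = 0.9,  # assumed cutoff, not a value from the paper
) -> List[Tuple[str, str]]:
    """Return (sentence, label) pairs harvested from paraphrase clusters."""
    new_examples: List[Tuple[str, str]] = []
    for cluster in clusters:
        predictions = [extract(sentence) for sentence in cluster]
        confident = [p for p in predictions if p[1] >= threshold]
        if not confident:
            continue  # no high-confidence extraction: skip this cluster
        # Use the most confident label for every mention in the cluster.
        label = max(confident, key=lambda p: p[1])[0]
        new_examples.extend((sentence, label) for sentence in cluster)
    return new_examples


def toy_extract(sentence: str) -> Tuple[str, float]:
    """Stand-in extractor: confident only when a trigger word is present."""
    if "attacked" in sentence:
        return ("Conflict.Attack", 0.95)
    return ("Conflict.Attack", 0.40)


clusters = [
    ["Rebels attacked the town on Monday.",
     "The town came under fire from rebel forces."],
    ["Shares rose in early trading."],
]
examples = bootstrap_from_clusters(clusters, toy_extract)
for sentence, label in examples:
    print(label, "|", sentence)
```

In this toy run the second sentence of the first cluster is labeled even though the stand-in extractor is not confident about it on its own, which is the kind of diverse training example the abstract refers to. The second cluster contributes nothing because no mention clears the threshold.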
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Natural Language Processing Techniques
