GLEN: General-Purpose Event Detection for Thousands of Types

Qiusi Zhan; Sha Li; Kathryn Conger; Martha Palmer; Heng Ji; Jiawei Han

arXiv:2303.09093 · cs.CL · November 1, 2023 · 1 citation

TL;DR

This paper introduces GLEN, a general-purpose event detection dataset covering 205K event mentions across 3,465 types, and CEDAR, a multi-stage event detection model that outperforms a range of baselines, including InstructGPT.

Contribution

The paper contributes GLEN, an event detection dataset built by using existing PropBank annotation as distant supervision through the DWD Overlay, and CEDAR, a multi-stage detection model designed to handle GLEN's large ontology.

Findings

01 CEDAR outperforms a range of baselines, including InstructGPT.

02 Error analysis shows that label noise remains a challenge.

03 GLEN covers 3,465 event types, an ontology more than 20x larger than that of today's largest event dataset.

Abstract

The progress of event extraction research has been hindered by the absence of wide-coverage, large-scale datasets. To make event extraction systems more accessible, we build a general-purpose event detection dataset GLEN, which covers 205K event mentions with 3,465 different types, making it more than 20x larger in ontology than today's largest event dataset. GLEN is created by utilizing the DWD Overlay, which provides a mapping between Wikidata Qnodes and PropBank rolesets. This enables us to use the abundant existing annotation for PropBank as distant supervision. In addition, we also propose a new multi-stage event detection model CEDAR specifically designed to handle the large ontology size in GLEN. We show that our model exhibits superior performance compared to a range of baselines including InstructGPT. Finally, we perform error analysis and show that label noise is still the…
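
The distant supervision step described in the abstract can be pictured with a minimal sketch. This is not the authors' pipeline: the overlay entries, the type identifiers (placeholders, not real Wikidata Qnodes), the annotation tuples, and the function name `distant_labels` are all assumptions made for illustration. The only idea taken from the abstract is that a mapping from PropBank rolesets to event types lets existing PropBank annotation be relabeled as event mentions.

```python
# Hypothetical overlay: PropBank roleset -> candidate event type IDs.
# The IDs are placeholders for illustration, not real Wikidata Qnodes.
OVERLAY = {
    "attack.01": ["Q_ATTACK", "Q_MILITARY_OFFENSIVE"],
    "buy.01": ["Q_PURCHASE"],
}

# Hypothetical PropBank-style annotations: (sentence_id, trigger, roleset).
PROPBANK_ANNOTATIONS = [
    ("s1", "attacked", "attack.01"),
    ("s2", "bought", "buy.01"),
    ("s3", "seemed", "seem.01"),  # roleset with no overlay entry
]


def distant_labels(annotations, overlay):
    """Turn roleset annotations into event mentions with candidate types.

    Annotations whose roleset has no overlay entry are skipped, since
    they provide no event type supervision.
    """
    mentions = []
    for sentence_id, trigger, roleset in annotations:
        candidates = overlay.get(roleset)
        if not candidates:
            continue
        mentions.append(
            {
                "sentence_id": sentence_id,
                "trigger": trigger,
                "roleset": roleset,
                "candidates": list(candidates),
            }
        )
    return mentions


mentions = distant_labels(PROPBANK_ANNOTATIONS, OVERLAY)
# Mentions with more than one candidate type carry an ambiguous label.
ambiguous = [m for m in mentions if len(m["candidates"]) > 1]
```

In this toy setup, a roleset that maps to several types yields an ambiguous label, which is one way noise can enter distantly supervised data.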

Code & Models

Repositories

zqs1943/glen
PyTorch · Official

Taxonomy

Topics: Data Quality and Management · Topic Modeling · Semantic Web and Ontologies

Methods: Ontology