CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News
Mengna Zhu, Zijie Xu, Kaisheng Zeng, Kaiming Xiao, Mao Wang, Wenjun Ke, Hongbin Huang

TL;DR
This paper introduces CMNEE, a large-scale, manually annotated dataset for document-level event extraction in Chinese military news, addressing data scarcity and highlighting domain-specific challenges.
Contribution
The paper contributes CMNEE, a manually annotated Chinese military news dataset of 17,000 documents and 29,223 events, together with a systematic evaluation of state-of-the-art event extraction models on it.
Findings
Event extraction in the military domain is more challenging than in other domains.
Current models perform poorly on CMNEE, indicating a need for further research.
The dataset facilitates future research in military event extraction.
Abstract
Extracting structured event knowledge, including event triggers and corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance. However, event extraction in the military field faces the data scarcity problem, which impedes the research of event extraction models in this domain. To alleviate this problem, we propose CMNEE, a large-scale, document-level open-source Chinese Military News Event Extraction dataset. It contains 17,000 documents and 29,223 events, which are all manually annotated based on a pre-defined schema for the military domain including 8 event types and 11 argument role types. We designed a two-stage, multi-turns annotation strategy to ensure the quality of CMNEE and reproduced several state-of-the-art event extraction models with a systematic evaluation. The experimental results on CMNEE fall…
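As a rough illustration of the structure the abstract describes (documents containing events, each with a trigger and role-labelled arguments), the sketch below models one annotated document in Python. It is not the dataset's actual file format: the class names, field names, the `find_span` helper, the English sample sentence, and the placeholder labels `EVENT_TYPE_1` and `ROLE_1` are all assumptions made for illustration, since the abstract does not list the 8 event types or 11 argument roles by name. Only the counts (17,000 documents, 29,223 events) come from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Argument:
    role: str    # hypothetical label; the real schema defines 11 role types
    text: str
    start: int   # character offset in the document text
    end: int


@dataclass
class Event:
    event_type: str   # hypothetical label; the real schema defines 8 event types
    trigger: str
    trigger_start: int
    arguments: List[Argument] = field(default_factory=list)


@dataclass
class Document:
    doc_id: str
    text: str
    events: List[Event] = field(default_factory=list)


def find_span(text: str, phrase: str) -> Tuple[int, int]:
    """Return (start, end) character offsets of the first occurrence of phrase."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in text")
    return start, start + len(phrase)


def events_per_document(num_events: int, num_documents: int) -> float:
    """Average number of annotated events per document."""
    return num_events / num_documents


# Invented sample sentence, not taken from CMNEE.
text = "The fleet held a drill near the coast on Monday."
trig_start, _ = find_span(text, "drill")
arg_start, arg_end = find_span(text, "The fleet")

doc = Document(
    doc_id="example-0",
    text=text,
    events=[
        Event(
            event_type="EVENT_TYPE_1",  # placeholder, not a real CMNEE type
            trigger="drill",
            trigger_start=trig_start,
            arguments=[Argument("ROLE_1", "The fleet", arg_start, arg_end)],
        )
    ],
)

# Counts from the abstract: 29,223 events across 17,000 documents.
avg_events = events_per_document(29223, 17000)
```

With the figures quoted in the abstract, `avg_events` works out to roughly 1.7 events per document, which is consistent with a document-level setting in which a single news article can contain more than one event.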
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling
