LEMONADE: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World
Sina J. Semnani, Pingyue Zhang, Wanyue Zhai, Haozhuo Li, Ryan Beauchamp, Trey Billing, Katayoun Kishi, Manling Li, Monica S. Lam

TL;DR
LEMONADE is a comprehensive multilingual dataset of conflict events that introduces abstractive event extraction and entity linking. Evaluations with large language models highlight the limitations of current zero-shot systems and point to future research directions.
Contribution
The paper introduces LEMONADE, a large-scale multilingual conflict event dataset, and proposes abstractive event extraction and entity linking methods evaluated with LLMs.
Findings
Zero-shot systems achieve 58.3% F1 in event extraction.
ZEST reaches 45.7% F1 in entity linking, surpassing OneNet.
Zero-shot methods still lag behind supervised models by 20-37%.
Abstract
This paper presents LEMONADE, a large-scale conflict event dataset comprising 39,786 events across 20 languages and 171 countries, with extensive coverage of region-specific entities. LEMONADE is based on a partially reannotated subset of the Armed Conflict Location & Event Data (ACLED), which has documented global conflict events for over a decade. To address the challenge of aggregating multilingual sources for global event analysis, we introduce abstractive event extraction (AEE) and its subtask, abstractive entity linking (AEL). Unlike conventional span-based event extraction, our approach detects event arguments and entities through holistic document understanding and normalizes them across the multilingual dataset. We evaluate various large language models (LLMs) on these tasks, adapt existing zero-shot event extraction systems, and benchmark supervised models. Additionally, we…
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Advanced Text Analysis Techniques
