MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts
Xiangyu Xi, Jianwei Lv, Shuaipeng Liu, Wei Ye, Fan Yang, Guanglu Wan

TL;DR
This paper introduces MUSIED, a large-scale Chinese event detection benchmark from heterogeneous informal texts, highlighting challenges and providing a new dataset from user reviews, conversations, and calls in e-commerce.
Contribution
It presents the first large-scale dataset for event detection in informal, multi-source Chinese texts, expanding research beyond formal texts and addressing heterogeneity and informality.
Findings
State-of-the-art methods struggle with informal, heterogeneous texts
The dataset reveals significant challenges in event detection from informal sources
Further research is needed to improve event detection in real-world, multi-source informal data
Abstract
Event detection (ED) identifies and classifies event triggers from unstructured texts, serving as a fundamental task for information extraction. Despite the remarkable progress achieved in the past several years, most research efforts focus on detecting events from formal texts (e.g., news articles, Wikipedia documents, financial announcements). Moreover, the texts in each dataset are either from a single source or multiple yet relatively homogeneous sources. With massive amounts of user-generated text accumulating on the Web and inside enterprises, identifying meaningful events in these informal texts, usually from multiple heterogeneous sources, has become a problem of significant practical value. As a pioneering exploration that expands event detection to the scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on…
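As a rough illustration of the task defined in the abstract's first sentence (identify trigger spans, then classify each into an event type), the following minimal sketch represents triggers as typed token spans and scores predictions by exact match. The sentence, the event type names, and the helper `score_triggers` are hypothetical and are not taken from the MUSIED schema or its evaluation code.

```python
from typing import List, NamedTuple, Tuple


class Trigger(NamedTuple):
    """A trigger span over a token list, with its event type (hypothetical schema)."""
    start: int        # index of the first trigger token
    end: int          # index one past the last trigger token
    event_type: str   # illustrative label, not a MUSIED type


def score_triggers(gold: List[Trigger], pred: List[Trigger]) -> Tuple[float, float, float]:
    """Precision, recall, and F1, counting a prediction as correct only
    when both its span and its event type match a gold trigger."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical informal review, already tokenized.
tokens = ["The", "courier", "delivered", "my", "order", "late",
          "and", "the", "soup", "spilled"]

gold = [Trigger(5, 6, "LateDelivery"), Trigger(9, 10, "Spill")]
# The second prediction finds the right span but assigns the wrong type.
pred = [Trigger(5, 6, "LateDelivery"), Trigger(9, 10, "Other")]

precision, recall, f1 = score_triggers(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Requiring both span and type to match means a correctly located but mislabeled trigger counts as an error, which is one common way to score trigger classification; other matching criteria are possible.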
Taxonomy
Topics: Advanced Text Analysis Techniques · Web Data Mining and Analysis · Topic Modeling
