Event Data Quality: A Survey
Ruihong Huang, Jianmin Wang

TL;DR
This survey reviews research on data quality issues in event data across domains such as finance, IoT, and Web services, focusing on event matching, error detection, data repair, and approximate pattern matching.
Contribution
It provides a comprehensive summary of existing methods tackling event data quality issues, highlighting key techniques and challenges in the field.
Findings
Event data quality issues are prevalent across multiple domains.
Various methods exist for event matching, error detection, and data repair.
Research in event data quality is evolving to handle heterogeneous and dirty data sources.
Abstract
Event data are now prevalent in diverse domains such as financial trading, business workflows, and industrial IoT. An event is typically characterized by several attributes that describe its meaning, together with its occurrence time or duration. From traditional enterprise operational systems to online systems for Web services, event data are generated continuously from the physical world. However, owing to the variety and veracity characteristics of big data, event data generated from heterogeneous and dirty sources can have very different event representations and data quality issues. In this work, we summarize several representative studies of data quality issues in event data, covering (1) event matching, (2) event error detection, (3) event data repair, and (4) approximate pattern matching.
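To make the notion of an event and of event error detection concrete, here is a minimal Python sketch. It is an illustration only, not a method taken from the survey: the `Event` class, the `order_violations` function, the event names, and the precedence rules are all hypothetical. It assumes an event carries a name, a timestamp, and free-form attributes, and that a "dirty" trace is one whose timestamps contradict a known ordering rule.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical event record: a name, an occurrence time, and attributes."""
    name: str
    timestamp: float
    attributes: dict = field(default_factory=dict)


def order_violations(trace, must_precede):
    """Return the (earlier, later) rules that the trace's timestamps contradict.

    A rule (a, b) is violated when both events occur in the trace and the
    earliest occurrence of a is later than the earliest occurrence of b.
    """
    earliest = {}
    for event in trace:
        earliest[event.name] = min(
            event.timestamp, earliest.get(event.name, event.timestamp)
        )
    return [
        (a, b)
        for a, b in must_precede
        if a in earliest and b in earliest and earliest[a] > earliest[b]
    ]


# Assumed example data: "approve" is recorded before "submit", which is suspicious.
trace = [
    Event("approve", 5.0, {"user": "bob"}),
    Event("submit", 9.0, {"user": "alice"}),
    Event("ship", 12.0),
]
rules = [("submit", "approve"), ("approve", "ship")]
violations = order_violations(trace, rules)
```

In this sketch, detection only flags the contradicted rule. A repair step would go further and propose corrected timestamps, which is a separate and harder problem.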
Taxonomy
Topics: Data Quality and Management · Advanced Database Systems and Queries · Semantic Web and Ontologies
