Big Data Quality: A systematic literature review and future research directions
Mostafa Mirzaie, Behshid Behkamal, Samad Paydar

TL;DR
This paper systematically reviews the state of the art in big data quality assessment over the past decade, proposing a hierarchical framework to categorize methods and identify research gaps.
Contribution
It introduces a comprehensive hierarchical framework for evaluating big data quality research and critically reviews existing methods within this structure.
Findings
Certain data quality assessment methods are widely adopted in the big data community.
Many existing methods have limitations in handling streaming and hybrid data processing.
Future research should focus on developing more versatile and scalable quality assessment techniques.
Abstract
One of the most significant problems of Big Data is extracting knowledge from huge amounts of data. The usefulness of the extracted information depends strongly on data quality. Despite its importance, data quality has only recently been taken into consideration by the big data community, and no comprehensive review has been conducted in this area. Therefore, the purpose of this study is to review and present the state of the art in big data quality research through a hierarchical framework. The dimensions of the proposed framework cover various aspects of the quality assessment of Big Data, including 1) the processing type of big data, i.e., stream, batch, or hybrid, 2) the main task, and 3) the method used to conduct the task. We compare and critically review all of the studies reported during the last ten years through our proposed framework to identify which of…
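The three framework dimensions named in the abstract (processing type, main task, method) can be pictured as a nested grouping of studies. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `Study`, `build_hierarchy`, and `uncovered_processing_types` are hypothetical, and the study labels, tasks, and methods are placeholders rather than entries from the review.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class ProcessingType(Enum):
    # The three processing types named in the abstract.
    STREAM = "stream"
    BATCH = "batch"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class Study:
    # Hypothetical record type; every value used below is a placeholder.
    label: str
    processing: ProcessingType
    task: str
    method: str


def build_hierarchy(studies):
    """Group study labels by processing type, then task, then method."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for s in studies:
        tree[s.processing][s.task][s.method].append(s.label)
    return tree


def uncovered_processing_types(tree):
    """Return processing types that have no classified study."""
    return [p for p in ProcessingType if p not in tree]


# Placeholder studies (assumed for illustration only).
studies = [
    Study("study-A", ProcessingType.BATCH, "task-1", "method-x"),
    Study("study-B", ProcessingType.BATCH, "task-1", "method-y"),
    Study("study-C", ProcessingType.STREAM, "task-2", "method-x"),
]
tree = build_hierarchy(studies)
gaps = uncovered_processing_types(tree)
```

With these placeholder inputs, `gaps` contains only `ProcessingType.HYBRID`, which illustrates how an empty branch of such a hierarchy could point to an under-studied area.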
Taxonomy
Topics: Data Quality and Management · Big Data and Business Intelligence · Data Mining Algorithms and Applications
