The Principles of Data-Centric AI (DCAI)
Mohammad Hossein Jarrahi, Ali Memariani, Shion Guha

TL;DR
This paper introduces Data-Centric AI (DCAI), emphasizing the importance of data quality over models, and outlines six guiding principles to advance AI systems through a data-focused approach.
Contribution
It provides one of the first comprehensive overviews of DCAI, formulating six principles to guide research and practice in data-centric AI development.
Findings
Highlights the shift from model-centric to data-centric AI
Proposes six guiding principles for DCAI
Outlines future directions for DCAI research and practice
Abstract
Data is a crucial infrastructure to how artificial intelligence (AI) systems learn. However, these systems to date have been largely model-centric, putting a premium on the model at the expense of the data quality. Data quality issues beset the performance of AI systems, particularly in downstream deployments and in real-world applications. Data-centric AI (DCAI) as an emerging concept brings data, its quality and its dynamism to the forefront in considerations of AI systems through an iterative and systematic approach. As one of the first overviews, this article brings together data-centric perspectives and concepts to outline the foundations of DCAI. It specifically formulates six guiding principles for researchers and practitioners and gives direction for future advancement of DCAI.
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Anomaly Detection Techniques and Applications
