DataPro -- A Standardized Data Understanding and Processing Procedure: A Case Study of an Eco-Driving Project
Zhipeng Ma, Bo Nørregaard Jørgensen, Zheng Grace Ma

TL;DR
This paper introduces DataPro, an extension of the CRISP-DM framework that adds "technical understanding" and "implementation" phases to improve data science projects, and demonstrates it through an eco-driving case study.
Contribution
The paper proposes DataPro, a standardized data processing procedure that enhances CRISP-DM by integrating stakeholder communication and practical implementation phases.
Findings
Effective alignment of business and technical goals.
Improved communication between the data science team and stakeholders.
Successful application in an eco-driving fuel-efficiency project.
Abstract
A systematic pipeline for data processing and knowledge discovery is essential to extracting knowledge from big data and making recommendations for operational decision-making. The CRISP-DM model is the de facto standard for developing data-mining projects in practice. However, advancements in data processing technologies require enhancements to this framework. This paper presents the DataPro (a standardized data understanding and processing procedure) model, which extends CRISP-DM and emphasizes the link between data scientists and stakeholders by adding the "technical understanding" and "implementation" phases. First, the "technical understanding" phase aligns business demands with technical requirements, ensuring the technical team's accurate comprehension of business goals. Next, the "implementation" phase focuses on the practical application of developed data science models,…
