Defining data science: a new field of inquiry
Michael L Brodie

TL;DR
This paper argues that data science is a research paradigm, not a science, and proposes developing a unified, coherent definition based on a reference framework to better understand and advance the field.
Contribution
It introduces a reference framework and candidate definitions to unify the diverse and inconsistent definitions of data science, addressing its conceptual challenges.
Findings
Proposes a six-component data science reference framework
Suggests a unified definition can improve understanding and development
Highlights the need for a community-driven consensus process
Abstract
Data science is not a science. It is a research paradigm. Its power, scope, and scale will surpass science, our most powerful research paradigm, to enable knowledge discovery and change our world. We have yet to understand and define it, vital to realizing its potential and managing its risks. Modern data science is in its infancy. Emerging slowly since 1962 and rapidly since 2000, it is a fundamentally new field of inquiry, one of the most active, powerful, and rapidly evolving 21st century innovations. Due to its value, power, and applicability, it is emerging in over 40 disciplines, hundreds of research areas, and thousands of applications. Millions of data science publications contain myriad definitions of data science and data science problem solving. Due to its infancy, many definitions are independent, application specific, mutually incomplete, redundant, or inconsistent, hence…
Taxonomy
Topics: Big Data and Business Intelligence
