
TL;DR
Data science is currently defined so broadly that the field is hard to teach and to advance. This paper argues that data scientists need to identify the core principles that make the field unique.
Contribution
The paper proposes distilling core ideas of data science by focusing on the iterative process of data analysis, with the aim of building toward a theory of data science and improving how the field is taught.
Findings
Identifies the need for a core set of principles in data science.
Suggests that generalizations drawn from past experience could form the basis of a theory of data science.
Highlights the importance of focusing on the iterative analysis process.
Abstract
The field of data science currently enjoys a broad definition that includes a wide array of activities which borrow from many other established fields of study. Having such a vague characterization of a field in the early stages might be natural, but over time maintaining such a broad definition becomes unwieldy and impedes progress. In particular, the teaching of data science is hampered by the seeming need to cover many different points of interest. Data scientists must ultimately identify the core of the field by determining what makes the field unique and what it means to develop new knowledge in data science. In this review we attempt to distill some core ideas from data science by focusing on the iterative process of data analysis and develop some generalizations from past experience. Generalizations of this nature could form the basis of a theory of data science and would serve…
