Augmented Data Science: Towards Industrialization and Democratization of Data Science
Huseyin Uzunalioglu, Jin Cao, Chitra Phadke, Gerald Lehmann, Ahmet Akyamac, Ran He, Jeongran Lee, and Maria Able

TL;DR
This paper introduces Augmented Data Science (ADS), a data-driven, automated approach that uses ML and statistics to streamline data understanding and preparation, reducing manual effort and dependence on domain knowledge.
Contribution
The paper presents a novel, domain-agnostic ADS framework that automates data exploration and supports data scientists' judgment with automatically generated insights.
Findings
ADS effectively automates data exploration tasks.
ADS reduces manual effort in data preparation.
Case study demonstrates ADS's practical utility.
Abstract
Conversion of raw data into insights and knowledge requires substantial amounts of effort from data scientists. Despite breathtaking advances in Machine Learning (ML) and Artificial Intelligence (AI), data scientists still spend the majority of their effort in understanding and then preparing the raw data for ML/AI. The effort is often manual and ad hoc, and requires some level of domain knowledge. The complexity of the effort increases dramatically when data diversity, both in form and context, increases. In this paper, we introduce our solution, Augmented Data Science (ADS), towards addressing this "human bottleneck" in creating value from diverse datasets. ADS is a data-driven approach and relies on statistics and ML to extract insights from any data set in a domain-agnostic way to facilitate the data science process. Key features of ADS are the replacement of rudimentary data…
Taxonomy
Topics: Scientific Computing and Data Management · Data Quality and Management · Semantic Web and Ontologies
