A Vision for Semantically Enriched Data Science
Udayan Khurana, Kavitha Srinivas, Sainyam Galhotra, Horst Samulowitz

TL;DR
This paper advocates for integrating semantic understanding into data science to enhance automation, explainability, and trust, addressing current gaps in leveraging domain knowledge and data semantics.
Contribution
It proposes a vision for incorporating semantic reasoning and annotation into data science workflows to improve automation and data understanding and to address trust and bias issues.
Findings
Highlights shortcomings of current data science methods.
Envisions semantic tools for data augmentation and transformation.
Discusses benefits for trust, bias, and explainability.
Abstract
The recent efforts in automation of machine learning or data science have achieved success in various tasks such as hyper-parameter optimization or model selection. However, key areas such as utilizing domain knowledge and data semantics are areas where we have seen little automation. Data Scientists have long leveraged common sense reasoning and domain knowledge to understand and enrich data for building predictive models. In this paper we discuss important shortcomings of current data science and machine learning solutions. We then envision how leveraging "semantic" understanding and reasoning on data in combination with novel tools for data science automation can help with consistent and explainable data augmentation and transformation. Additionally, we discuss how semantics can assist data scientists in a new manner by helping with challenges related to trust, bias, and…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Semantic Web and Ontologies · Data Quality and Management
