Towards Semantically Enhanced Data Understanding
Markus Schröder, Christian Jilek, Jörn Hees, Andreas Dengel

TL;DR
This paper proposes a semantic model linking data with its documentation to improve data understanding and reduce lookup overhead, demonstrated through an early prototype.
Contribution
It introduces a unified semantic model that interlinks data and documentation, enabling easier access, browsing, and supporting various data analysis tasks.
Findings
Semantic model effectively links data with documentation
Prototype demonstrates improved data understanding workflow
Supports searching, comparing, and visualizing data
Abstract
In the field of machine learning, data understanding is the practice of getting initial insights into unknown datasets. Such knowledge-intensive tasks require a lot of documentation, which is necessary for data scientists to grasp the meaning of the data. Usually, documentation is kept separate from the data in various external documents, diagrams, spreadsheets and tools, which causes considerable lookup overhead. Moreover, other supporting applications are not able to consume and utilize such unstructured data. That is why we propose a methodology that uses a single semantic model that interlinks data with its documentation. Hence, data scientists are able to directly look up the connected information about the data by simply following links. Equally, they can browse the documentation, which always refers to the data. Furthermore, the model can be used by other approaches providing additional…
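The idea of interlinking data with its documentation in one model can be illustrated with a minimal sketch. It is not the authors' prototype: the `SemanticModel` class, the predicate names (`describedBy`, `partOf`, `text`) and all identifiers such as `column:customer.birth_date` are hypothetical and chosen only for illustration. The sketch stores statements as subject-predicate-object triples and shows lookups in both directions, from a data element to its documentation and from a documentation entry back to the data it describes.

```python
from collections import defaultdict


class SemanticModel:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        # subject -> list of (predicate, object)
        self._outgoing = defaultdict(list)
        # object -> list of (predicate, subject)
        self._incoming = defaultdict(list)

    def add(self, subject, predicate, obj):
        """Record one statement linking subject to obj via predicate."""
        self._outgoing[subject].append((predicate, obj))
        self._incoming[obj].append((predicate, subject))

    def follow(self, subject, predicate):
        """Follow links from a resource, e.g. from data to its documentation."""
        return [o for p, o in self._outgoing.get(subject, []) if p == predicate]

    def referrers(self, obj, predicate):
        """Follow links backwards, e.g. from documentation to the data."""
        return [s for p, s in self._incoming.get(obj, []) if p == predicate]


# Assumed example resources; none of these names come from the paper.
model = SemanticModel()
model.add("column:customer.birth_date", "partOf", "table:customer")
model.add("column:customer.birth_date", "describedBy", "doc:glossary#birth-date")
model.add("doc:glossary#birth-date", "text", "Customer's date of birth")

# From the data to its documentation ...
docs = model.follow("column:customer.birth_date", "describedBy")
# ... and from the documentation back to the data it refers to.
columns = model.referrers("doc:glossary#birth-date", "describedBy")
```

Because both directions are indexed, a lookup is a single dictionary access rather than a search through separate documents, which is the kind of overhead the abstract describes. A real implementation would more likely build on an RDF store and established vocabularies than on a hand-written class.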
Taxonomy
Topics: Semantic Web and Ontologies · Data Quality and Management · Advanced Database Systems and Queries
