Interactively Constructing Knowledge Graphs from Messy User-Generated Spreadsheets
Markus Schröder, Christian Jilek, Michael Schulze, Andreas Dengel

TL;DR
This paper presents an interactive method for constructing knowledge graphs from messy, user-generated spreadsheets, enabling knowledge engineers to efficiently annotate data and build structured graphs, demonstrated on industrial spreadsheets.
Contribution
It introduces a graphical user interface for bulk annotation of spreadsheet cells, improving upon existing RDF mapping methods for messy data.
Findings
Built a 25,000-triple knowledge graph from five industrial spreadsheets.
Demonstrated improved efficiency over state-of-the-art RML methods.
Validated approach with real-world messy spreadsheet data.
Abstract
When spreadsheets are filled freely by knowledge workers, they can contain rather unstructured content. For humans and especially machines it becomes difficult to interpret such data properly. Therefore, spreadsheets are often converted to a more explicit, formal and structured form, for example, to a knowledge graph. However, if a data maintenance strategy has been missing and user-generated data becomes "messy", the construction of knowledge graphs will be a challenging task. In this paper, we catalog several of those challenges and propose an interactive approach to solve them. Our approach includes a graphical user interface which enables knowledge engineers to bulk-annotate spreadsheet cells with extracted information. Based on the cells' annotations a knowledge graph is ultimately formed. Using five spreadsheets from an industrial scenario, we built a 25k-triple graph during our…
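The two steps named in the abstract, bulk-annotating cells and then forming a graph from the annotations, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `Cell` class, the `bulk_annotate` and `build_triples` helpers, the `http://example.org/` namespace, and the simplification that each spreadsheet row becomes one subject.

```python
from dataclasses import dataclass, field

EX = "http://example.org/"  # hypothetical namespace, not from the paper


@dataclass
class Cell:
    row: int
    col: int
    text: str
    # Each annotation is a (predicate, extracted value) pair.
    annotations: list = field(default_factory=list)


def bulk_annotate(cells, col, predicate, extract):
    """Annotate every non-empty cell of one column with extract(text)."""
    count = 0
    for cell in cells:
        if cell.col == col and cell.text.strip():
            cell.annotations.append((predicate, extract(cell.text)))
            count += 1
    return count


def build_triples(cells):
    """Assumed model: one subject per row, one triple per annotation."""
    triples = []
    for cell in cells:
        subject = f"{EX}row/{cell.row}"
        for predicate, value in cell.annotations:
            triples.append((subject, EX + predicate, value))
    return triples


# A small "messy" sheet: stray whitespace, an empty cell, a trailing "?".
cells = [
    Cell(1, 0, " Alice  "),
    Cell(1, 1, "2019-03-01?"),
    Cell(2, 0, ""),
    Cell(2, 1, "2020-07-15"),
]
bulk_annotate(cells, 0, "name", str.strip)
bulk_annotate(cells, 1, "date", lambda t: t.rstrip("?"))
triples = build_triples(cells)
```

Keeping annotations on the cells and deriving triples in a separate pass mirrors the order described in the abstract: the engineer's decisions are recorded per cell first, and the graph is generated from them afterwards.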
Taxonomy
Topics: Data Quality and Management · Context-Aware Activity Recognition Systems · Personal Information Management and User Behavior
