Learning the Semantics of Structured Data Sources
Mohsen Taheriyan, Craig A. Knoblock, Pedro Szekely, Jose Luis Ambite

TL;DR
This paper introduces an automated method for learning rich semantic models of structured data sources. It leverages domain ontologies and previously modeled sources to reduce the manual effort of modeling.
Contribution
It proposes a novel approach that constructs and ranks candidate semantic models using domain knowledge and past models, incorporating user feedback for improved accuracy.
Findings
Generates expressive semantic models with minimal user input
Effectively leverages domain ontologies and prior models
Produces accurate semantic models for diverse data sources
Abstract
Information sources such as relational databases, spreadsheets, XML, JSON, and Web APIs contain a tremendous amount of structured data that can be leveraged to build and augment knowledge graphs. However, they rarely provide a semantic model to describe their contents. Semantic models of data sources represent the implicit meaning of the data by specifying the concepts and the relationships within the data. Such models are the key ingredients to automatically publish the data into knowledge graphs. Manually modeling the semantics of data sources requires significant effort and expertise, and although desirable, building these models automatically is a challenging problem. Most of the related work focuses on semantic annotation of the data fields (source attributes). However, constructing a semantic model that explicitly describes the relationships between the attributes in addition to…
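To make the abstract's definition concrete, here is a minimal sketch of what a semantic model might look like as a data structure: semantic types that annotate individual attributes, plus relationships that connect the concepts behind them. Everything in it is an illustrative assumption rather than the paper's implementation: the class names (`SemanticType`, `Relationship`, `SemanticModel`), the ontology terms (`Person`, `City`, `bornIn`), and the sample row are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class SemanticType:
    """Annotates one source attribute with an ontology class and data property."""
    attribute: str
    cls: str
    prop: str


@dataclass(frozen=True)
class Relationship:
    """Links two ontology classes with an object property."""
    source_cls: str
    prop: str
    target_cls: str


@dataclass
class SemanticModel:
    types: List[SemanticType]
    relationships: List[Relationship]

    def to_triples(self, row: Dict[str, str], row_id: int) -> List[Triple]:
        """Turn one source row into knowledge-graph triples."""
        def node(cls: str) -> str:
            # One blank node per class per row (a simplifying assumption).
            return f"_:{cls}_{row_id}"

        triples: List[Triple] = []
        # Declare one instance of each class the model mentions.
        for cls in sorted({t.cls for t in self.types}):
            triples.append((node(cls), "rdf:type", cls))
        # Attach attribute values through their semantic types.
        for t in self.types:
            triples.append((node(t.cls), t.prop, row[t.attribute]))
        # Connect class instances; attribute annotation alone would omit this.
        for r in self.relationships:
            triples.append((node(r.source_cls), r.prop, node(r.target_cls)))
        return triples


# Hypothetical source with two columns and a hypothetical ontology.
model = SemanticModel(
    types=[
        SemanticType("name", "Person", "name"),
        SemanticType("birth_city", "City", "label"),
    ],
    relationships=[Relationship("Person", "bornIn", "City")],
)
triples = model.to_triples({"name": "Ada", "birth_city": "London"}, row_id=0)
for triple in triples:
    print(triple)
```

Without the `Relationship` entry, the output would state that the row contains a person and a city but not how they are related. That is the gap between annotating attributes and building a full semantic model.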
