Learning the Semantics of Structured Data Sources

Mohsen Taheriyan; Craig A. Knoblock; Pedro Szekely; Jose Luis Ambite

arXiv:1601.04105 · cs.AI · January 19, 2016

TL;DR

This paper introduces an automated method for learning rich semantic models of structured data sources. It leverages domain ontologies and previously modeled sources to reduce the manual modeling effort.

Contribution

The paper proposes an approach that constructs and ranks candidate semantic models using domain knowledge and past models, and incorporates user feedback to improve accuracy.

Findings

01. Generates expressive semantic models with minimal user input

02. Effectively leverages domain ontologies and prior models

03. Produces accurate semantic models for diverse data sources

Abstract

Information sources such as relational databases, spreadsheets, XML, JSON, and Web APIs contain a tremendous amount of structured data that can be leveraged to build and augment knowledge graphs. However, they rarely provide a semantic model to describe their contents. Semantic models of data sources represent the implicit meaning of the data by specifying the concepts and the relationships within the data. Such models are the key ingredients to automatically publish the data into knowledge graphs. Manually modeling the semantics of data sources requires significant effort and expertise, and although desirable, building these models automatically is a challenging problem. Most of the related work focuses on semantic annotation of the data fields (source attributes). However, constructing a semantic model that explicitly describes the relationships between the attributes in addition to…
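To make the idea of a semantic model concrete, here is a minimal Python sketch of one possible representation: source attributes are mapped to ontology classes and data properties, and relationships between the classes are stated explicitly. It is an illustration under stated assumptions, not the representation used in the paper. The `SemanticModel` class, the `row_to_triples` helper, the `ex:` ontology terms, and the sample record are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class SemanticModel:
    """Hypothetical container for a semantic model of one data source."""
    # node id -> ontology class, e.g. "person" -> "ex:Person"
    classes: Dict[str, str] = field(default_factory=dict)
    # source attribute -> (node id, data property)
    attributes: Dict[str, Tuple[str, str]] = field(default_factory=dict)
    # (subject node id, object property, object node id)
    relationships: List[Triple] = field(default_factory=list)

    def row_to_triples(self, row: Dict[str, str], row_id: str) -> List[Triple]:
        """Instantiate the model for one record, yielding graph triples."""
        # One instance identifier per class node and record (assumed scheme).
        uri = {node: f"{node}/{row_id}" for node in self.classes}
        triples = [(uri[n], "rdf:type", cls) for n, cls in self.classes.items()]
        # Semantic annotation of the fields: attribute value -> data property.
        triples += [(uri[n], prop, row[attr])
                    for attr, (n, prop) in self.attributes.items()]
        # Relationships between the concepts, beyond per-field annotation.
        triples += [(uri[s], p, uri[o]) for s, p, o in self.relationships]
        return triples


# Hypothetical source with three attributes: name, birthdate, birth_city.
model = SemanticModel(
    classes={"person": "ex:Person", "city": "ex:City"},
    attributes={
        "name": ("person", "ex:name"),
        "birthdate": ("person", "ex:birthDate"),
        "birth_city": ("city", "ex:label"),
    },
    relationships=[("person", "ex:bornIn", "city")],
)

triples = model.row_to_triples(
    {"name": "Ada Lovelace", "birthdate": "1815-12-10", "birth_city": "London"},
    row_id="0",
)
for t in triples:
    print(t)
```

In this sketch, the `attributes` mapping corresponds to semantic annotation of the data fields, while `relationships` carries the extra information the abstract says a full semantic model must describe: how the annotated attributes relate to one another.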
