Knowledge Graph for Microdata of Statistics Netherlands

Chang Sun

arXiv:2101.07622 · cs.DL · January 20, 2021 · 1 citation

Open Access · 1 Repo

TL;DR

This paper presents a knowledge graph that harmonizes and links the metadata of CBS microdata so that researchers can query and explore datasets efficiently, improving data accessibility and usability.

Contribution

The project creates a comprehensive, multilingual knowledge graph of CBS microdata metadata using text mining and semantic web technologies, improving data discovery and integration.

Findings

01 Knowledge graph enables easy metadata querying.

02 Researchers can explore dataset relations efficiently.

03 Data discovery time and costs are significantly reduced.

Abstract

Statistics Netherlands (CBS) hosts a huge amount of data, not only at the statistical level but also at the individual level. With the development of data science technologies, more and more researchers ask to conduct their research using high-quality individual-level data from CBS (called CBS Microdata), alone or combined with other data sources. Making good use of these data for research and scientific purposes can greatly benefit society as a whole. However, CBS Microdata has been collected and maintained in different ways by different departments inside and outside CBS. The representation, quality, and metadata of the datasets are not sufficiently harmonized. This project converts the descriptions of all CBS microdata sets into one knowledge graph with comprehensive metadata in Dutch and English, using text mining and semantic web technologies. Researchers can easily query the metadata,…
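
As a rough illustration of the idea in the abstract, the sketch below stores dataset metadata as subject-predicate-object triples with Dutch and English labels and answers two simple metadata queries. It is not the project's implementation: the `ex:` identifiers, the dataset names and variables, and the helpers `match`, `label`, and `datasets_sharing` are all hypothetical, and a real deployment would presumably use an RDF store and a query language such as SPARQL rather than a Python set.

```python
# Minimal in-memory triple store; every identifier below is made up for
# illustration and does not correspond to a real CBS dataset or variable.
TRIPLES = {
    ("ex:datasetA", "rdf:type", "dcat:Dataset"),
    ("ex:datasetA", "rdfs:label", ("Persons register", "en")),
    ("ex:datasetA", "rdfs:label", ("Personenregister", "nl")),
    ("ex:datasetA", "ex:hasVariable", "ex:personId"),
    ("ex:datasetB", "rdf:type", "dcat:Dataset"),
    ("ex:datasetB", "rdfs:label", ("Income", "en")),
    ("ex:datasetB", "rdfs:label", ("Inkomen", "nl")),
    ("ex:datasetB", "ex:hasVariable", "ex:personId"),
    ("ex:datasetB", "ex:hasVariable", "ex:income"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def label(subject, lang):
    """Return the label of `subject` in the given language, or None."""
    for _, _, value in match(s=subject, p="rdfs:label"):
        text, tag = value
        if tag == lang:
            return text
    return None


def datasets_sharing(variable):
    """Return the sorted datasets that declare the given variable."""
    return sorted(s for s, _, _ in match(p="ex:hasVariable", o=variable))


if __name__ == "__main__":
    # Which datasets could be linked through a shared variable?
    for ds in datasets_sharing("ex:personId"):
        print(ds, "|", label(ds, "en"), "|", label(ds, "nl"))
```

Keeping labels as (text, language) pairs mirrors the bilingual metadata mentioned in the abstract, and the shared-variable lookup is one simple way to surface relations between datasets.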

Code & Models

Repositories

sunchang0124/KG-CBSMicrodata
none · Official

Taxonomy

Topics: Data Quality and Management · Geographic Information Systems Studies · Big Data Technologies and Applications