NASA Science Mission Directorate Knowledge Graph Discovery
Roelien C. Timmer, Fech Scen Khoo, Megan Mark, Marcella Scoczynski Ribeiro Martins, Anamaria Berea, Gregory Renard, Kaylin Bugbee

TL;DR
This paper presents an NLP-based pipeline that generates knowledge graphs from NASA SMD data to support dataset discovery and uncover cross-domain connections.
Contribution
It introduces a novel pipeline for creating domain-specific knowledge graphs from NASA data, enhancing dataset search and discovery capabilities.
Findings
Created knowledge graphs for NASA SMD domains
Identified cross-domain connections in NASA data
Discussed NLP methods and challenges in KG generation
Abstract
The volume of data held by the National Aeronautics and Space Administration (NASA) Science Mission Directorate (SMD) is growing exponentially, allowing researchers to make new discoveries. However, making those discoveries is challenging and time-consuming because the data catalogs are large and many concepts and datasets are only indirectly connected. This paper proposes a pipeline to generate knowledge graphs (KGs) representing different NASA SMD domains. These KGs can serve as the basis for dataset search engines, saving researchers time and helping them find new connections. We collected textual data and used several modern natural language processing (NLP) methods to create the nodes and edges of the KGs. We explore the cross-domain connections, discuss the challenges we encountered, and provide future directions to inspire researchers working on similar problems.
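To make the idea of deriving nodes and edges from text concrete, here is a minimal sketch. It is not the authors' pipeline: the abstract does not name the NLP methods used, so this example substitutes a simple vocabulary match and term co-occurrence count. The dataset descriptions, the vocabulary, and the function names `extract_terms` and `build_graph` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical dataset descriptions; not taken from the paper.
DESCRIPTIONS = {
    "dataset_a": "Solar wind measurements near the Earth magnetosphere",
    "dataset_b": "Magnetosphere response to solar wind pressure",
    "dataset_c": "Ocean surface temperature from satellite radiometry",
}

# Hypothetical vocabulary of terms that become graph nodes.
VOCABULARY = ["solar wind", "magnetosphere", "ocean", "temperature", "satellite"]


def extract_terms(text, vocabulary):
    """Return vocabulary terms found in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return sorted(term for term in vocabulary if term in lowered)


def build_graph(descriptions, vocabulary):
    """Count how often each pair of terms co-occurs in the same description.

    Each key is an edge (term_a, term_b) in alphabetical order;
    each value is the edge weight.
    """
    edges = Counter()
    for text in descriptions.values():
        terms = extract_terms(text, vocabulary)
        for a, b in combinations(terms, 2):
            edges[(a, b)] += 1
    return edges


graph = build_graph(DESCRIPTIONS, VOCABULARY)
for (a, b), weight in sorted(graph.items()):
    print(f"{a} -- {b} (weight {weight})")
```

A real pipeline would likely replace the substring match with entity or keyphrase extraction and would link terms across domains, but the node-and-edge structure would be similar.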
Taxonomy
Topics: Data Quality and Management · Semantic Web and Ontologies · Big Data and Business Intelligence
