Comprehending Semantic Types in JSON Data with Graph Neural Networks
Shuang Wei, Michael J. Mior

TL;DR
This paper introduces a graph neural network approach to predict semantic types in JSON data by leveraging JSON Path structures, enhancing understanding of complex JSON documents for data processing tasks.
Contribution
The work extends semantic type prediction from relational to JSON data, utilizing graph neural networks to capture structural information in JSON collections.
Findings
The model outperforms the existing state of the art in several cases.
It demonstrates an effective understanding of complex JSON structures.
It has potential applications in JSON data processing tasks.
Abstract
Semantic types are a more powerful and detailed way of describing data than atomic types such as strings or integers. They establish connections between columns and concepts from the real world, providing more nuanced and fine-grained information that can be useful for tasks such as automated data cleaning, schema matching, and data discovery. Existing deep learning models trained on large text corpora have been successful at performing single-column semantic type prediction for relational data. In this work, we propose an extension of the semantic type prediction problem to JSON data, labeling the types based on JSON Paths. JSON Path is a query language that enables the navigation of complex JSON data structures by specifying the location and content of the elements; a JSON Path plays a role similar to that of a column in relational data. We use a graph neural network to comprehend the structural information…
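To make the column analogy concrete, here is a minimal sketch of how values in a JSON collection could be grouped by JSON Path, so that each path behaves like a column whose values can be given a semantic type. This is an illustration only, not the authors' pipeline: the helper names `iter_paths` and `values_by_path`, the sample documents, and the choice to collapse array indices to `[*]` are all assumptions made for the example.

```python
from collections import defaultdict


def iter_paths(node, path="$"):
    """Yield (json_path, leaf_value) pairs for a parsed JSON value.

    Assumption for this sketch: array indices are collapsed to [*] so that
    all elements of an array share one path. Keys containing dots or
    brackets would need bracket notation, which is not handled here.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_paths(child, f"{path}.{key}")
    elif isinstance(node, list):
        for child in node:
            yield from iter_paths(child, f"{path}[*]")
    else:
        # Leaf: string, number, boolean, or null (None).
        yield path, node


def values_by_path(documents):
    """Group leaf values from many documents under their JSON Path."""
    grouped = defaultdict(list)
    for doc in documents:
        for path, value in iter_paths(doc):
            grouped[path].append(value)
    return dict(grouped)


# Hypothetical sample collection, invented for illustration.
docs = [
    {"user": {"name": "Ada", "emails": ["ada@example.org"]}, "age": 36},
    {"user": {"name": "Alan", "emails": ["alan@example.org", "at@example.org"]}, "age": 41},
]

columns = values_by_path(docs)
for path, values in columns.items():
    print(path, values)
```

Each resulting path (for example `$.user.name`) collects values from every document in the collection, which is the unit a semantic type label would attach to under the framing in the abstract.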
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques
