SCULPT: A Schema Language for Tabular Data on the Web
Wim Martens, Frank Neven, Stijn Vansummeren

TL;DR
SCULPT is a formal schema language designed for defining and constraining the structure of tabular web data, enabling efficient validation and streaming evaluation, with potential extensions for richer semantics and transformations.
Contribution
The paper introduces SCULPT, a schema language for tabular web data whose rules select regions of a table through region selection expressions. It gives the language a formal model, a linear-time evaluation algorithm, and fragments that admit weak and strong streaming evaluation.
Findings
Linear-time (combined complexity) evaluation algorithm for SCULPT
One fragment of SCULPT for weak streaming evaluation and one for strong streaming evaluation
Discussed extensions: alternative semantics, types, and complex content
Abstract
Inspired by the recent working effort towards a recommendation by the World Wide Web Consortium (W3C) for tabular data and metadata on the Web, we present in this paper a concept for a schema language for tabular web data called SCULPT. The language consists of rules constraining and defining the structure of regions in the table. These regions are defined through the novel formalism of region selection expressions. We present a formal model for SCULPT and obtain a linear-time combined-complexity evaluation algorithm. In addition, we consider weak and strong streaming evaluation for SCULPT and present a fragment for each of these streaming variants. Finally, we discuss several extensions of SCULPT, including alternative semantics, types, and complex content, and explore region selection expressions as a basis for a transformation language.
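To make the idea of "rules constraining regions of a table" concrete, here is a minimal Python sketch. It does not use SCULPT syntax or the paper's algorithm: the selector `column_below_header`, the function `validate`, and the representation of a rule as a (selector, regular expression) pair are all hypothetical names chosen for illustration. It only shows the general shape of the approach, in which a rule first selects a region of cells and then checks the content of each selected cell.

```python
import re
from typing import Callable, Iterator, List, Tuple

Table = List[List[str]]
Cell = Tuple[int, int, str]  # (row index, column index, cell content)
Selector = Callable[[Table], Iterator[Cell]]


def column_below_header(header: str) -> Selector:
    """Hypothetical selector: every cell below a first-row cell equal to `header`."""
    def select(table: Table) -> Iterator[Cell]:
        if not table:
            return
        for c, name in enumerate(table[0]):
            if name == header:
                for r in range(1, len(table)):
                    if c < len(table[r]):
                        yield (r, c, table[r][c])
    return select


def validate(table: Table, rules: List[Tuple[Selector, str]]) -> List[Cell]:
    """Return the selected cells whose content does not fully match the rule's pattern."""
    violations: List[Cell] = []
    for select, pattern in rules:
        regex = re.compile(pattern)
        for r, c, content in select(table):
            if regex.fullmatch(content) is None:
                violations.append((r, c, content))
    return violations


# Illustrative data, not taken from the paper.
table = [
    ["name", "year"],
    ["Ada", "1815"],
    ["Alan", "19x2"],
]
# Rule: every cell in the region below the "year" header must be four digits.
rules = [(column_below_header("year"), r"\d{4}")]
violations = validate(table, rules)
print(violations)
```

In this sketch a region is simply a set of cells computed from the header row. The region selection expressions of the paper are a separate formalism, and this sketch makes no claim about their syntax, expressive power, or evaluation complexity.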
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Web Data Mining and Analysis
