Shacl4Bib: custom validation of library data
Péter Király

TL;DR
Shacl4Bib extends SHACL-like validation to non-RDF formats like XML, CSV, and JSON, enabling libraries to define and reuse data validation criteria in a clear, unified language.
Contribution
It introduces a framework for custom data validation across diverse formats using SHACL-like rules, enhancing data quality assessment.
Findings
Supports validation for XML, CSV, JSON, MARC21, UNIMARC, PICA formats
Allows criteria definition via YAML, JSON, or Java code
Improves clarity and reusability of validation processes
Abstract
The Shapes Constraint Language (SHACL) is a formal language for validating RDF graphs against a set of conditions. Following this idea and implementing a subset of the language, the Metadata Quality Assessment Framework provides Shacl4Bib: a mechanism to define SHACL-like rules for data sources in non-RDF-based formats, such as XML, CSV and JSON. QA catalogue extends this concept further to MARC21, UNIMARC and PICA data. The criteria can be defined either with YAML or JSON configuration files or with Java code. Libraries can thus validate their data against criteria expressed in a unified language, which improves the clarity and reusability of custom validation processes.
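To make the idea of SHACL-like rules over non-RDF data more concrete, here is a minimal Python sketch that applies three constraints borrowed from SHACL (minCount, maxCount, pattern) to CSV records. It is not Shacl4Bib's actual configuration schema or API: the `Rule` class, the `check` and `validate` functions, the sample data, and the convention that repeated values in a cell are separated by semicolons are all assumptions made for illustration.

```python
import csv
import io
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical SHACL-like constraint on one CSV column (illustrative only)."""
    path: str                        # column name the rule applies to
    min_count: int = 0               # analogous to sh:minCount
    max_count: Optional[int] = None  # analogous to sh:maxCount
    pattern: Optional[str] = None    # analogous to sh:pattern (regex search)


def check(record: dict, rule: Rule) -> bool:
    """Return True if the record satisfies the rule."""
    raw = record.get(rule.path) or ""
    # Assumption: repeated values in one cell are separated by ';'.
    values = [v for v in raw.split(";") if v]
    if len(values) < rule.min_count:
        return False
    if rule.max_count is not None and len(values) > rule.max_count:
        return False
    if rule.pattern is not None:
        return all(re.search(rule.pattern, v) is not None for v in values)
    return True


def validate(csv_text: str, rules: list) -> list:
    """Check every CSV row against every rule; one result dict per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{rule.path: check(row, rule) for rule in rules} for row in reader]


# Made-up sample data: the second record lacks a title and has a malformed ISBN.
SAMPLE = "id,title,isbn\n1,Dune,9780441013593\n2,,123\n"
RULES = [
    Rule("title", min_count=1, max_count=1),
    Rule("isbn", pattern=r"^\d{13}$"),
]

results = validate(SAMPLE, RULES)
for row_number, outcome in enumerate(results, start=1):
    print(row_number, outcome)
```

Keeping the rules as plain data, separate from the checking logic, is what would allow the same criteria to be written in a YAML or JSON file and reused across data sources, which is the design choice the abstract describes.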
Taxonomy
Topics: Mathematics, Computing, and Information Processing
