Learning Schemas for Unordered XML

Radu Ciucanu; Slawek Staworko

arXiv:1307.6348 · cs.DB · July 26, 2013 · DBPL

TL;DR

This paper investigates the problem of learning schemas for unordered XML, specifically disjunctive multiplicity schemas (DMS) and disjunction-free multiplicity schemas (MS), from positive and negative examples, and provides learning algorithms with proven learnability and minimality guarantees.

Contribution

It introduces efficient algorithms for learning disjunctive and disjunction-free multiplicity schemas from examples, with formal proofs of learnability and minimality.

Findings

01. DMS are learnable from positive examples only.

02. MS are learnable from both positive and negative examples.

03. Algorithms produce minimal schemas consistent with the data.

Abstract

We consider unordered XML, where the relative order among siblings is ignored, and we investigate the problem of learning schemas from examples given by the user. We focus on the schema formalisms proposed in [10]: disjunctive multiplicity schemas (DMS) and its restriction, disjunction-free multiplicity schemas (MS). A learning algorithm takes as input a set of XML documents which must satisfy the schema (i.e., positive examples) and a set of XML documents which must not satisfy the schema (i.e., negative examples), and returns a schema consistent with the examples. We investigate a learning framework inspired by Gold [18], where a learning algorithm should be sound i.e., always return a schema consistent with the examples given by the user, and complete i.e., able to produce every schema with a sufficiently rich set of examples. Additionally, the algorithm should be efficient i.e.,…
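
The learning setting described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes that an MS rule assigns each child label one multiplicity among `0`, `1`, `?`, `+`, `*`, represents a document as a hypothetical `(label, children)` tuple, and ignores the root label. The names `learn_ms`, `satisfies`, and `multiplicity` are invented for this example. Given positive examples, it picks the tightest multiplicity that covers every observed child count, then checks the result against a negative example.

```python
from collections import Counter

# Assumption: a document is a tree (label, [children]); sibling order is ignored.

# Assumption: the five multiplicities and the child counts each one admits.
ALLOWED = {
    "0": lambda n: n == 0,
    "1": lambda n: n == 1,
    "?": lambda n: n <= 1,
    "+": lambda n: n >= 1,
    "*": lambda n: True,
}

def multiplicity(lo, hi):
    """Tightest multiplicity admitting every count between lo and hi."""
    if hi == 0:
        return "0"
    if lo >= 1:
        return "1" if hi == 1 else "+"
    return "?" if hi == 1 else "*"

def learn_ms(positive_trees):
    """Infer one rule per label from positive examples only (illustrative)."""
    bags = {}  # label -> list of child-label counters, one per node seen

    def visit(node):
        label, children = node
        bags.setdefault(label, []).append(Counter(c[0] for c in children))
        for child in children:
            visit(child)

    for tree in positive_trees:
        visit(tree)

    rules = {}
    for label, counters in bags.items():
        rule = {}
        for child in sorted(bags):
            counts = [c[child] for c in counters]  # missing labels count as 0
            m = multiplicity(min(counts), max(counts))
            if m != "0":  # labels never seen under this parent stay forbidden
                rule[child] = m
        rules[label] = rule
    return rules

def satisfies(tree, rules):
    """Check every node of the tree against the rule for its label."""
    label, children = tree
    if label not in rules:
        return False
    counts = Counter(c[0] for c in children)
    rule = rules[label]
    if any(child not in rule for child in counts):
        return False
    if not all(ALLOWED[m](counts[child]) for child, m in rule.items()):
        return False
    return all(satisfies(child, rules) for child in children)

# Two positive examples and one negative example (made-up data).
t1 = ("book", [("title", []), ("author", []), ("author", [])])
t2 = ("book", [("author", []), ("title", []), ("year", [])])
negative = ("book", [("author", [])])  # lacks the required title

rules = learn_ms([t1, t2])
print(rules["book"])
print(satisfies(t1, rules), satisfies(t2, rules), satisfies(negative, rules))
```

Choosing the tightest multiplicity per child label is one simple way to obtain a schema that accepts the positive examples while admitting as few other documents as possible; whether the result also rejects a given negative example has to be checked separately, as the last line does.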
