Learning XML Twig Queries
Sławomir Staworko, Piotr Wieczorek

TL;DR
This paper explores algorithms for learning XML queries from user-provided examples, focusing on path and tree pattern queries, and analyzes the conditions under which such learning is feasible or impossible.
Contribution
It formalizes the learnability of certain XML query classes from examples and identifies limitations when negative examples are included.
Findings
Path queries and path-subsumption-free tree queries are learnable from positive examples.
Adding negative examples makes learning unfeasible.
The learnability of the full classes of tree pattern queries and path queries remains an open question.
Abstract
We investigate the problem of learning XML queries, specifically path queries and tree pattern queries, from examples given by the user. A learning algorithm takes as input a set of XML documents with nodes annotated by the user and returns a query that selects the nodes in a manner consistent with the annotation. We study two learning settings that differ in the types of annotations. In the first setting, the user may only indicate required nodes that the query must return. In the second, more general setting, the user may also indicate forbidden nodes that the query must not return. The query may or may not return any node with no annotation. We formalize what it means for a class of queries to be learnable. One requirement is the existence of a learning algorithm that is sound, i.e., always returns a query consistent with the examples given by the user. Furthermore, the learning…
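The consistency requirement described in the abstract can be illustrated with a small sketch. This is not the paper's learning algorithm; it only checks whether a given candidate query agrees with a set of annotations. The sample document, the helper name `is_consistent`, and the use of the limited XPath subset in Python's `xml.etree.ElementTree` as a stand-in for path queries are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def is_consistent(root, query, required, forbidden=()):
    """Check a candidate query against user annotations (hypothetical helper).

    The query is consistent if it selects every required node and no
    forbidden node. Nodes without annotation may or may not be selected.
    """
    # Compare nodes by identity, since two distinct elements may look alike.
    selected = {id(node) for node in root.findall(query)}
    selects_all_required = all(id(node) in selected for node in required)
    selects_no_forbidden = all(id(node) not in selected for node in forbidden)
    return selects_all_required and selects_no_forbidden

# Illustrative document (an assumption, not taken from the paper).
DOC = (
    "<library>"
    "<book><title>A</title><author>X</author></book>"
    "<magazine><title>B</title></magazine>"
    "</library>"
)
root = ET.fromstring(DOC)
book_title = root.find("book/title")
magazine_title = root.find("magazine/title")

# First setting: only required nodes are annotated.
positive_only = is_consistent(root, ".//title", [book_title])

# Second setting: a forbidden node rules out the more general query.
general_query = is_consistent(root, ".//title", [book_title], [magazine_title])
specific_query = is_consistent(root, "./book/title", [book_title], [magazine_title])

print(positive_only, general_query, specific_query)
```

In this sketch, the general query `.//title` is consistent when only the required node is annotated, but marking the magazine title as forbidden excludes it, while the more specific `./book/title` remains consistent with both annotations.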
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Semigroups and Automata Theory
