Formal Properties of XML Grammars and Languages
Jean Berstel, Luc Boasson

TL;DR
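This paper studies the formal properties of XML-grammars. It shows that every XML-language has an essentially unique XML-grammar, gives two characterizations of the languages these grammars generate, and establishes decidability results, including a characterization of the XML-grammars that generate regular XML-languages.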
Contribution
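It gives two characterizations of the languages generated by XML-grammars (one set-theoretic, one based on a saturation property), proves that the XML-grammar of an XML-language is essentially unique, and shows that some properties undecidable for general context-free languages are decidable for XML-languages.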
Findings
Every XML-language has an essentially unique XML-grammar
Some properties that are undecidable for general context-free languages become decidable for XML-languages
The XML-grammars that generate regular XML-languages are characterized
Abstract
XML documents are described by a document type definition (DTD). An XML-grammar is a formal grammar that captures the syntactic features of a DTD. We investigate properties of this family of grammars. We show that every XML-language basically has a unique XML-grammar. We give two characterizations of languages generated by XML-grammars: one is set-theoretic, the other is by a kind of saturation property. We investigate decidability problems and prove that some properties that are undecidable for general context-free languages become decidable for XML-languages. We also characterize those XML-grammars that generate regular XML-languages.
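To make the idea of a grammar that "captures the syntactic features of a DTD" concrete, the following is a minimal sketch, not the authors' construction. It assumes the usual reading in which each element name `a` has one variable `X_a` and productions of the form `X_a -> a m /a`, where `m` ranges over a regular set of sequences of child variables. The toy content models, the single-letter element names, and the function `is_member` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy DTD. Each element name (a single letter, by assumption)
# maps to a regular expression over the names of its direct children.
CONTENT_MODELS = {
    "a": re.compile(r"(bc)*"),  # X_a -> a (X_b X_c)* /a
    "b": re.compile(r"c?"),     # X_b -> b (X_c)? /b
    "c": re.compile(r""),       # X_c -> c /c   (no children allowed)
}
ROOT = "a"  # the axiom is X_a


def is_member(tags, models=CONTENT_MODELS, root=ROOT):
    """Check whether a sequence of tags belongs to the toy XML-language.

    `tags` is a list of tokens such as "a" (opening tag) and "/a"
    (closing tag). The check uses a stack of open elements and, when an
    element closes, matches the sequence of its children against the
    element's content model.
    """
    stack = []          # frames of [element name, list of child names]
    finished_root = None
    for i, tok in enumerate(tags):
        if tok.startswith("/"):
            name = tok[1:]
            # The closing tag must match the most recently opened element.
            if not stack or stack[-1][0] != name:
                return False
            _, children = stack.pop()
            model = models.get(name)
            if model is None or not model.fullmatch("".join(children)):
                return False
            if stack:
                stack[-1][1].append(name)
            else:
                # The outermost element closed: it must be the last token.
                if i != len(tags) - 1:
                    return False
                finished_root = name
        else:
            stack.append([tok, []])
    return finished_root == root
```

For example, under these assumed content models `is_member(["a", "b", "/b", "c", "/c", "/a"])` accepts, while `is_member(["a", "b", "/b", "/a"])` rejects because the child sequence `b` does not match `(bc)*`. Because each element has exactly one content model, the grammar can be read directly off the accepted documents, which gives some intuition for the uniqueness result stated in the abstract.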
Taxonomy
Topics: Semigroups and automata theory · DNA and Biological Computing · Formal Methods in Verification
