Validating XML Documents in the Streaming Model with External Memory
Christian Konrad, Frederic Magniez

TL;DR
This paper presents a streaming algorithm with external memory that validates large XML documents against DTDs, circumventing a known space lower bound for streaming without external memory, and introduces streaming algorithms for computing and decoding the FCNS encoding of XML documents.
Contribution
It introduces a deterministic streaming algorithm with external memory for XML validation that uses O(log^2 N) space and multiple passes, and studies First-Child-Next-Sibling (FCNS) encoding and decoding in the streaming setting.
Findings
Deterministic streaming algorithm with O(log^2 N) space for XML validation using external memory.
Memory-efficient streaming algorithms for computing and decoding the FCNS encoding of an XML document.
Sublinear space algorithms for validating binary tree-encoded XML documents.
Abstract
We study the problem of validating XML documents of size N against general DTDs in the context of streaming algorithms. The starting point of this work is a well-known space lower bound. There are XML documents and DTDs for which p-pass streaming algorithms require Ω(N/p) space. We show that when allowing access to external memory, there is a deterministic streaming algorithm that solves this problem with memory space O(log^2 N), a constant number of auxiliary read/write streams, and O(log N) total number of passes on the XML document and auxiliary streams. An important intermediate step of this algorithm is the computation of the First-Child-Next-Sibling (FCNS) encoding of the initial XML document in a streaming fashion. We study this problem independently, and we also provide memory efficient streaming algorithms for decoding an XML document given in its FCNS…
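To make the FCNS encoding mentioned in the abstract concrete, here is a minimal in-memory sketch in Python. It is not the paper's streaming algorithm: it keeps a stack of open elements, so its memory grows with document depth, which is exactly what the streaming setting tries to avoid. The names `Node`, `fcns_from_tags`, and `to_string`, as well as the tag-list input format, are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    label: str
    first_child: Optional["Node"] = None   # left pointer in the binary tree
    next_sibling: Optional["Node"] = None  # right pointer in the binary tree


def fcns_from_tags(tags: List[str]) -> Optional[Node]:
    """Build the FCNS binary tree from tags such as ['<a>', '<b>', '</b>', '</a>'].

    Hypothetical helper: uses a stack of open elements (space proportional
    to depth), so it only illustrates the encoding, not a streaming method.
    """
    root: Optional[Node] = None
    stack: List[Node] = []              # currently open elements
    last_closed: Optional[Node] = None  # element closed just before this tag
    for tag in tags:
        if tag.startswith("</"):
            if not stack or stack[-1].label != tag[2:-1]:
                raise ValueError(f"unexpected closing tag {tag}")
            last_closed = stack.pop()
        else:
            node = Node(tag[1:-1])
            if last_closed is not None:
                # The previous element just closed, so this one is its sibling.
                last_closed.next_sibling = node
            elif stack:
                # Nothing closed since the parent opened: this is its first child.
                stack[-1].first_child = node
            else:
                root = node
            stack.append(node)
            last_closed = None
    if stack:
        raise ValueError("unclosed element")
    return root


def to_string(node: Optional[Node]) -> str:
    """Render the binary tree as label(first_child,next_sibling), '-' for empty."""
    if node is None:
        return "-"
    return f"{node.label}({to_string(node.first_child)},{to_string(node.next_sibling)})"


tags = ["<a>", "<b>", "</b>", "<c>", "</c>", "</a>"]
print(to_string(fcns_from_tags(tags)))  # a(b(-,c(-,-)),-)
```

In the output, `b` is the first child of `a`, and `c` hangs off `b` as its next sibling, so every node has at most two pointers regardless of how many children the original element had.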
Taxonomy
Topics: Complexity and Algorithms in Graphs · Algorithms and Data Compression · Advanced Data Storage Technologies
