Schema-Based Automata Determinization
Joachim Niehren (Inria, Université de Lille, France), Momar Sakho (Inria, Université de Lille, France), Antonio Al Serhali (Inria, Université de Lille, France)

TL;DR
This paper introduces a schema-based determinization algorithm for finite automata and hedge automata, improving efficiency and memory usage over standard methods, with practical application to XPath queries.
Contribution
The paper presents a novel schema-based determinization algorithm that integrates cleaning into automata determinization, enhancing efficiency and scalability.
Findings
The new algorithm is more efficient than standard determinization plus cleaning.
It produces a small deterministic automaton for an example XPath query.
On that example, standard determinization yields a huge stepwise hedge automaton for which schema-based cleaning runs out of memory.
Abstract
We propose an algorithm for schema-based determinization of finite automata on words and of stepwise hedge automata on nested words. The idea is to integrate schema-based cleaning directly into automata determinization. We prove the correctness of our new algorithm and show that it is always more efficient than standard determinization followed by schema-based cleaning. Our implementation yields a small deterministic automaton for an example XPath query, where standard determinization produces a huge stepwise hedge automaton for which schema-based cleaning runs out of memory.
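To make the idea concrete for finite automata on words, here is a minimal Python sketch. It is not the authors' implementation. It assumes an NFA without epsilon transitions and a schema given as a partial deterministic automaton, and it only covers words, not stepwise hedge automata. The function name `schema_based_determinize`, the dictionary encodings, and the example automata are all hypothetical. The sketch runs the subset construction in lockstep with the schema, so subsets that no schema-valid word can reach are never built.

```python
from collections import deque


def schema_based_determinize(nfa_init, nfa_delta, nfa_final, schema_init, schema_delta):
    """Subset construction restricted to subsets reachable on schema-valid words.

    nfa_delta:    dict (state, letter) -> set of successor states (no epsilon moves)
    schema_delta: dict (schema_state, letter) -> schema_state (partial DFA, assumed)
    Returns (det_states, det_delta, det_final); empty subsets are not created,
    so the result is a partial deterministic automaton.
    """
    start = (frozenset(nfa_init), schema_init)
    seen = {start}
    queue = deque([start])
    det_delta = {}
    while queue:
        subset, s = queue.popleft()
        # Only follow letters that the schema allows in its current state.
        for (s_from, letter), s_to in schema_delta.items():
            if s_from != s:
                continue
            target = frozenset(
                q2 for q in subset for q2 in nfa_delta.get((q, letter), ())
            )
            if not target:
                continue
            det_delta[(subset, letter)] = target
            pair = (target, s_to)
            if pair not in seen:
                seen.add(pair)
                queue.append(pair)
    det_states = {subset for subset, _ in seen}
    det_final = {subset for subset in det_states if subset & nfa_final}
    return det_states, det_delta, det_final


# Hypothetical example: NFA for "the third letter from the end is 'a'".
nfa_delta = {
    (0, "a"): {0, 1}, (0, "b"): {0},
    (1, "a"): {2}, (1, "b"): {2},
    (2, "a"): {3}, (2, "b"): {3},
}
# Schema allowing every word over {a, b}: this is plain determinization.
universal_schema = {("s", "a"): "s", ("s", "b"): "s"}
# Schema allowing only words in a*.
only_a_schema = {("s", "a"): "s"}

full_states, _, _ = schema_based_determinize({0}, nfa_delta, {3}, "s", universal_schema)
small_states, _, small_final = schema_based_determinize({0}, nfa_delta, {3}, "s", only_a_schema)
print(len(full_states), len(small_states))
```

In this toy example the universal schema leads to 8 subset states, while the schema restricted to a* leads to 4, which illustrates why pruning during determinization can avoid building a large intermediate automaton.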
Taxonomy
Topics: Network Packet Processing and Optimization · semigroups and automata theory · DNA and Biological Computing
