Non-redundant random generation algorithms for weighted context-free languages
Andy Lorenz, Yann Ponty (LIX, INRIA Saclay - Ile de France)

TL;DR
This paper introduces algorithms for the non-redundant random generation of words in weighted context-free languages. After a precomputation, they generate k distinct words of length n in O(k·n·log n) arithmetic operations, whereas a rejection-based approach can take time exponential in k.
Contribution
The paper presents two algorithms for non-redundant generation, one based on the recursive method and one on an unranking approach. Both avoid the exponential worst case of the rejection-based approach.
Findings
The algorithms generate k distinct words of length n in O(k·n·log n) arithmetic operations
A precomputation of Θ(n) numbers is required before generation
The rejection-based approach has a worst-case time complexity that grows exponentially with k for some specifications
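To give a feel for the cost of rejection in the simplest setting, the sketch below computes the expected number of draws needed to collect k distinct words when each draw is uniform over a finite set of words (the classical coupon collector). The function name `expected_draws` and its parameters are illustrative assumptions, not notation from the paper, and the uniform case shown here is only the limit case; it does not reproduce the exponential bound for weighted specifications.

```python
from fractions import Fraction


def expected_draws(num_words: int, k: int) -> Fraction:
    """Expected number of uniform draws (with replacement) from a set of
    num_words words before k distinct words have been seen.

    Once i distinct words are known, a draw is new with probability
    (num_words - i) / num_words, so the wait for the next new word is
    num_words / (num_words - i) draws on average.
    """
    if not 0 <= k <= num_words:
        raise ValueError("k must lie between 0 and num_words")
    return sum((Fraction(num_words, num_words - i) for i in range(k)), Fraction(0))
```

Under these assumptions, the last few words dominate the cost: collecting all words of a set takes about num_words · ln(num_words) draws, while a non-redundant generator needs exactly k draws.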
Abstract
We address the non-redundant random generation of k words of length n in a context-free language. Additionally, we want to avoid a predefined set of words. We study a rejection-based approach, whose worst-case time complexity is shown to grow exponentially with k for some specifications and in the limit case of a coupon collector. We propose two algorithms respectively based on the recursive method and on an unranking approach. We show how careful implementations of these algorithms allow for a non-redundant generation of k words of length n in O(k·n·log n) arithmetic operations, after a precomputation of Θ(n) numbers. The overall complexity is therefore dominated by the generation of k words, and the non-redundancy comes at a negligible cost.
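The idea of generating a word while avoiding a set of forbidden words can be illustrated on a toy language. The sketch below is a minimal, unoptimized illustration and not the authors' implementation: it assumes the uniform distribution on balanced-parentheses words of length n, counts completions of each prefix in the style of the recursive method, and subtracts the forbidden words sharing that prefix before choosing the next letter. All function names are hypothetical, and this naive version rescans the forbidden set at every position, so it does not reach the O(k·n·log n) bound stated in the abstract.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def completions(height: int, remaining: int) -> int:
    """Number of ways to finish a balanced word with `height` unmatched
    opening parentheses and `remaining` letters still to write."""
    if height < 0 or height > remaining:
        return 0
    if remaining == 0:
        return 1
    return completions(height + 1, remaining - 1) + completions(height - 1, remaining - 1)


def is_balanced(word: str, n: int) -> bool:
    """True if `word` is a balanced-parentheses word of length n."""
    height = 0
    for c in word:
        if c not in "()":
            return False
        height += 1 if c == "(" else -1
        if height < 0:
            return False
    return len(word) == n and height == 0


def sample_avoiding(n: int, forbidden, rng: random.Random) -> str:
    """Draw one balanced word of length n, uniformly among the words
    that are not in `forbidden`."""
    # Only distinct, valid words of the language can block anything.
    live = [w for w in set(forbidden) if is_balanced(w, n)]
    prefix, height = "", 0
    for pos in range(n):
        remaining = n - pos - 1
        options = []
        for letter, delta in (("(", 1), (")", -1)):
            blocked = sum(1 for w in live if w[pos] == letter)
            free = completions(height + delta, remaining) - blocked
            options.append((letter, delta, free))
        total = sum(free for _, _, free in options)
        if total == 0:
            raise ValueError("every word of this length is forbidden")
        r = rng.randrange(total)
        for letter, delta, free in options:
            if r < free:
                break
            r -= free
        prefix += letter
        height += delta
        # Keep only forbidden words that still match the chosen prefix.
        live = [w for w in live if w[pos] == letter]
    return prefix


def generate_distinct(n: int, k: int, seed: int = 0) -> list:
    """Generate k pairwise distinct balanced words of length n."""
    rng = random.Random(seed)
    seen, words = set(), []
    for _ in range(k):
        word = sample_avoiding(n, seen, rng)
        seen.add(word)
        words.append(word)
    return words
```

Each call to `sample_avoiding` returns a new word with certainty, so k words cost exactly k generations and no draw is ever rejected.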
Taxonomy
Topics: Algorithms and Data Compression · semigroups and automata theory · DNA and Biological Computing
