Non-redundant random generation algorithms for weighted context-free languages

Andy Lorenz; Yann Ponty (LIX; INRIA Saclay - Ile de France)

arXiv:1211.0303 · cs.FL · November 5, 2012

TL;DR

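This paper introduces algorithms for the non-redundant random generation of k words of length n in a weighted context-free language. After a precomputation of Θ(n) numbers, generation takes O(k·n·log n) arithmetic operations, improving on a rejection-based approach whose worst-case cost can grow exponentially with k.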

Contribution

It presents two algorithms for non-redundant generation, one based on the recursive method and one on an unranking approach, and shows that a rejection-based approach has a worst-case time complexity that grows exponentially with k for some specifications.

Findings

01

Both algorithms generate k distinct words of length n in O(k·n·log n) arithmetic operations

02

Generation follows a precomputation of Θ(n) numbers, so the k draws dominate the overall cost

03

The rejection-based approach has a worst-case time complexity that grows exponentially with k for some specifications
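
A back-of-the-envelope calculation for the uniform case, not taken from the paper, gives some intuition for why rejection gets costly as the requested sample approaches the whole language. The symbols $N$ (number of admissible words of length $n$), $T_k$ (number of draws until $k$ distinct words have been seen) and $H_m$ (the $m$-th harmonic number, with $H_0 = 0$) are introduced here for illustration only. Once $i$ distinct words are known, a fresh draw is new with probability $(N-i)/N$, so

$$
\mathbb{E}[T_k] \;=\; \sum_{i=0}^{k-1} \frac{N}{N-i} \;=\; N\,\bigl(H_N - H_{N-k}\bigr).
$$

With $k = N$ this is the classical coupon collector value $N H_N \approx N \ln N$. This covers uniform weights only; under non-uniform weights rare words can take far longer to appear, which is the regime the exponential worst case refers to.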

Abstract

We address the non-redundant random generation of $k$ words of length $n$ in a context-free language. Additionally, we want to avoid a predefined set of words. We study a rejection-based approach, whose worst-case time complexity is shown to grow exponentially with $k$ for some specifications and in the limit case of a coupon collector. We propose two algorithms respectively based on the recursive method and on an unranking approach. We show how careful implementations of these algorithms allow for a non-redundant generation of $k$ words of length $n$ in $O(k \cdot n \cdot \log n)$ arithmetic operations, after a precomputation of $\Theta(n)$ numbers. The overall complexity is therefore dominated by the generation of $k$ words, and the non-redundancy comes at a negligible cost.
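
To make the setting concrete, here is a minimal Python sketch of the rejection-based baseline the abstract describes, not of the paper's two algorithms. It assumes a toy unweighted grammar, S -> ( S ) S | epsilon (balanced parentheses), chosen only for illustration. The function names `count`, `draw` and `draw_distinct` are hypothetical. Words are drawn uniformly with the recursive method (count first, then choose each derivation step in proportion to the counts), and any word already seen or listed in `forbidden` is rejected and redrawn.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def count(n):
    """Number of words of length n generated by S -> ( S ) S | epsilon."""
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    # i is the length of the inner S; the trailing S gets the remaining n - 2 - i.
    return sum(count(i) * count(n - 2 - i) for i in range(0, n - 1, 2))


def draw(n, rng):
    """Uniform random word of length n, using the counts to pick each split."""
    if n == 0:
        return ""
    total = count(n)
    if total == 0:
        raise ValueError("no word of this length")
    r = rng.randrange(total)
    for i in range(0, n - 1, 2):
        block = count(i) * count(n - 2 - i)
        if r < block:
            return "(" + draw(i, rng) + ")" + draw(n - 2 - i, rng)
        r -= block
    raise AssertionError("unreachable: r is always below count(n)")


def draw_distinct(n, k, rng, forbidden=frozenset()):
    """k distinct words of length n by rejection, avoiding `forbidden`.

    Assumes every forbidden word is a valid word of length n; otherwise the
    availability check below is merely conservative.
    """
    if k > count(n) - len(forbidden):
        raise ValueError("not enough admissible words")
    seen = set(forbidden)
    out = []
    while len(out) < k:
        w = draw(n, rng)
        if w not in seen:  # rejection step: redraw on any repeat
            seen.add(w)
            out.append(w)
    return out


if __name__ == "__main__":
    print(draw_distinct(6, 3, random.Random(2012)))
```

The `while` loop is where the cost concentrates: as `seen` covers more of the language (or more of the probability mass, in a weighted setting), most draws are wasted. The algorithms in the paper are designed to avoid those wasted draws.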

Taxonomy

Topics: Algorithms and Data Compression · Semigroups and Automata Theory · DNA and Biological Computing