Stochastic Context-Free Grammars, Regular Languages, and Newton's Method
Kousha Etessami, Alistair Stewart, and Mihalis Yannakakis

TL;DR
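This paper presents a polynomial-time algorithm for approximating, to within any desired precision, the probability that a stochastic context-free grammar (SCFG) generates a string in a given regular language, under a mild assumption on the grammar. The problem has applications in statistical natural language processing and is a step towards quantitative model checking of stochastic context-free processes.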
Contribution
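It introduces a polynomial-time method (in the standard Turing model of computation) for approximating the probability that an SCFG generates a string in a regular language given by a DFA. The method relies on a mild assumption about the SCFG, which holds for grammars whose rule probabilities are learned via the inside-outside (EM) algorithm, and on no extra assumption about the DFA.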
Findings
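The probability can be approximated to within any desired precision in polynomial time, under a mild assumption on the SCFG and no extra assumption on the DFA.
The assumption holds for SCFGs whose rule probabilities are learned via the inside-outside (EM) algorithm.
The result supports efficient probabilistic analysis in statistical NLP and is a necessary step towards quantitative ω-regular model checking of stochastic context-free processes.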
Abstract
We study the problem of computing the probability that a given stochastic context-free grammar (SCFG), G, generates a string in a given regular language L(D) (given by a DFA, D). This basic problem has a number of applications in statistical natural language processing, and it is also a key necessary step towards quantitative ω-regular model checking of stochastic context-free processes (equivalently, 1-exit recursive Markov chains, or stateless probabilistic pushdown processes). We show that the probability that G generates a string in L(D) can be computed to within arbitrary desired precision in polynomial time (in the standard Turing model of computation), under a rather mild assumption about the SCFG, G, and with no extra assumption about D. We show that this assumption is satisfied for SCFGs whose rule probabilities are learned via the well-known inside-outside (EM)…
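To make the problem statement concrete, here is a minimal sketch of the quantity being computed. It is not the paper's algorithm: it uses plain fixed-point (Kleene) iteration from zero rather than Newton's method, and it carries none of the polynomial-time guarantees claimed in the abstract. It assumes the grammar is in Chomsky normal form and the DFA is complete; the function name, argument layout, and toy grammar are all hypothetical. The unknown x[(A, s, t)] stands for the probability that nonterminal A derives a terminal string that drives the DFA from state s to state t.

```python
def scfg_dfa_probability(binary_rules, terminal_rules, start,
                         dfa_delta, dfa_states, dfa_start, dfa_accepting,
                         tol=1e-14, max_iters=100_000):
    """Approximate P(G generates a string in L(D)) for a CNF grammar G.

    binary_rules:   {A: [(prob, B, C), ...]} for rules A -> B C
    terminal_rules: {A: [(prob, a), ...]}    for rules A -> a
    dfa_delta:      {(state, symbol): next_state}, assumed complete
    """
    # Collect every nonterminal that appears on either side of a rule.
    nonterminals = set(binary_rules) | set(terminal_rules)
    for rules in binary_rules.values():
        for _, B, C in rules:
            nonterminals.update((B, C))

    # Start from the all-zero vector; the iteration is monotone and
    # converges to the least fixed point of the equation system.
    x = {(A, s, t): 0.0
         for A in nonterminals for s in dfa_states for t in dfa_states}

    for _ in range(max_iters):
        new_x = {}
        for (A, s, t) in x:
            total = 0.0
            # A -> a contributes when reading a moves the DFA from s to t.
            for p, a in terminal_rules.get(A, []):
                if dfa_delta[(s, a)] == t:
                    total += p
            # A -> B C: B moves the DFA from s to some u, then C from u to t.
            for p, B, C in binary_rules.get(A, []):
                total += p * sum(x[(B, s, u)] * x[(C, u, t)]
                                 for u in dfa_states)
            new_x[(A, s, t)] = total
        change = max(abs(new_x[k] - x[k]) for k in x)
        x = new_x
        if change < tol:
            break

    return sum(x[(start, dfa_start, t)] for t in dfa_accepting)


# Toy grammar (hypothetical): S -> S S with prob 0.3, S -> a with prob 0.7.
binary_rules = {"S": [(0.3, "S", "S")]}
terminal_rules = {"S": [(0.7, "a")]}

# Toy DFA over {a}: state 0 = even length so far, state 1 = odd length.
dfa_states = [0, 1]
dfa_delta = {(0, "a"): 1, (1, "a"): 0}

# Probability that the generated string has even length.
prob_even = scfg_dfa_probability(
    binary_rules, terminal_rules, "S", dfa_delta, dfa_states, 0, {0})
print(prob_even)
```

For this toy grammar the answer can be checked by hand: with e the probability of an even-length string and the grammar terminating with probability 1, e satisfies e = 0.3(e^2 + (1 - e)^2), whose root in [0, 1] is (1.6 - sqrt(1.84)) / 1.2. Plain iteration of this kind can converge very slowly on less well-behaved grammars, which is one reason Newton-based methods are of interest.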
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Machine Learning and Algorithms
