Prefix Probabilities from Stochastic Tree Adjoining Grammars

Mark-Jan Nederhof (DFKI), Anoop Sarkar (UPenn), Giorgio Satta (UPadova)

arXiv:cs/9809026·cs.CL·May 23, 2007

Open Access

TL;DR

This paper presents an O(n^6)-time algorithm for computing prefix probabilities from stochastic Tree Adjoining Grammars (TAGs), which allows these grammars to serve as language models for speech recognition.

Contribution

It introduces a novel O(n^6) algorithm for prefix probability calculation from stochastic TAGs, bridging structural grammar models and probabilistic language modeling.

Findings

01

Algorithm computes prefix probabilities in O(n^6) time.

02

Enables stochastic TAGs to be used for language modeling.

03

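Precomputes the probabilities of subderivations that derive no words of the prefix but contribute structurally to its derivation, which is what makes the computation terminate.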

Abstract

Language models for speech recognition typically use a probability model of the form Pr(a_n | a_1, a_2, ..., a_{n-1}). Stochastic grammars, on the other hand, are typically used to assign structure to utterances. A language model of the above form is constructed from such grammars by computing the prefix probability Sum_{w in Sigma*} Pr(a_1 ... a_n w), where w ranges over all possible completions of the prefix a_1 ... a_n. The main result in this paper is an algorithm to compute such prefix probabilities given a stochastic Tree Adjoining Grammar (TAG). The algorithm achieves the required computation in O(n^6) time. The probabilities of subderivations that do not derive any words in the prefix, but contribute structurally to its derivation, are precomputed to achieve termination. This algorithm enables existing corpus-based estimation techniques for stochastic TAGs to be used for language…
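The link between the two quantities in the abstract can be made concrete: the conditional probability Pr(a_n | a_1, ..., a_{n-1}) is the prefix probability of a_1 ... a_n divided by the prefix probability of a_1 ... a_{n-1}. The sketch below illustrates only this ratio. It assumes a hypothetical, finite toy distribution over complete sentences and sums by brute force; it is not the paper's O(n^6) TAG algorithm, and the names `TOY_LANGUAGE`, `prefix_probability`, and `next_word_probability` are invented for illustration.

```python
from fractions import Fraction

# Hypothetical toy distribution over complete sentences (tuples of words).
# The sentences and probabilities are illustrative only, not from the paper.
TOY_LANGUAGE = {
    ("the", "dog", "barks"): Fraction(1, 2),
    ("the", "dog", "sleeps"): Fraction(1, 4),
    ("the", "cat", "sleeps"): Fraction(1, 8),
    ("a", "cat", "sleeps"): Fraction(1, 8),
}


def prefix_probability(prefix):
    """Sum Pr(prefix + w) over every completion w, by brute force."""
    prefix = tuple(prefix)
    n = len(prefix)
    return sum(p for sentence, p in TOY_LANGUAGE.items() if sentence[:n] == prefix)


def next_word_probability(history, word):
    """Pr(word | history), computed as a ratio of two prefix probabilities."""
    denominator = prefix_probability(history)
    if denominator == 0:
        raise ValueError("history has zero prefix probability")
    return prefix_probability(tuple(history) + (word,)) / denominator


if __name__ == "__main__":
    print(prefix_probability(("the",)))            # 7/8
    print(next_word_probability(("the",), "dog"))  # 6/7
```

For a real stochastic grammar the set of completions w is infinite, so the brute-force sum above is not available; that is the gap a dedicated prefix-probability algorithm fills.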

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Topic Modeling