Probabilistic top-down parsing and language modeling

Brian Roark

arXiv:cs/0105016 · cs.CL · May 23, 2007

TL;DR

This paper introduces a probabilistic top-down parser and applies it to language modeling for speech recognition. The resulting language model achieves lower test corpus perplexity than previous work, and interpolating it with a trigram model yields further gains.

Contribution

It presents a lexicalized probabilistic top-down parser that performs very well relative to the best broad-coverage statistical parsers, in both parse accuracy and efficiency, and it outlines a new language model built on that parser that improves on previous work in test corpus perplexity.

Findings

01. Parser achieves high accuracy and efficiency.

02. Language model reduces perplexity compared to previous models.

03. Interpolation with trigram models yields substantial improvements.

Abstract

This paper describes the functioning of a broad-coverage probabilistic top-down parser, and its application to the problem of language modeling for speech recognition. The paper first introduces key notions in language modeling and probabilistic parsing, and briefly reviews some previous approaches to using syntactic structure for language modeling. A lexicalized probabilistic top-down parser is then presented, which performs very well, in terms of both the accuracy of returned parses and the efficiency with which they are found, relative to the best broad-coverage statistical parsers. A new language model which utilizes probabilistic top-down parsing is then outlined, and empirical results show that it improves upon previous work in test corpus perplexity. Interpolation with a trigram model yields an exceptional improvement relative to the improvement observed by other models,…
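The two quantities the abstract relies on, test corpus perplexity and interpolation with a trigram model, can be illustrated with a minimal sketch. It assumes simple linear interpolation with a single fixed weight `lam`, which the abstract does not specify, and the function names and per-word probabilities are hypothetical toy values, not results from the paper.

```python
import math


def interpolate(p_parser, p_trigram, lam=0.5):
    """Linearly mix two per-word probability sequences.

    `lam` is an assumed mixing weight on the parser model; the abstract
    does not state how the interpolation is weighted.
    """
    if len(p_parser) != len(p_trigram):
        raise ValueError("both models must score the same words")
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_parser, p_trigram)]


def perplexity(word_probs):
    """Perplexity: exp of the negative mean log probability per word."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


# Hypothetical conditional probabilities that each model assigns to the
# same four test words (invented numbers, for illustration only).
parser_probs = [0.10, 0.02, 0.30, 0.05]
trigram_probs = [0.05, 0.08, 0.10, 0.20]

mixed_probs = interpolate(parser_probs, trigram_probs, lam=0.5)

for name, probs in [("parser", parser_probs),
                    ("trigram", trigram_probs),
                    ("interpolated", mixed_probs)]:
    print(f"{name:>12}: perplexity = {perplexity(probs):.2f}")
```

With these toy numbers the interpolated model has lower perplexity than either component, because each model is confident on words where the other is not. That outcome is not guaranteed in general, and in practice the weight would be tuned on held-out data.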

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies