Combining semantic and syntactic structure for language modeling

Rens Bod

arXiv:cs/0110051 · cs.CL · May 23, 2007 · 3 cites

TL;DR

This paper shows that dependencies between non-headwords, which earlier structured language models ignore, contribute to a significantly improved word error rate in speech recognition, and that a data-oriented parsing (DOP) model trained on semantically and syntactically annotated data can exploit them.

Contribution

The paper presents the first DOP model trained by a maximum likelihood reestimation procedure, which solves some theoretical shortcomings of previous DOP models, and shows that a DOP model can exploit non-headword dependencies for language modeling.

Findings

01 Non-headword dependencies contribute to a significantly improved word error rate

02 The first DOP model trained by maximum likelihood reestimation

03 Structured language models benefit from semantic and syntactic annotation

Abstract

Structured language models for speech recognition have been shown to remedy the weaknesses of n-gram models. All current structured language models are, however, limited in that they do not take into account dependencies between non-headwords. We show that non-headword dependencies contribute to significantly improved word error rate, and that a data-oriented parsing model trained on semantically and syntactically annotated data can exploit these dependencies. This paper also contains the first DOP model trained by means of a maximum likelihood reestimation procedure, which solves some of the theoretical shortcomings of previous DOP models.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems