Learning Efficient Disambiguation

Khalil Sima'an

arXiv:cs/9906006·cs.CL·May 23, 2007·34 cites

Open Access

TL;DR

This dissertation introduces a framework for specializing performance models such as Data Oriented Parsing (DOP) to limited domains, aiming to improve efficiency and overcome the models' computational limitations by restricting them to those domains and minimizing model entropy.

Contribution

It proposes the Ambiguity-Reduction Specialization (ARS) framework and algorithms for specializing DOP models, enhancing efficiency and integrating specialized models with original ones.

Findings

01 Specialized DOP models outperform original models in experiments

02 The algorithms effectively limit hypothesis space to 'safe' models

03 Specialization reduces model entropy and improves processing speed

Abstract

This dissertation analyses the computational properties of current performance-models of natural language parsing, in particular Data Oriented Parsing (DOP), points out some of their major shortcomings and suggests suitable solutions. It provides proofs that various problems of probabilistic disambiguation are NP-Complete under instances of these performance-models, and it argues that none of these models accounts for attractive efficiency properties of human language processing in limited domains, e.g. that frequent inputs are usually processed faster than infrequent ones. The central hypothesis of this dissertation is that these shortcomings can be eliminated by specializing the performance-models to the limited domains. The dissertation addresses "grammar and model specialization" and presents a new framework, the Ambiguity-Reduction Specialization (ARS) framework, that formulates…

Taxonomy

Topics: Natural Language Processing Techniques · AI-based Problem Solving and Planning · Topic Modeling