Learning Unification-Based Natural Language Grammars
Miles Osborne (Dept. of Computer Science, University of York, York, England)

TL;DR
This paper presents a system that combines data-driven and model-based learning to improve unification-based grammars for natural language parsing, reducing both undergeneration and overgeneration.
Contribution
It introduces a hybrid learning approach that leverages both data-driven and model-based methods to enhance grammar adequacy for practical parsing tasks.
Findings
Hybrid approach reduces undergeneration in grammars
System improves parse plausibility and correctness
Empirical results confirm the effectiveness of the hybrid approach
Abstract
When parsing unrestricted language, wide-covering grammars often undergenerate. Undergeneration can be tackled either by sentence correction, or by grammar correction. This thesis concentrates upon automatic grammar correction (or machine learning of grammar) as a solution to the problem of undergeneration. Broadly speaking, grammar correction approaches can be classified as being either data-driven, or model-based. Data-driven learners use data-intensive methods to acquire grammar. They typically use grammar formalisms unsuited to the needs of practical text processing and cannot guarantee that the resulting grammar is adequate for subsequent semantic interpretation. That is, data-driven learners acquire grammars that generate strings that humans would judge to be grammatically ill-formed (they overgenerate) and fail to assign linguistically plausible parses.…
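To make the two failure modes in the abstract concrete, here is a minimal sketch in Python. It is not the system described in the thesis: the toy lexicon, the two grammars (`NARROW`, `REPAIRED`) and the `recognizes` function are all hypothetical, and the grammars are plain context-free rules in Chomsky normal form rather than unification-based ones. The narrow grammar undergenerates because it rejects a well-formed transitive sentence. The repaired grammar covers that sentence but, like the narrow one, carries no agreement features, so it overgenerates by accepting "the dogs barks".

```python
from itertools import product

# Hypothetical toy lexicon: word -> set of lexical categories.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"}, "dogs": {"N"}, "cat": {"N"},
    "barks": {"V"},    # intransitive verb
    "sees": {"VT"},    # transitive verb
}

# Binary rules in Chomsky normal form: (left, right) -> set of parents.
# NARROW has no rule for transitive verbs, so it undergenerates.
NARROW = {
    ("Det", "N"): {"NP"},
    ("NP", "V"): {"S"},
}

# REPAIRED adds the missing transitive rules. Neither grammar encodes
# number agreement, so both overgenerate on "the dogs barks".
REPAIRED = {
    **NARROW,
    ("VT", "NP"): {"VP"},
    ("NP", "VP"): {"S"},
}


def recognizes(grammar, lexicon, words, start="S"):
    """Return True if the CNF grammar derives `words` from `start` (CYK)."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i..j] inclusive.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        chart[i][i] = set(lexicon.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for left, right in product(chart[i][k], chart[k + 1][j]):
                    chart[i][j] |= grammar.get((left, right), set())
    return start in chart[0][n - 1]


if __name__ == "__main__":
    for name, grammar in (("NARROW", NARROW), ("REPAIRED", REPAIRED)):
        for sentence in ("the dog barks",
                         "the dog sees the cat",
                         "the dogs barks"):
            verdict = recognizes(grammar, LEXICON, sentence.split())
            print(f"{name:8} {sentence!r}: {verdict}")
```

In a unification-based formalism, the overgeneration shown here would typically be blocked by attaching agreement features (for example, number) to the categories and requiring them to unify. The sketch leaves that out to keep the contrast between the two failure modes visible.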
Taxonomy
Topics: Natural Language Processing Techniques · Machine Learning and Algorithms · Algorithms and Data Compression
