Three Generative, Lexicalised Models for Statistical Parsing
Michael Collins (University of Pennsylvania)

TL;DR
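This paper proposes a generative model of lexicalised context-free grammar for statistical parsing, then extends it with a probabilistic treatment of subcategorisation and wh-movement. On Wall Street Journal text the parser reaches 88.1% constituent precision and 87.5% constituent recall.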
Contribution
The paper presents a generative model of lexicalised context-free grammar and extends it to treat subcategorisation and wh-movement probabilistically, improving parsing accuracy over (Collins 96).
Findings
Achieves 88.1% constituent precision and 87.5% constituent recall on Wall Street Journal text
Improves on (Collins 96) by an average of 2.3%
Incorporates probabilistic treatment of subcategorisation and wh-movement
Abstract
In this paper we first propose a new statistical parsing model, which is a generative model of lexicalised context-free grammar. We then extend the model to include a probabilistic treatment of both subcategorisation and wh-movement. Results on Wall Street Journal text show that the parser performs at 88.1/87.5% constituent precision/recall, an average improvement of 2.3% over (Collins 96).
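The constituent precision/recall figures quoted above can be illustrated with a minimal sketch. It assumes each constituent is represented as a (label, start, end) tuple and that a predicted constituent counts as correct when an identical tuple appears in the gold parse. The function name, the tuple representation, and the toy brackets are illustrative assumptions; the paper's exact scoring conventions (for example, how punctuation is handled) are not stated here and are not reproduced.

```python
from collections import Counter


def constituent_precision_recall(gold, predicted):
    """Labelled constituent precision and recall for one sentence.

    gold, predicted: lists of (label, start, end) tuples (an assumed,
    simplified representation of the brackets in a parse tree).
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: each gold bracket can be matched at most once.
    matched = sum((gold_counts & pred_counts).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


# Toy example (hypothetical brackets, not data from the paper).
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5), ("NP", 4, 5)]

precision, recall = constituent_precision_recall(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the five predicted brackets match the gold parse, and three of the four gold brackets are recovered. A corpus-level score would typically sum matched, predicted, and gold counts over all sentences before dividing.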
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
