Apportioning Development Effort in a Probabilistic LR Parsing System through Evaluation
John Carroll (University of Sussex), Ted Briscoe (University of Cambridge)

TL;DR
This paper presents a probabilistic LR parsing system for English and evaluates it along several dimensions to assess how much each component contributes to overall performance, so that further development effort can be prioritised. The system currently parses around 80% of sentences in a multi-genre corpus of general text.
Contribution
It introduces a method to apportion development effort in a probabilistic parsing system through detailed evaluation of individual components.
Findings
Parses around 80% of sentences in a substantial corpus of general text covering several genres
Mean crossing bracket rate of 0.71 on a random sample of 250 sentences
Recall of 83% and precision of 84% against manually disambiguated analyses
Abstract
We describe an implemented system for robust domain-independent syntactic parsing of English, using a unification-based grammar of part-of-speech and punctuation labels coupled with a probabilistic LR parser. We present evaluations of the system's performance along several different dimensions; these enable us to assess the contribution that each individual part is making to the success of the system as a whole, and thus prioritise the effort to be devoted to its further enhancement. Currently, the system is able to parse around 80% of sentences in a substantial corpus of general text containing a number of distinct genres. On a random sample of 250 such sentences the system has a mean crossing bracket rate of 0.71 and recall and precision of 83% and 84% respectively when evaluated against manually-disambiguated analyses.
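The bracketing measures quoted in the abstract (crossing brackets, recall, precision) can be illustrated with a minimal sketch. It assumes unlabelled constituents represented as half-open `(start, end)` word spans and exact-match scoring; the function names `crosses`, `bracket_scores`, and `mean_crossing_rate` are hypothetical, and the paper's own scoring scheme may differ in detail (for example in how unary or punctuation brackets are treated).

```python
from typing import Iterable, Set, Tuple

Span = Tuple[int, int]  # half-open (start, end) word offsets; an assumed representation


def crosses(a: Span, b: Span) -> bool:
    """True if the two spans overlap without either containing the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


def bracket_scores(test: Set[Span], gold: Set[Span]) -> Tuple[float, float, int]:
    """Return (precision, recall, crossing count) for one sentence."""
    matched = len(test & gold)
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0
    # A test bracket counts once if it crosses any gold bracket.
    crossing = sum(1 for t in test if any(crosses(t, g) for g in gold))
    return precision, recall, crossing


def mean_crossing_rate(pairs: Iterable[Tuple[Set[Span], Set[Span]]]) -> float:
    """Average number of crossing brackets per sentence over (test, gold) pairs."""
    counts = [bracket_scores(t, g)[2] for t, g in pairs]
    return sum(counts) / len(counts) if counts else 0.0


# Toy five-word sentence: the parser brackets words 2-4 where the gold has 3-5.
gold = {(0, 5), (0, 2), (2, 5), (3, 5)}
test = {(0, 5), (0, 2), (2, 5), (2, 4)}
precision, recall, crossing = bracket_scores(test, gold)
```

In this toy case three of the four brackets match on each side, and the span `(2, 4)` crosses the gold span `(3, 5)`, so the sentence contributes one crossing bracket to the mean.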
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
