Robust stochastic parsing using the inside-outside algorithm
Briscoe, Ted, Waegner, Nick (University of Cambridge)

TL;DR
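This paper presents a probabilistic parser for sequences of English part-of-speech (POS) labels, trained with the inside-outside algorithm. A linguist-defined metagrammar is extended with automatically generated rules that satisfy its constraints, and rules with very low probability are rejected during training, yielding a wide-coverage parser that can rank alternative analyses.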
Contribution
It combines a linguist-defined metagrammar with automatic rule generation and probability-based rule rejection for robust stochastic parsing.
Findings
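Training yields a wide-coverage parser for sequences of English POS labels.
Automatically generated rules, constrained by the metagrammar, extend the linguist-defined initial grammar.
Rejecting rules with very low probability during training leaves a grammar that can rank alternative analyses.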
Abstract
The paper describes a parser of sequences of (English) part-of-speech labels which utilises a probabilistic grammar trained using the inside-outside algorithm. The initial (meta)grammar is defined by a linguist and further rules compatible with metagrammatical constraints are automatically generated. During training, rules with very low probability are rejected, yielding a wide-coverage parser capable of ranking alternative analyses. A series of corpus-based experiments describe the parser's performance.
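Illustration
The training loop summarised in the abstract can be sketched as one inside-outside re-estimation step followed by rejection of low-probability rules. This is a minimal sketch and not the authors' implementation: it assumes a grammar in Chomsky normal form whose terminals are POS labels, and the function names (inside, outside, reestimate), the threshold value, and the toy grammar are all hypothetical choices made for illustration.

```python
from collections import defaultdict

def inside(tags, binary, lexical):
    """beta[i, j, A]: probability that nonterminal A derives tags[i:j]."""
    n = len(tags)
    beta = defaultdict(float)
    for i, t in enumerate(tags):
        for (a, w), p in lexical.items():
            if w == t:
                beta[i, i + 1, a] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for (a, b, c), p in binary.items():
                for k in range(i + 1, j):
                    beta[i, j, a] += p * beta[i, k, b] * beta[k, j, c]
    return beta

def outside(tags, binary, beta, root="S"):
    """alpha[i, j, A]: probability of everything outside A spanning tags[i:j]."""
    n = len(tags)
    alpha = defaultdict(float)
    alpha[0, n, root] = 1.0
    # Widest spans first: a parent always covers a wider span than its children.
    for span in range(n, 1, -1):
        for i in range(n - span + 1):
            j = i + span
            for (a, b, c), p in binary.items():
                out = alpha[i, j, a]
                if out == 0.0:
                    continue
                for k in range(i + 1, j):
                    alpha[i, k, b] += out * p * beta[k, j, c]
                    alpha[k, j, c] += out * p * beta[i, k, b]
    return alpha

def reestimate(corpus, binary, lexical, root="S", threshold=1e-3):
    """One EM step, then reject rules below `threshold` (an assumed value)."""
    counts = defaultdict(float)
    for tags in corpus:
        n = len(tags)
        beta = inside(tags, binary, lexical)
        z = beta[0, n, root]
        if z == 0.0:
            continue  # sequence not covered by the current grammar
        alpha = outside(tags, binary, beta, root)
        for (a, b, c), p in binary.items():
            for i in range(n):
                for j in range(i + 2, n + 1):
                    for k in range(i + 1, j):
                        counts[a, b, c] += (alpha[i, j, a] * p
                                            * beta[i, k, b] * beta[k, j, c] / z)
        for i, t in enumerate(tags):
            for (a, w), p in lexical.items():
                if w == t:
                    counts[a, w] += alpha[i, i + 1, a] * p / z
    # Normalise expected counts per left-hand side.
    totals = defaultdict(float)
    for rule, c in counts.items():
        totals[rule[0]] += c
    probs = {r: c / totals[r[0]] for r, c in counts.items() if totals[r[0]] > 0}
    # Reject low-probability rules, then renormalise the survivors.
    kept = {r: p for r, p in probs.items() if p >= threshold}
    norm = defaultdict(float)
    for r, p in kept.items():
        norm[r[0]] += p
    kept = {r: p / norm[r[0]] for r, p in kept.items()}
    new_binary = {r: p for r, p in kept.items() if len(r) == 3}
    new_lexical = {r: p for r, p in kept.items() if len(r) == 2}
    return new_binary, new_lexical

# Toy grammar (hypothetical): NP -> N D never contributes to a parse below.
toy_binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0,
              ("NP", "D", "N"): 0.5, ("NP", "N", "D"): 0.5}
toy_lexical = {("D", "DT"): 1.0, ("N", "NN"): 1.0, ("V", "VB"): 1.0}
toy_corpus = [["DT", "NN", "VB", "DT", "NN"]]
new_binary, new_lexical = reestimate(toy_corpus, toy_binary, toy_lexical)
```

On the toy corpus, the rule NP -> N D receives no expected count, so it falls below the threshold and is dropped, while NP -> D N absorbs the probability mass for NP. In practice the step would be repeated until the corpus likelihood stops improving; how many iterations the paper uses, and what threshold, is not stated in the abstract.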
Taxonomy
Topics: Natural Language Processing Techniques · Algorithms and Data Compression · Constraint Satisfaction and Optimization
