Exploiting Diversity in Natural Language Processing: Combining Parsers

John C. Henderson; Eric Brill

arXiv:cs/0006003 · cs.CL · May 23, 2007 · 106 cites

TL;DR

This paper combines three state-of-the-art statistical parsers, using both parametric and non-parametric models, to produce parses more accurate than the best previously published results on the Penn Treebank.

Contribution

The paper presents two general approaches to parser combination, with two combination techniques for each, and explores both parametric and non-parametric models.

Findings

01 Combined parsers outperform individual state-of-the-art parsers.

02 New bounds on achievable Treebank parsing accuracy are established.

03 The combined parsers surpass the best previously published results on the Penn Treebank.

Abstract

Three state-of-the-art statistical parsers are combined to produce more accurate parses, as well as new bounds on achievable Treebank parsing accuracy. Two general approaches are presented and two combination techniques are described for each approach. Both parametric and non-parametric models are explored. The resulting parsers surpass the best previously published performance results for the Penn Treebank.
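
The abstract does not spell out the individual combination techniques. As a purely illustrative sketch of what a non-parametric combination could look like, the Python below assumes a simple majority vote over labeled constituents. The tuple representation `(label, start, end)`, the function name `majority_vote`, and the toy parses are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical representation: a parse is a set of labeled constituents,
# each written as (label, start, end) over token positions.
def majority_vote(parses):
    """Keep every constituent proposed by more than half of the parsers."""
    counts = Counter(c for parse in parses for c in set(parse))
    threshold = len(parses) / 2
    return {c for c, n in counts.items() if n > threshold}

# Toy outputs from three imaginary parsers for one five-token sentence.
parser_a = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)}
parser_b = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)}
parser_c = {("S", 0, 5), ("NP", 0, 1), ("VP", 2, 5), ("NP", 3, 5)}

combined = majority_vote([parser_a, parser_b, parser_c])
```

Under this assumed scheme, a strict majority has a useful side effect: any two surviving constituents must both appear in the output of at least one parser, so they cannot cross as long as each input parse is itself a well-formed tree.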

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification