Training Dependency Parsers with Partial Annotation
Zhenghua Li, Yue Zhang, Jiayuan Chao, Min Zhang

TL;DR
This paper explores methods for training dependency parsers with partial annotations, comparing graph-based and transition-based approaches, and demonstrates that graph-based methods are more effective in learning from partial data.
Contribution
It introduces a novel approach for training linear graph-based and transition-based parsers with partial annotations using constrained decoding, and systematically compares them to existing methods.
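To make the idea of constrained decoding concrete, here is a minimal sketch for an arc-factored graph-based parser. It searches only trees that agree with the annotated heads and returns the best-scoring one. The function names, the score matrix, and the single-root convention are assumptions made for illustration and are not taken from the paper. Exhaustive enumeration stands in for a real decoding algorithm and projectivity is ignored, so this is only suitable for toy sentences.

```python
from itertools import product

ROOT = 0  # index of the artificial root node (assumption of this sketch)


def is_tree(heads):
    """Return True if heads encodes a single-rooted dependency tree.

    heads[m - 1] is the head of token m (tokens are numbered from 1).
    """
    if sum(1 for h in heads if h == ROOT) != 1:
        return False
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        while node != ROOT:
            if node in seen:  # walked into a cycle
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def constrained_decode(scores, partial):
    """Best tree that agrees with the partially annotated heads.

    scores[h][m] is the (hypothetical) model score of the arc h -> m.
    partial maps a token index m to its annotated head; tokens that are
    missing from the mapping are unconstrained.
    """
    n = len(scores) - 1
    # Annotated tokens keep exactly one candidate head; others try all heads.
    options = [
        [partial[m]] if m in partial else [h for h in range(n + 1) if h != m]
        for m in range(1, n + 1)
    ]
    best_heads, best_score = None, float("-inf")
    for heads in product(*options):
        if not is_tree(heads):
            continue
        total = sum(scores[h][m] for m, h in enumerate(heads, start=1))
        if total > best_score:
            best_heads, best_score = heads, total
    return best_heads, best_score


# Toy arc scores for a three-token sentence (made-up numbers).
EXAMPLE_SCORES = [
    [0, 1, 5, 1],  # arcs leaving ROOT
    [0, 0, 2, 1],  # arcs leaving token 1
    [0, 4, 0, 4],  # arcs leaving token 2
    [0, 1, 1, 0],  # arcs leaving token 3
]
```

With these toy scores, `constrained_decode(EXAMPLE_SCORES, {})` returns `((2, 0, 2), 13)`, while fixing the head of token 3 to token 1 with `constrained_decode(EXAMPLE_SCORES, {3: 1})` returns `((2, 0, 1), 10)`. In a perceptron-style learner, a constrained result like this could serve as the reference tree in place of a full gold tree; that usage is an assumption here, since this summary does not describe the update rule.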
Findings
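The log-linear graph-based parser (LLGPar) learns from partial annotations more effectively than the other parsers compared.
The linear graph-based parser (LGPar) and the linear transition-based parser (LTPar) improve when partial annotations are first completed into full annotations.
Experiments on the Penn Treebank under three settings for simulating partial annotation support these findings.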
Abstract
Recently, there has been a surge of interest in how to obtain partially annotated data for model supervision. However, there is still no systematic study of how to train statistical models with partial annotation (PA). Taking dependency parsing as our case study, this paper describes and compares two straightforward approaches for three mainstream dependency parsers. The first approach, proposed previously, directly trains a log-linear graph-based parser (LLGPar) with PA using a forest-based objective. This work proposes, for the first time, a second approach that directly trains a linear graph-based parser (LGPar) and a linear transition-based parser (LTPar) with PA based on the idea of constrained decoding. We conduct extensive experiments on the Penn Treebank under three different settings for simulating PA, i.e., random dependencies, most uncertain dependencies, and dependencies…
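The forest-based objective mentioned above can be illustrated with a small sketch: the log-likelihood of a partially annotated sentence is the log of the total probability of all trees that agree with the annotated heads. The code below assumes an arc-factored log-linear model with a single root and computes both sums by brute-force enumeration. All names and the score-matrix layout are hypothetical, and a practical implementation would use dynamic programming instead of enumeration.

```python
import math
from itertools import product


def all_trees(n):
    """Yield every single-rooted head assignment for n tokens.

    heads[m - 1] is the head of token m; 0 denotes the artificial root.
    """
    options = [[h for h in range(n + 1) if h != m] for m in range(1, n + 1)]
    for heads in product(*options):
        if sum(1 for h in heads if h == 0) != 1:
            continue
        acyclic = True
        for start in range(1, n + 1):
            seen, node = set(), start
            while node != 0 and acyclic:
                if node in seen:
                    acyclic = False
                seen.add(node)
                node = heads[node - 1]
        if acyclic:
            yield heads


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that returns -inf for no values."""
    if not values:
        return float("-inf")
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def forest_log_likelihood(scores, partial):
    """log of the summed probability of all trees consistent with partial.

    scores[h][m] is the (hypothetical) arc score; partial maps token -> head.
    """
    n = len(scores) - 1
    every, consistent = [], []
    for heads in all_trees(n):
        total = sum(scores[h][m] for m, h in enumerate(heads, start=1))
        every.append(total)
        if all(heads[m - 1] == h for m, h in partial.items()):
            consistent.append(total)
    return logsumexp(consistent) - logsumexp(every)
```

With an empty annotation the forest contains every tree and the value is 0; each additional annotated head can only shrink the forest and lower the value, and a fully annotated sentence reduces to the usual log-likelihood of a single tree.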
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Biomedical Text Mining and Ontologies
