TL;DR
This paper enhances the Penn Treebank by manually annotating coordination structures, adding detailed internal syntax, and correcting inconsistencies to improve syntactic parsing of coordination phrases.
Contribution
It introduces a comprehensive manual annotation of coordination structures in the Penn Treebank, addressing previous omissions and errors to facilitate better parsing research.
Findings
Extended Penn Treebank with detailed coordination annotations
Improved consistency and correctness in coordination structures
Public availability of the annotated data
Abstract
Coordination is an important and common syntactic construction that is not handled well by state-of-the-art parsers. In many cases, coordinations in the Penn Treebank lack internal structure, do not explicitly mark the conjuncts, and contain various errors and inconsistencies. In this work, we initiated a manual annotation process to resolve these issues. We identify the different elements in a coordination phrase and label each element with its function. We add phrase boundaries where they are missing, resolve inconsistencies, and fix errors. The outcome is an extension of the PTB that includes consistent and detailed structures for coordinations. We make the coordination annotations publicly available in the hope that they will facilitate further research into coordination disambiguation.
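To make the idea of labeling each element of a coordination phrase with its function more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's annotation scheme or file format: the `Node` class, the helper functions, and the function labels `CONJUNCT` and `COORDINATOR` are all hypothetical names chosen for this example, and the sentence fragment is invented.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A constituent with an optional function label (hypothetical schema)."""
    label: str                 # phrase or part-of-speech tag, e.g. "NP", "CC"
    function: str = ""         # illustrative function label, not the paper's tag set
    children: List[Union["Node", str]] = field(default_factory=list)


def words(node: Node) -> List[str]:
    """Collect the words under a node, left to right."""
    out: List[str] = []
    for child in node.children:
        if isinstance(child, str):
            out.append(child)
        else:
            out.extend(words(child))
    return out


def conjuncts(node: Node) -> List[str]:
    """Return the text of each direct child marked as a conjunct."""
    return [
        " ".join(words(child))
        for child in node.children
        if isinstance(child, Node) and child.function == "CONJUNCT"
    ]


# Invented example: "apples and ripe pears", with each conjunct given its own
# phrase boundary and every element of the coordination labeled by function.
coordination = Node("NP", children=[
    Node("NP", "CONJUNCT", [Node("NNS", children=["apples"])]),
    Node("CC", "COORDINATOR", ["and"]),
    Node("NP", "CONJUNCT", [
        Node("JJ", children=["ripe"]),
        Node("NNS", children=["pears"]),
    ]),
])

print(conjuncts(coordination))  # ['apples', 'ripe pears']
```

With explicit conjunct boundaries and function labels of this kind, recovering the conjuncts is a simple lookup instead of a guess based on flat structure.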
