Capturing divergence in dependency trees to improve syntactic projection
Ryan Georgi, Fei Xia, William D. Lewis

TL;DR
This paper proposes a method to automatically detect divergence in dependency structures between language pairs to enhance syntactic projection, thereby improving NLP tools for resource-poor languages.
Contribution
It introduces an automatic detection approach for divergence patterns in dependency trees that enhances projection algorithms without prior language-specific knowledge.
Findings
Common divergence patterns can be automatically identified.
Detection improves the accuracy of syntactic projection.
The method needs only small, parallel, annotated corpora rather than large treebanks.
Abstract
Obtaining syntactic parses is a crucial part of many NLP pipelines. However, most of the world's languages do not have large amounts of syntactically annotated corpora available for building parsers. Syntactic projection techniques attempt to address this issue by using parallel corpora consisting of resource-poor and resource-rich language pairs, taking advantage of a parser for the resource-rich language and word alignment between the languages to project the parses onto the data for the resource-poor language. These projection methods can suffer, however, when the two languages are divergent. In this paper, we investigate the possibility of using small, parallel, annotated corpora to automatically detect divergent structural patterns between two languages. These patterns can then be used to improve structural projection algorithms, allowing for better performing NLP tools for…
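The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a one-to-one word alignment, represents each tree as a list of head indices (-1 for the root), and uses the hypothetical helper names `project_heads` and `divergent_edges`. Comparing the projected heads against a small gold-annotated target tree flags the tokens where the two languages diverge structurally.

```python
from typing import Dict, List, Optional


def project_heads(
    src_heads: List[int],
    alignment: Dict[int, int],
    tgt_len: int,
) -> List[Optional[int]]:
    """Project source dependency heads onto target tokens.

    src_heads[i] is the head index of source token i (-1 marks the root).
    alignment maps a source token index to a target token index
    (assumed one-to-one here, which real alignments often are not).
    Returns one head per target token, or None where nothing projects.
    """
    tgt_heads: List[Optional[int]] = [None] * tgt_len
    for s_child, s_head in enumerate(src_heads):
        t_child = alignment.get(s_child)
        if t_child is None:
            continue  # source token has no aligned target token
        if s_head == -1:
            tgt_heads[t_child] = -1  # the root projects to the root
            continue
        t_head = alignment.get(s_head)
        if t_head is not None:
            tgt_heads[t_child] = t_head
    return tgt_heads


def divergent_edges(
    projected: List[Optional[int]],
    gold: List[int],
) -> List[int]:
    """Return target token indices whose projected head disagrees with gold.

    Tokens with no projected head (None) are skipped, not counted.
    """
    return [
        i
        for i, (p, g) in enumerate(zip(projected, gold))
        if p is not None and p != g
    ]


# Toy example: three source tokens, four target tokens, reversed word order.
src_heads = [1, -1, 1]             # tokens 0 and 2 attach to token 1 (root)
alignment = {0: 2, 1: 1, 2: 0}     # target token 3 is unaligned
projected = project_heads(src_heads, alignment, tgt_len=4)

gold_tgt_heads = [1, -1, 3, 1]     # hypothetical gold tree for the target
mismatches = divergent_edges(projected, gold_tgt_heads)
```

In this toy case the projected tree attaches target token 2 to token 1, while the gold tree attaches it to token 3, so token 2 is reported as divergent. Counting such mismatches over a small annotated parallel corpus is one plausible way to surface recurring patterns, though the paper's own detection procedure may differ.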
