Capturing divergence in dependency trees to improve syntactic projection

Ryan Georgi; Fei Xia; William D. Lewis

arXiv:1605.04475·cs.CL·May 17, 2016

TL;DR

This paper proposes a method to automatically detect divergence in dependency structures between language pairs to enhance syntactic projection, thereby improving NLP tools for resource-poor languages.

Contribution

The paper introduces a method that automatically detects divergence patterns between the dependency trees of a language pair and uses those patterns to improve projection algorithms, without requiring prior language-specific knowledge.

Findings

01 Common divergence patterns can be automatically identified.

02 Detection improves the accuracy of syntactic projection.

03 Method works without extensive annotated data.
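One way to picture automatic identification of divergence patterns is to compare each dependency edge in one language with the corresponding aligned words in the other and tally how the relation changes. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: trees are head-index lists, the alignment is a hypothetical `src_to_tgt` dictionary, and the category names (`match`, `swap`, `merge`, `other`, `unaligned`) are illustrative labels chosen here.

```python
from collections import Counter
from typing import Dict, List

ROOT = -1  # assumed convention: the root token has head index -1


def classify_edge(s_child: int, s_head: int,
                  tgt_heads: List[int],
                  src_to_tgt: Dict[int, int]) -> str:
    """Label how one source edge (s_child -> s_head) appears in the target tree."""
    t_child = src_to_tgt.get(s_child)
    t_head = src_to_tgt.get(s_head)
    if t_child is None or t_head is None:
        return "unaligned"   # at least one end of the edge has no counterpart
    if t_child == t_head:
        return "merge"       # both source words align to the same target word
    if tgt_heads[t_child] == t_head:
        return "match"       # same head-child relation on both sides
    if tgt_heads[t_head] == t_child:
        return "swap"        # head and child trade places in the target
    return "other"


def count_patterns(src_heads: List[int],
                   tgt_heads: List[int],
                   src_to_tgt: Dict[int, int]) -> Counter:
    """Tally edge labels over one aligned sentence pair."""
    counts: Counter = Counter()
    for s_child, s_head in enumerate(src_heads):
        if s_head == ROOT:
            continue  # the root has no incoming edge to compare
        counts[classify_edge(s_child, s_head, tgt_heads, src_to_tgt)] += 1
    return counts
```

Summing these counters over a small annotated parallel corpus would give a rough profile of how often each kind of change occurs for a language pair.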

Abstract

Obtaining syntactic parses is a crucial part of many NLP pipelines. However, most of the world's languages do not have large amounts of syntactically annotated corpora available for building parsers. Syntactic projection techniques attempt to address this issue by using parallel corpora consisting of resource-poor and resource-rich language pairs, taking advantage of a parser for the resource-rich language and word alignment between the languages to project the parses onto the data for the resource-poor language. These projection methods can suffer, however, when the two languages are divergent. In this paper, we investigate the possibility of using small, parallel, annotated corpora to automatically detect divergent structural patterns between two languages. These patterns can then be used to improve structural projection algorithms, allowing for better performing NLP tools for…
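The projection step described in the abstract can be sketched as follows. This is a minimal, simplified illustration rather than the algorithm used in the paper: it assumes a one-to-one word alignment given as a hypothetical `src_to_tgt` dictionary, represents each tree as a list of head indices, and leaves a target word without a head whenever the alignment does not supply one.

```python
from typing import Dict, List, Optional

ROOT = -1  # assumed convention: the root token has head index -1


def project_heads(src_heads: List[int],
                  src_to_tgt: Dict[int, int],
                  tgt_len: int) -> List[Optional[int]]:
    """Copy source dependency edges onto target words through the alignment.

    src_heads[i] is the head index of source token i (ROOT for the root).
    Returns one entry per target token; None means no head could be projected.
    """
    tgt_heads: List[Optional[int]] = [None] * tgt_len
    for s_child, s_head in enumerate(src_heads):
        t_child = src_to_tgt.get(s_child)
        if t_child is None:
            continue  # unaligned source word: nothing to project
        if s_head == ROOT:
            tgt_heads[t_child] = ROOT
            continue
        t_head = src_to_tgt.get(s_head)
        if t_head is not None:
            tgt_heads[t_child] = t_head
    return tgt_heads


# English "the dog barks": the -> dog, dog -> barks, barks is the root.
# Hypothetical two-word target sentence with no article; "dog" and "barks"
# align to target positions 0 and 1.
projected = project_heads([1, 2, ROOT], {1: 0, 2: 1}, tgt_len=2)
```

A sketch like this copies every aligned edge unchanged, which is exactly where divergent structures between the two languages would introduce errors.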
