Implicit Word Reordering with Knowledge Distillation for Cross-Lingual Dependency Parsing
Zhuoran Li, Chunming Hu, Junfan Chen, Zhijun Chen, Richong Zhang

TL;DR
This paper introduces IWR-KD, a knowledge distillation approach that implicitly learns word reordering for cross-lingual dependency parsing, improving robustness across multiple languages without explicit reordering.
Contribution
It proposes a novel implicit reordering framework using knowledge distillation, addressing limitations of existing explicit reordering methods in cross-lingual parsing.
Findings
Outperforms existing methods on 31 languages
Enhances parser robustness to word order variations
Demonstrates effectiveness of implicit reordering with knowledge distillation
Abstract
Word order differences between source and target languages are a major obstacle to cross-lingual transfer, especially in dependency parsing. Current work mostly relies on order-agnostic models or word reordering to mitigate this problem. However, such methods either fail to leverage the grammatical information naturally contained in word order or are computationally expensive, because the permutation space grows exponentially with sentence length. Moreover, a reordered source sentence with an unnatural word order may act as noise that harms model learning. To this end, we propose an Implicit Word Reordering framework with Knowledge Distillation (IWR-KD). The framework is inspired by the observation that deep networks are good at learning feature linearization corresponding to meaningful data transformations, e.g., word reordering. To realize this idea, we introduce a knowledge…
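To make the cost argument in the abstract concrete, the short sketch below counts candidate reorderings for a few sentence lengths. It assumes that every permutation of the tokens is a possible reordering, giving n! candidates for an n-token sentence; the abstract does not state how explicit reordering methods restrict this space, so the function name `reordering_count` and the unrestricted-permutation assumption are illustrative only.

```python
import math


def reordering_count(sentence_length: int) -> int:
    """Count the orderings of a sentence whose tokens are all distinct.

    Assumption (not from the paper): every permutation is a valid candidate,
    so the count is simply n! for n tokens.
    """
    if sentence_length < 0:
        raise ValueError("sentence_length must be non-negative")
    return math.factorial(sentence_length)


if __name__ == "__main__":
    # Show how quickly the search space grows with sentence length.
    for n in (5, 10, 20):
        print(f"{n} tokens: {reordering_count(n)} candidate orderings")
```

Even at 10 tokens the unrestricted space already holds millions of orderings, which is consistent with the abstract's point that exhaustive explicit reordering becomes impractical for realistic sentence lengths.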
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Knowledge Distillation
