Evaluating the Effectiveness of Linguistic Knowledge in Pretrained Language Models: A Case Study of Universal Dependencies
Wenxi Li

TL;DR
This paper investigates how integrating Universal Dependencies into pretrained language models can enhance their cross-lingual performance on adversarial paraphrase tasks, showing significant accuracy improvements and better cross-lingual alignment.
Contribution
It demonstrates the effectiveness of incorporating UD into pretrained models, leading to notable performance gains and insights into cross-lingual transferability.
Findings
UD integration improves accuracy and F1 scores by an average of 3.85% and 6.08%, respectively.
UD-based similarity correlates with model performance across languages.
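UD narrows the performance gap between pretrained language models and large language models in some language pairs, and lets the pretrained models outperform the large ones in others.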
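The correlation finding above can be illustrated with a minimal sketch. It assumes a Pearson coefficient, which the summary does not confirm; the function name `pearson` and the per-language numbers are hypothetical placeholders, not values from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder data (NOT from the paper): one entry per language.
ud_similarity_to_english = [0.2, 0.4, 0.6, 0.8]
model_accuracy = [0.70, 0.74, 0.79, 0.85]

r = pearson(ud_similarity_to_english, model_accuracy)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would indicate that languages with higher UD-based similarity to English tend to see higher model performance. A rank-based coefficient such as Spearman's would be a reasonable alternative if the relationship is monotonic but not linear.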
Abstract
Universal Dependencies (UD), while widely regarded as the most successful linguistic framework for cross-lingual syntactic representation, remains underexplored in terms of its effectiveness. This paper addresses this gap by integrating UD into pretrained language models and assessing whether UD can improve their performance on a cross-lingual adversarial paraphrase identification task. Experimental results show that incorporating UD yields significant improvements in accuracy and F1 scores, with average gains of 3.85% and 6.08%, respectively. These gains narrow the performance gap between pretrained models and large language models in some language pairs, and allow the pretrained models to outperform the latter in others. Furthermore, the UD-based similarity score between a given language and English is positively correlated with the performance of models in that language. Both findings highlight the…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
