Better Benchmarking LLMs for Zero-Shot Dependency Parsing
Ana Ezquerro, Carlos Gómez-Rodríguez, David Vilares

TL;DR
This paper evaluates open-weight LLMs on zero-shot dependency parsing and finds that most fail to outperform uninformed baselines. Only the newest and largest LLaMA models do so for most languages, which points to limited zero-shot syntactic parsing capabilities.
Contribution
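The paper benchmarks open-weight LLMs on zero-shot dependency parsing against baselines that never see the input sentence, including baselines not previously used in this context (random projective trees and optimal linear arrangements), and highlights the models' current limitations.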
Findings
Most LLMs do not outperform uninformed baselines.
Only the newest and largest LLaMA models outperform the best uninformed baselines for most languages, and even they achieve rather low performance.
Zero-shot syntactic parsing remains challenging for open LLMs.
Abstract
While LLMs excel in zero-shot tasks, their performance in linguistic challenges like syntactic parsing has been less scrutinized. This paper studies state-of-the-art open-weight LLMs on the task by comparing them to baselines that do not have access to the input sentence, including baselines that have not been used in this context such as random projective trees or optimal linear arrangements. The results show that most of the tested LLMs cannot outperform the best uninformed baselines, with only the newest and largest versions of LLaMA doing so for most languages, and still achieving rather low performance. Thus, accurate zero-shot syntactic parsing is not forthcoming with open LLMs.
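To make the idea of an uninformed baseline concrete, the following is a minimal sketch, not the authors' implementation. It assumes head indices are 1-based with 0 marking the root, and it scores predictions with unlabeled attachment score (UAS), a standard dependency parsing metric that is assumed here rather than taken from the abstract. The function names (`random_projective_heads`, `is_projective`, `uas`) and the toy gold tree are hypothetical. The sampler uses a simple recursive split that always yields a projective tree, but it does not sample uniformly over all projective trees and is not claimed to match the paper's procedure.

```python
import random


def random_projective_heads(n, rng):
    """Return a head list for n tokens forming a random projective tree.

    heads[i] is the 1-based head of token i+1; 0 marks the root.
    Assumption: a simple recursive sampler, not uniform over all trees.
    """
    heads = [0] * n

    def build(lo, hi, parent):
        # Attach one token of the span [lo, hi] to `parent`, then recurse
        # on the material to its left and right. Every subtree therefore
        # covers a contiguous span, which guarantees projectivity.
        if lo > hi:
            return
        h = rng.randint(lo, hi)
        heads[h - 1] = parent
        build(lo, h - 1, h)
        build(h + 1, hi, h)

    build(1, n, 0)
    return heads


def is_projective(heads):
    """Check that no two arcs cross (the root sits at position 0)."""
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    return not any(a < c < b < d for a, b in arcs for c, d in arcs)


def uas(predicted, gold):
    """Unlabeled attachment score: fraction of tokens with the correct head."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical gold tree for "She reads books": "reads" is the root.
gold = [2, 0, 2]

# The baseline never looks at the words, only at the sentence length.
rng = random.Random(0)
prediction = random_projective_heads(len(gold), rng)
score = uas(prediction, gold)
```

A baseline like this receives only the sentence length, so any score it obtains reflects structural regularities of dependency trees rather than an understanding of the sentence. That is what makes it a useful floor for judging LLM output.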
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Data Classification
