Sort by Structure: Language Model Ranking as Dependency Probing
Max Müller-Eberstein, Rob van der Goot, Barbara Plank

TL;DR
This paper introduces a probing method to rank pre-trained language models based on their ability to recover dependency structures, enabling efficient model selection for parsing tasks with minimal computational cost.
Contribution
It presents a novel dependency probing approach for LM ranking, covering diverse models and languages, and demonstrates its effectiveness in predicting the best models for parsing.
Findings
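Across 46 LM-language pairs, probing predicts the best LM choice 79% of the time, using orders of magnitude less compute than training a full parser.
RemBERT contains less inherent dependency information than other LMs, yet performs well after fine-tuning.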
Without RemBERT, the method identifies the best LM in 89% of cases.
Abstract
Making an informed choice of pre-trained language model (LM) is critical for performance, yet environmentally costly, and as such widely underexplored. The field of Computer Vision has begun to tackle encoder ranking, with promising forays into Natural Language Processing; however, they lack coverage of linguistic tasks such as structured prediction. We propose probing to rank LMs, specifically for parsing dependencies in a given language, by measuring the degree to which labeled trees are recoverable from an LM's contextualized embeddings. Across 46 typologically and architecturally diverse LM-language pairs, our probing approach predicts the best LM choice 79% of the time using orders of magnitude less compute than training a full parser. Within this study, we identify and analyze one recently proposed decoupled LM, RemBERT, and find it strikingly contains less inherent dependency…
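The "predicts the best LM choice" figure can be read as a top-1 agreement rate: for each language, check whether the LM ranked highest by the probe is also the LM that yields the best fully trained parser. The following is a minimal sketch of that bookkeeping only, not the authors' implementation. The function names, language keys, LM names, and all scores are hypothetical placeholders chosen for illustration, and the paper's exact evaluation protocol may differ.

```python
from typing import Dict

# Maps language -> {LM name -> score}. Higher is assumed to be better.
Scores = Dict[str, Dict[str, float]]


def best_lm(scores: Dict[str, float]) -> str:
    """Return the name of the LM with the highest score."""
    return max(scores, key=scores.get)


def top1_agreement(probe: Scores, parser: Scores) -> float:
    """Fraction of languages where the probe's top-ranked LM
    matches the LM with the best full-parser score."""
    languages = sorted(probe.keys() & parser.keys())
    if not languages:
        raise ValueError("no languages shared between probe and parser scores")
    hits = sum(best_lm(probe[lang]) == best_lm(parser[lang]) for lang in languages)
    return hits / len(languages)


# Hypothetical placeholder scores (NOT results from the paper).
probe_scores: Scores = {
    "lang_a": {"lm_x": 0.52, "lm_y": 0.61},
    "lang_b": {"lm_x": 0.48, "lm_y": 0.40},
    "lang_c": {"lm_x": 0.30, "lm_y": 0.35},
}
parser_scores: Scores = {
    "lang_a": {"lm_x": 0.80, "lm_y": 0.86},
    "lang_b": {"lm_x": 0.79, "lm_y": 0.75},
    "lang_c": {"lm_x": 0.70, "lm_y": 0.66},
}

agreement = top1_agreement(probe_scores, parser_scores)
print(f"probe picks the best LM in {agreement:.0%} of languages")
```

In this toy setup the probe and the parser agree on two of three languages. The appeal of the approach described in the abstract is that the probe scores are far cheaper to obtain than the parser scores they stand in for.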
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
