Comparing Template-based and Template-free Language Model Probing
Sagi Shaier, Kevin Bennett, Lawrence E Hunter, Katharina von der Wense

TL;DR
This study compares template-based and template-free language model probing across 16 models and 10 English datasets in the general and biomedical domains, revealing significant differences in model rankings, scores, and answer consistency between the approaches.
Contribution
It provides a comprehensive evaluation of how template-based and template-free probing differ in model assessment across domains, highlighting their distinct impacts on model ranking and scoring.
Findings
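Template-free and template-based approaches often rank models differently, except for the top domain-specific models.
Scores decrease by up to 42% Acc@1 when comparing parallel template-free and template-based prompts.
Perplexity is negatively correlated with accuracy in the template-free approach, but the relationship differs under template-based probing.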
Abstract
The differences between cloze-task language model (LM) probing with 1) expert-made templates and 2) naturally-occurring text have often been overlooked. Here, we evaluate 16 different LMs on 10 English probing datasets -- 4 template-based and 6 template-free -- in general and biomedical domains to answer the following research questions: (RQ1) Do model rankings differ between the two approaches? (RQ2) Do models' absolute scores differ between the two approaches? (RQ3) Do the answers to RQ1 and RQ2 differ between general and domain-specific models? Our findings are: 1) Template-free and template-based approaches often rank models differently, except for the top domain-specific models. 2) Scores decrease by up to 42% Acc@1 when comparing parallel template-free and template-based prompts. 3) Perplexity is negatively correlated with accuracy in the template-free approach, but,…
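To make the two quantities behind RQ1 and RQ2 concrete, the following is a minimal sketch of how Acc@1 and a rank comparison between the two probing approaches could be computed. It is not the authors' evaluation code: the function names, the case-insensitive exact-match rule, the choice of Spearman's rank correlation, and all model names and scores are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def acc_at_1(top_predictions: Sequence[str], gold_answers: Sequence[str]) -> float:
    """Fraction of cloze prompts whose top-ranked prediction equals the gold answer.

    Matching is case-insensitive exact match (an assumption of this sketch).
    """
    if len(top_predictions) != len(gold_answers):
        raise ValueError("predictions and gold answers must align one-to-one")
    if not gold_answers:
        raise ValueError("need at least one prompt")
    hits = sum(
        p.strip().lower() == g.strip().lower()
        for p, g in zip(top_predictions, gold_answers)
    )
    return hits / len(gold_answers)


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank models from best (1) to worst by score; assumes no tied scores."""
    ordered = sorted(scores, key=lambda model: scores[model], reverse=True)
    return {model: position + 1 for position, model in enumerate(ordered)}


def spearman_rho(scores_a: Dict[str, float], scores_b: Dict[str, float]) -> float:
    """Spearman rank correlation between two untied rankings of the same models."""
    if set(scores_a) != set(scores_b):
        raise ValueError("both score tables must cover the same models")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two models to compare rankings")
    ranks_a, ranks_b = ranks(scores_a), ranks(scores_b)
    squared_diffs = sum((ranks_a[m] - ranks_b[m]) ** 2 for m in ranks_a)
    return 1 - 6 * squared_diffs / (n * (n * n - 1))


# Hypothetical Acc@1 scores for three placeholder models (not from the paper).
template_based = {"model_a": 0.50, "model_b": 0.40, "model_c": 0.30}
template_free = {"model_a": 0.20, "model_b": 0.10, "model_c": 0.15}

rho = spearman_rho(template_based, template_free)
largest_drop = max(template_based[m] - template_free[m] for m in template_based)
```

A `rho` well below 1 would indicate that the two approaches order the models differently (RQ1), while `largest_drop` captures the kind of absolute score gap asked about in RQ2.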
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
