Comparing Template-based and Template-free Language Model Probing

Sagi Shaier; Kevin Bennett; Lawrence E Hunter; Katharina von der Wense

arXiv:2402.00123 · cs.CL · October 31, 2024 · 1 citation

TL;DR

This study compares template-based and template-free language model probing methods across various models and datasets, revealing significant differences in model rankings, scores, and answer consistency between the approaches.

Contribution

The paper evaluates 16 LMs on 10 English probing datasets (4 template-based, 6 template-free) in general and biomedical domains, showing how the choice of probing approach changes model rankings and absolute scores.

Findings

01

Template-free and template-based approaches often rank models differently, except for the top domain-specific models.

02

Scores decrease by up to 42% Acc@1 when comparing parallel template-free and template-based prompts.

03

Perplexity is negatively correlated with accuracy in the template-free approach; the relationship differs under template-based probing.
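The score differences above are reported in Acc@1. As a rough illustration only, the sketch below computes Acc@k for a cloze-style probe, assuming each prompt comes with a ranked list of predicted fill-in tokens and one gold answer. The function name `acc_at_k` and the toy predictions are hypothetical and are not taken from the paper or its repository.

```python
from typing import Sequence


def acc_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    gold: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of prompts whose gold answer appears in the top-k predictions.

    Hypothetical helper for illustration; exact string match is an assumption.
    """
    if len(ranked_predictions) != len(gold):
        raise ValueError("need one ranked prediction list per gold answer")
    if not gold:
        return 0.0
    hits = sum(
        1 for preds, answer in zip(ranked_predictions, gold) if answer in preds[:k]
    )
    return hits / len(gold)


# Toy, made-up predictions for the same four facts probed two ways.
gold = ["Paris", "aspirin", "insulin", "Berlin"]
template_based = [
    ["Paris", "Lyon"],
    ["aspirin", "ibuprofen"],
    ["insulin", "glucagon"],
    ["Berlin", "Munich"],
]
template_free = [
    ["Lyon", "Paris"],
    ["aspirin", "ibuprofen"],
    ["glucagon", "insulin"],
    ["Berlin", "Munich"],
]

acc_tb = acc_at_k(template_based, gold, k=1)
acc_tf = acc_at_k(template_free, gold, k=1)
print(f"template-based Acc@1: {acc_tb:.2f}")
print(f"template-free  Acc@1: {acc_tf:.2f}")
print(f"difference: {acc_tb - acc_tf:.2f}")
```

Comparing the two values on parallel prompts (the same fact expressed once through a template and once in naturally occurring text) is what makes a score difference attributable to the probing approach rather than to the facts being tested.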

Abstract

The differences between cloze-task language model (LM) probing with 1) expert-made templates and 2) naturally-occurring text have often been overlooked. Here, we evaluate 16 different LMs on 10 English probing datasets -- 4 template-based and 6 template-free -- in general and biomedical domains to answer the following research questions: (RQ1) Do model rankings differ between the two approaches? (RQ2) Do models' absolute scores differ between the two approaches? (RQ3) Do the answers to RQ1 and RQ2 differ between general and domain-specific models? Our findings are: 1) Template-free and template-based approaches often rank models differently, except for the top domain-specific models. 2) Scores decrease by up to 42% Acc@1 when comparing parallel template-free and template-based prompts. 3) Perplexity is negatively correlated with accuracy in the template-free approach, but,…

Code & Models

Repositories

shaier/probing_template_based_template_free (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems