Context Variance Evaluation of Pretrained Language Models for Prompt-based Biomedical Knowledge Probing

Zonghai Yao; Yi Cao; Zhichao Yang; Hong Yu

arXiv:2211.10265 · cs.CL · January 26, 2023 · 5 cites

Open Access

TL;DR

This paper introduces context variance prompts and a new evaluation metric to improve the reliability of probing biomedical knowledge in pretrained language models, addressing biases and challenges in existing methods.

Contribution

It proposes a novel context variance approach and the UCM metric, enhancing the evaluation of PLMs' biomedical knowledge, especially for large-N-M and rare relations.

Findings

01. Context variance prompts improve robustness in knowledge probing.

02. UCM metric captures model understanding beyond simple recall.

03. Enhanced evaluation stability for large-N-M and rare relations.

Abstract

Pretrained language models (PLMs) have motivated research on what kinds of knowledge these models learn. Fill-in-the-blank problems (e.g., cloze tests) are a natural approach for gauging such knowledge. BioLAMA generates prompts for biomedical factual knowledge triples and uses the Top-k accuracy metric to evaluate the knowledge of different PLMs. However, existing research has shown that such prompt-based knowledge probing methods can only probe a lower bound of knowledge. Many factors, such as prompt-based probing biases, make the LAMA benchmark unreliable and unstable. This problem is more prominent in BioLAMA: because of the severe long-tailed distribution in vocabulary and the large-N-M relations, the performance gap between LAMA and BioLAMA remains notable. To address these issues, we introduce context variance into prompt generation and propose a new rank-change-based evaluation metric. Different from the…
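As a point of reference for the Top-k accuracy metric mentioned in the abstract, the following is a minimal sketch of how such a score could be computed for cloze-style probing. It assumes that a prompt counts as a hit when any of its k highest-ranked predictions matches any gold object for the triple, which is how N-M relations with several valid answers are commonly handled. The function name `top_k_accuracy` and the toy predictions are hypothetical and are not taken from the paper or from the BioLAMA code.

```python
from typing import List, Set


def top_k_accuracy(ranked_predictions: List[List[str]],
                   gold_objects: List[Set[str]],
                   k: int) -> float:
    """Fraction of prompts whose top-k predictions contain a gold object.

    ranked_predictions[i] holds the model's candidates for prompt i,
    ordered from most to least likely. gold_objects[i] holds every
    acceptable answer for prompt i (N-M relations can have several).
    """
    if len(ranked_predictions) != len(gold_objects):
        raise ValueError("predictions and gold answers must align")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not ranked_predictions:
        return 0.0

    hits = 0
    for ranked, gold in zip(ranked_predictions, gold_objects):
        # A prompt is a hit if any of its first k candidates is a gold object.
        if any(candidate in gold for candidate in ranked[:k]):
            hits += 1
    return hits / len(ranked_predictions)


# Toy data for illustration only (not real model output).
toy_predictions = [
    ["headache", "fever", "pain"],
    ["insulin", "glucose", "metformin"],
]
toy_gold = [{"pain", "fever"}, {"metformin"}]

acc_at_1 = top_k_accuracy(toy_predictions, toy_gold, k=1)
acc_at_2 = top_k_accuracy(toy_predictions, toy_gold, k=2)
acc_at_3 = top_k_accuracy(toy_predictions, toy_gold, k=3)
```

On the toy data, the score rises from 0.0 at k=1 to 0.5 at k=2 and 1.0 at k=3. A hit-or-miss score of this kind says nothing about where the correct answer sits below the cutoff, which is one reason a rank-based metric may be more informative.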

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare

Methods: Tanh Activation · Softmax · Low-Rank Factorization-based Multi-Head Attention