Can Language Models be Biomedical Knowledge Bases?

Mujeen Sung; Jinhyuk Lee; Sean Yi; Minji Jeon; Sungdong Kim; Jaewoo Kang

arXiv:2109.07154·cs.CL·September 16, 2021·1 cites

TL;DR

This paper introduces BioLAMA, a benchmark for evaluating biomedical knowledge contained in language models, revealing their limited ability to serve as reliable domain-specific knowledge bases.

Contribution

The paper creates BioLAMA, a benchmark of 49K biomedical factual knowledge triples, and analyzes how well biomedical LMs retrieve that knowledge.

Findings

01

Biomedical LMs with recently proposed probing methods achieve up to 18.51% Acc@5 on BioLAMA.

02

Predictions are heavily influenced by prompt templates, limiting true knowledge extraction.

03

Most predictions lack subject-specific information, reducing their usefulness as KBs.

Abstract

Pre-trained language models (LMs) have become ubiquitous in solving various natural language processing (NLP) tasks. There has been increasing interest in what knowledge these LMs contain and how we can extract that knowledge, treating LMs as knowledge bases (KBs). While there has been much work on probing LMs in the general domain, there has been little attention to whether these powerful LMs can be used as domain-specific KBs. To this end, we create the BioLAMA benchmark, which is comprised of 49K biomedical factual knowledge triples for probing biomedical LMs. We find that biomedical LMs with recently proposed probing methods can achieve up to 18.51% Acc@5 on retrieving biomedical knowledge. Although this seems promising given the task difficulty, our detailed analyses reveal that most predictions are highly correlated with prompt templates without any subjects, hence producing…
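
The Acc@5 figure quoted in the abstract is a top-k accuracy. As a rough illustration only, the sketch below computes such a metric under the assumption that a probe counts as correct when any of its top-k predictions matches any accepted answer after simple normalization. The function names (`normalize`, `hit_at_k`, `acc_at_k`) and the toy examples are hypothetical and are not taken from the BioLAMA repository, whose evaluation script may differ in its matching rules.

```python
from typing import Iterable, List, Sequence, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial surface differences are ignored."""
    return " ".join(text.lower().split())


def hit_at_k(predictions: Sequence[str], answers: Iterable[str], k: int = 5) -> bool:
    """Return True if any of the first k predictions matches any accepted answer."""
    gold = {normalize(a) for a in answers}
    return any(normalize(p) in gold for p in predictions[:k])


def acc_at_k(examples: List[Tuple[Sequence[str], Iterable[str]]], k: int = 5) -> float:
    """Fraction of examples with at least one correct prediction in the top k."""
    if not examples:
        return 0.0
    hits = sum(hit_at_k(preds, golds, k) for preds, golds in examples)
    return hits / len(examples)


# Toy data (made up for illustration): ranked predictions and accepted answers.
examples = [
    # Correct answer appears at rank 2, so this is a hit for k=5.
    (["aspirin", "ibuprofen", "warfarin"], {"Ibuprofen"}),
    # Correct answer appears at rank 6, so this is a miss for k=5.
    (["fever", "cough", "nausea", "pain", "rash", "headache"], {"headache"}),
    # No prediction matches either accepted answer.
    (["insulin"], {"metformin", "glucophage"}),
]

print(f"Acc@5 = {acc_at_k(examples, k=5):.4f}")
```

On this toy data one of three probes is a hit at k=5. Accepting a set of answers per probe is a design choice that allows synonyms to count as correct, which matters in a domain where entities often have several names.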

Code & Models

Repositories

dmis-lab/biolama
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Biomedical Text Mining and Ontologies