Factual Probing Is [MASK]: Learning vs. Learning to Recall

Zexuan Zhong; Dan Friedman; Danqi Chen

arXiv:2104.05240 · cs.CL · December 15, 2021 · 22 cites

TL;DR

This paper introduces OptiPrompt, a continuous optimization method that improves factual probing in language models, and investigates whether prompt-based methods truly measure learned knowledge or exploit training data regularities.

Contribution

It presents OptiPrompt, a novel prompt optimization technique, and critically examines the interpretation of probing results as true measures of factual knowledge.

Findings

1. OptiPrompt predicts an additional 6.4% of facts on the LAMA benchmark.

2. Prompt-search methods can exploit regularities in their training data rather than only eliciting knowledge the model already holds.

3. Disentangling learning from recall reveals limitations of current probing techniques.

Abstract

Petroni et al. (2019) demonstrated that it is possible to retrieve world facts from a pre-trained language model by expressing them as cloze-style prompts and interpret the model's prediction accuracy as a lower bound on the amount of factual information it encodes. Subsequent work has attempted to tighten the estimate by searching for better prompts, using a disjoint set of facts as training data. In this work, we make two complementary contributions to better understand these factual probing techniques. First, we propose OptiPrompt, a novel and efficient method which directly optimizes in continuous embedding space. We find this simple method is able to predict an additional 6.4% of facts in the LAMA benchmark. Second, we raise a more important question: Can we really interpret these probing results as a lower bound? Is it possible that these prompt-search methods learn from the…
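
The idea of optimizing a prompt directly in continuous embedding space can be illustrated with a toy sketch. This is not the authors' implementation of OptiPrompt: the frozen "model" here is a hypothetical mean-pooling linear layer, and every name, size, and fact (`E`, `W`, `prompt`, `train_facts`, the learning rate) is an assumption made for illustration. It only shows the general pattern of keeping the model fixed while gradient descent updates a handful of dense prompt vectors on a set of training facts.

```python
import math
import random

random.seed(0)

VOCAB, DIM, NUM_PROMPT = 6, 4, 3  # toy sizes (hypothetical)

def rand_matrix(rows, cols):
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

# Frozen toy "model": token embeddings E and output projection W are never updated.
E = rand_matrix(VOCAB, DIM)
W = rand_matrix(VOCAB, DIM)

# Trainable continuous prompt: dense vectors that are not tied to any real token.
prompt = rand_matrix(NUM_PROMPT, DIM)

def forward(subject, prompt):
    """Mean-pool [subject embedding; prompt vectors], then softmax over the vocabulary."""
    n = NUM_PROMPT + 1
    h = [(E[subject][d] + sum(p[d] for p in prompt)) / n for d in range(DIM)]
    logits = [sum(W[v][d] * h[d] for d in range(DIM)) for v in range(VOCAB)]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_loss(facts, prompt):
    """Average cross-entropy of the correct object token at the masked position."""
    return -sum(math.log(forward(s, prompt)[o]) for s, o in facts) / len(facts)

def train_step(facts, prompt, lr=0.5):
    """One gradient-descent step on the prompt vectors only."""
    n = NUM_PROMPT + 1
    grad_h = [0.0] * DIM
    for s, o in facts:
        probs = forward(s, prompt)
        for v in range(VOCAB):
            err = probs[v] - (1.0 if v == o else 0.0)  # d(loss)/d(logit_v)
            for d in range(DIM):
                grad_h[d] += err * W[v][d] / len(facts)
    # Each prompt vector contributes 1/n to the pooled state h.
    for p in prompt:
        for d in range(DIM):
            p[d] -= lr * grad_h[d] / n

# Hypothetical (subject id, object id) training facts for one relation.
train_facts = [(0, 3), (1, 3), (2, 4)]

loss_before = mean_loss(train_facts, prompt)
for _ in range(200):
    train_step(train_facts, prompt)
loss_after = mean_loss(train_facts, prompt)
print(f"training loss: {loss_before:.3f} -> {loss_after:.3f}")
```

In this toy setup the prompt can only shift the predictions in a subject-independent way, so any drop in training loss comes from regularities in the training facts themselves. That is a miniature version of the concern the abstract raises about reading probing accuracy as a lower bound.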

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning

Methods: Tanh Activation · Softmax · Low-Rank Factorization-based Multi-Head Attention