Probing Classifiers: Promises, Shortcomings, and Advances

Yonatan Belinkov

arXiv:2102.12452 · cs.CL · September 23, 2021

TL;DR

Probing classifiers interpret NLP models by training a classifier to predict linguistic properties from the models' representations. They are widely used but have methodological limitations; this paper critically reviews those limitations and the recent advances that aim to address them.

Contribution

This paper critically reviews the probing classifiers framework for interpreting NLP models, covering its promises, its methodological limitations, and recent improvements.

Findings

01. Probing classifiers help analyze neural network representations.

02. Methodological limitations affect the reliability of probing results.

03. Recent advances aim to address these limitations.

Abstract

Probing classifiers have emerged as one of the prominent methodologies for interpreting and analyzing deep neural network models of natural language processing. The basic idea is simple -- a classifier is trained to predict some linguistic property from a model's representations -- and has been used to examine a wide variety of models and properties. However, recent studies have demonstrated various methodological limitations of this approach. This article critically reviews the probing classifiers framework, highlighting their promises, shortcomings, and advances.
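The basic idea described in the abstract, training a classifier to predict a property from a model's representations, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the setup of any study reviewed in the paper: the "representations" are synthetic vectors standing in for frozen model activations, the binary "property" is hypothetical, the probe is a plain logistic regression trained by gradient descent, and every function name is made up for this sketch. A shuffled-label baseline is included only as a sanity check for the sketch itself.

```python
import numpy as np


def make_representations(n=2000, dim=32, seed=0):
    """Synthetic stand-in for frozen model representations (hypothetical data).

    A binary property is encoded along one random direction, plus Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    signs = 2 * labels[:, None] - 1  # map {0, 1} to {-1, +1}
    reps = rng.normal(size=(n, dim)) + 3.0 * signs * direction
    return reps, labels


def train_probe(reps, labels, lr=0.1, epochs=200):
    """Fit a logistic-regression probe with full-batch gradient descent.

    Only the probe's weights are learned; the representations stay fixed.
    """
    n, dim = reps.shape
    w = np.zeros(dim)
    b = 0.0
    for _ in range(epochs):
        probs = 1.0 / (1.0 + np.exp(-(reps @ w + b)))
        grad = probs - labels
        w -= lr * (reps.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def accuracy(w, b, reps, labels):
    """Fraction of examples whose predicted label matches the given label."""
    preds = (reps @ w + b > 0).astype(int)
    return float((preds == labels).mean())


reps, labels = make_representations()
train, test = slice(0, 1500), slice(1500, None)

# Probe trained on the real (synthetic) property labels.
w, b = train_probe(reps[train], labels[train])
probe_acc = accuracy(w, b, reps[test], labels[test])

# Same probe trained on shuffled labels, which carry no information
# about the representations.
shuffled = np.random.default_rng(1).permutation(labels)
w_c, b_c = train_probe(reps[train], shuffled[train])
control_acc = accuracy(w_c, b_c, reps[test], shuffled[test])

print(f"probe accuracy: {probe_acc:.3f}")
print(f"shuffled-label baseline: {control_acc:.3f}")
```

In this toy setting the property is linearly encoded by construction, so high held-out accuracy is expected. With real model representations, interpreting a probe's accuracy is less straightforward, which is where the limitations reviewed in the paper come in.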
