A Non-Linear Structural Probe

Jennifer C. White; Tiago Pimentel; Naomi Saphra; Ryan Cotterell

arXiv:2105.10185·cs.CL·May 24, 2021

TL;DR

This paper kernelizes the structural probe of Hewitt and Manning (2019) to obtain a non-linear variant with the same number of parameters. Tested on six languages, the radial-basis function (RBF) kernel, combined with regularization, gives a statistically significant improvement over the linear baseline.

Contribution

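It observes that the structural probe learns a metric, which allows the probe to be kernelized. The resulting non-linear probe has the same number of parameters as the linear one and, with an RBF kernel and regularization, outperforms it at recovering syntactic structure on six languages.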

Findings

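01 The RBF kernel, combined with regularization, gives a statistically significant improvement over the linear baseline.

02 The non-linear probe recovers more syntactic structure than the linear probe while using the same number of parameters.

03 The evaluation covers six languages.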

Abstract

Probes are models devised to investigate the encoding of knowledge -- e.g. syntactic structure -- in contextual representations. Probes are often designed for simplicity, which has led to restrictions on probe design that may not allow for the full exploitation of the structure of encoded information; one such restriction is linearity. We examine the case of a structural probe (Hewitt and Manning, 2019), which aims to investigate the encoding of syntactic structure in contextual representations through learning only linear transformations. By observing that the structural probe learns a metric, we are able to kernelize it and develop a novel non-linear variant with an identical number of parameters. We test on 6 languages and find that the radial-basis function (RBF) kernel, in conjunction with regularization, achieves a statistically significant improvement over the baseline in all…
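
The kernelization step described in the abstract can be illustrated with a minimal sketch. The linear structural probe scores a word pair by the squared distance between their representations after a learned linear map B. Because this is a metric induced by an inner product, the inner product can be swapped for a kernel, giving the squared distance k(x, x) - 2 k(x, y) + k(y, y). The code below is an illustration under stated assumptions, not the authors' implementation: the function names, the `gamma` bandwidth, and the choice to apply the RBF kernel to the projected vectors are all hypothetical, chosen so that B remains the only learned parameter. The regularization mentioned in the abstract is not modeled.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def project(B: Matrix, h: Vector) -> List[float]:
    """Apply the learned linear map B to a contextual representation h."""
    return [sum(b * x for b, x in zip(row, h)) for row in B]


def linear_sq_distance(B: Matrix, h_i: Vector, h_j: Vector) -> float:
    """Squared distance of the linear structural probe: ||B h_i - B h_j||^2."""
    u, v = project(B, h_i), project(B, h_j)
    return sum((a - b) ** 2 for a, b in zip(u, v))


def rbf_kernel(B: Matrix, h_i: Vector, h_j: Vector, gamma: float = 1.0) -> float:
    """RBF kernel on projected vectors (assumed form; gamma is hypothetical)."""
    return math.exp(-gamma * linear_sq_distance(B, h_i, h_j))


def kernel_sq_distance(B: Matrix, h_i: Vector, h_j: Vector, gamma: float = 1.0) -> float:
    """Kernelized squared distance: k(x, x) - 2 k(x, y) + k(y, y)."""
    return (
        rbf_kernel(B, h_i, h_i, gamma)
        - 2.0 * rbf_kernel(B, h_i, h_j, gamma)
        + rbf_kernel(B, h_j, h_j, gamma)
    )


# Toy usage with an identity map and two 2-dimensional "representations".
B = [[1.0, 0.0], [0.0, 1.0]]
h_a, h_b = [0.0, 0.0], [3.0, 4.0]
linear_d = linear_sq_distance(B, h_a, h_b)
kernel_d = kernel_sq_distance(B, h_a, h_b, gamma=0.01)
```

In this sketch both distances depend on the same matrix B, which matches the abstract's statement that the non-linear variant has an identical number of parameters. With an RBF kernel the squared distance is bounded above by 2, so in a real probe the predicted distances would need to be related to tree distances in some way; how the paper does this is not stated in the abstract and is left out here.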
