A Non-Linear Structural Probe
Jennifer C. White, Tiago Pimentel, Naomi Saphra, Ryan Cotterell

TL;DR
This paper introduces a non-linear variant of the structural probe for syntactic knowledge in contextual representations. The probe is kernelized, and with the RBF kernel plus regularization it achieves a statistically significant improvement over the linear baseline in tests on 6 languages.
Contribution
It observes that the structural probe learns a metric, kernelizes it, and develops a novel non-linear structural probe with the same number of parameters as the linear one, evaluated on 6 languages.
Findings
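The RBF kernel, in conjunction with regularization, gives a statistically significant improvement over the linear baseline
The non-linear probe recovers more syntactic structure than the linear probe with an identical number of parameters
Gains over the baseline are reported across the 6 languages tested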
Abstract
Probes are models devised to investigate the encoding of knowledge -- e.g. syntactic structure -- in contextual representations. Probes are often designed for simplicity, which has led to restrictions on probe design that may not allow for the full exploitation of the structure of encoded information; one such restriction is linearity. We examine the case of a structural probe (Hewitt and Manning, 2019), which aims to investigate the encoding of syntactic structure in contextual representations through learning only linear transformations. By observing that the structural probe learns a metric, we are able to kernelize it and develop a novel non-linear variant with an identical number of parameters. We test on 6 languages and find that the radial-basis function (RBF) kernel, in conjunction with regularization, achieves a statistically significant improvement over the baseline in all…
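The kernelization step described in the abstract can be illustrated with a minimal sketch. It assumes the linear structural probe scores a pair of word vectors by the squared norm of their projected difference, and that the kernelized variant replaces the inner product with an RBF kernel applied to the projected vectors. The function names, the bandwidth parameter `sigma`, and the exact form of the kernelized distance are assumptions made for illustration, not details taken from the paper, and the regularization term is not shown.

```python
import numpy as np

def linear_probe_sq_dist(B, h_i, h_j):
    """Squared distance under a linear probe: ||B (h_i - h_j)||^2."""
    diff = B @ (h_i - h_j)
    return float(diff @ diff)

def rbf_kernel(x, y, sigma=1.0):
    """RBF kernel exp(-||x - y||^2 / (2 sigma^2)); sigma is an assumed hyperparameter."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

def kernel_probe_sq_dist(B, h_i, h_j, sigma=1.0):
    """Kernelized squared distance k(x, x) - 2 k(x, y) + k(y, y) on projected vectors."""
    x, y = B @ h_i, B @ h_j
    return rbf_kernel(x, x, sigma) - 2.0 * rbf_kernel(x, y, sigma) + rbf_kernel(y, y, sigma)

# Toy data: random stand-ins for contextual representations (not real model outputs).
rng = np.random.default_rng(0)
B = rng.normal(size=(4, 8))        # the only learned parameters in both variants
h_i = rng.normal(size=8)
h_j = rng.normal(size=8)

lin = linear_probe_sq_dist(B, h_i, h_j)
ker = kernel_probe_sq_dist(B, h_i, h_j, sigma=1.0)
```

In this sketch both variants share the single projection matrix `B`, which is one way to read the abstract's "identical number of parameters", provided `sigma` is treated as a fixed hyperparameter rather than a learned weight. With the RBF kernel the squared distance simplifies to `2 - 2 * exp(-lin / (2 * sigma**2))`, so it is a bounded, monotone, non-linear function of the linear probe's distance.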
