TL;DR
This paper introduces a pseudoword-based method for exploring the geometry of BERT's contextualized vector space, showing how different word senses are represented and how sense regions are organized.
Contribution
The authors propose a novel approach using pseudowords to investigate the geometry of BERT's space around individual word instances, revealing regularities and sense voids.
Findings
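Regions of BERT space around individual word instances correspond to distinct word senses
Between these regions there are occasionally "sense voids": regions that do not correspond to any intelligible sense
The contextualized space shows substantial regularity for the ambiguous English words tested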
Abstract
We present a method for exploring regions around individual points in a contextualized vector space (particularly, BERT space), as a way to investigate how these regions correspond to word senses. By inducing a contextualized "pseudoword" as a stand-in for a static embedding in the input layer, and then performing masked prediction of a word in the sentence, we are able to investigate the geometry of the BERT-space in a controlled manner around individual instances. Using our method on a set of carefully constructed sentences targeting ambiguous English words, we find substantial regularity in the contextualized space, with regions that correspond to distinct word senses; but between these regions there are occasionally "sense voids" -- regions that do not correspond to any intelligible sense.
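One plausible reading of the pseudoword induction step is an optimization over an input-layer vector while the encoder stays frozen: the vector is adjusted until the encoder's output lands on a chosen target point in the contextualized space. The sketch below is a toy illustration of that idea only, not the authors' implementation. The one-layer `encode` function is a hypothetical stand-in for BERT, and `W`, `CONTEXT`, `induce_pseudoword`, and all numeric values are assumptions made for the example. The masked-prediction step is not shown.

```python
import math

# Hypothetical frozen "encoder": one linear layer plus tanh, standing in for BERT.
W = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0]]
CONTEXT = [0.2, -0.1, 0.3]  # assumed stand-in for the effect of the surrounding sentence


def encode(x):
    """Map an input-layer vector to a toy 'contextualized' vector."""
    return [math.tanh(sum(W[i][j] * x[j] for j in range(3)) + CONTEXT[i])
            for i in range(3)]


def loss(x, target):
    """Squared distance between the encoded vector and the target point."""
    return sum((o - t) ** 2 for o, t in zip(encode(x), target))


def induce_pseudoword(target, steps=1000, lr=0.1):
    """Gradient descent on the input vector only; W and CONTEXT stay fixed."""
    x = [0.0, 0.0, 0.0]
    for _ in range(steps):
        out = encode(x)
        # Gradient of the loss w.r.t. the pre-activation z = W x + CONTEXT.
        dz = [2.0 * (o - t) * (1.0 - o * o) for o, t in zip(out, target)]
        # Chain rule through the linear layer: grad_j = sum_i W[i][j] * dz[i].
        grad = [sum(W[i][j] * dz[i] for i in range(3)) for j in range(3)]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x


# Pick a reachable target point, then recover an input vector that maps to it.
target = encode([0.5, -0.3, 0.4])
pseudoword = induce_pseudoword(target)
print(loss([0.0, 0.0, 0.0], target), loss(pseudoword, target))
```

In a real setting the target would be a point in BERT space near an observed word instance, and the induced vector would replace that word's static embedding before another word in the sentence is masked and predicted.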
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Weight Decay · Residual Connection · Layer Normalization · WordPiece
