IsoScore: Measuring the Uniformity of Embedding Space Utilization
William Rudman, Nate Gillman, Taylor Rayne, Carsten Eickhoff

TL;DR
IsoScore is a new metric for measuring how uniformly a set of embeddings utilizes its vector space. It addresses limitations of previous methods and is used to challenge recent conclusions in NLP.
Contribution
The paper presents IsoScore, a novel and rigorously validated tool for measuring isotropy in embedding spaces, improving upon existing methods.
Findings
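IsoScore accurately measures how uniformly variance is distributed across the dimensions of an embedding space.
Existing metrics, such as average random cosine similarity and the partition score, are not appropriate for measuring isotropy.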
Using IsoScore, some prior conclusions about embedding properties are challenged.
Abstract
The recent success of distributed word representations has led to an increased interest in analyzing the properties of their spatial distribution. Several studies have suggested that contextualized word embedding models do not isotropically project tokens into vector space. However, current methods designed to measure isotropy, such as average random cosine similarity and the partition score, have not been thoroughly analyzed and are not appropriate for measuring isotropy. We propose IsoScore: a novel tool that quantifies the degree to which a point cloud uniformly utilizes the ambient vector space. Using rigorously designed tests, we demonstrate that IsoScore is the only tool available in the literature that accurately measures how uniformly distributed variance is across dimensions in vector space. Additionally, we use IsoScore to challenge a number of recent conclusions in the NLP…
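As an illustration of one of the existing metrics named in the abstract, here is a minimal sketch of average random cosine similarity. It assumes the metric is computed as the mean cosine similarity over randomly sampled pairs of distinct points. The function name, the `num_pairs` and `seed` parameters, and the sampling scheme are hypothetical choices for this example, not the paper's reference implementation.

```python
import numpy as np


def average_random_cosine_similarity(points, num_pairs=10_000, seed=0):
    """Mean cosine similarity over randomly sampled pairs of rows.

    `points` is an (n_points, n_dims) array. The pair-sampling scheme here
    is an assumption made for illustration only.
    """
    rng = np.random.default_rng(seed)
    n = points.shape[0]

    # Draw random index pairs and drop pairs that compare a point to itself.
    i = rng.integers(0, n, size=num_pairs)
    j = rng.integers(0, n, size=num_pairs)
    keep = i != j
    a, b = points[i[keep]], points[j[keep]]

    # Cosine similarity of each sampled pair.
    dots = np.sum(a * b, axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(dots / norms))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    cloud = rng.standard_normal((2000, 16))  # zero-mean Gaussian cloud
    print(average_random_cosine_similarity(cloud))
    print(average_random_cosine_similarity(cloud + 100.0))  # same cloud, shifted
```

On the zero-mean Gaussian cloud the value is close to 0, while adding a large common offset to every point pushes it towards 1, even though the spread of the points around their mean is unchanged.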
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
