Accurate and Efficient Statistical Testing for Word Semantic Breadth

Yo Ehara

arXiv:2605.08048·cs.CL·May 11, 2026

TL;DR

This paper introduces a Householder-aligned permutation test for accurately measuring word semantic breadth, reducing false positives and improving computational efficiency in contextualized embedding analysis.

Contribution

It proposes a novel alignment-based permutation testing method that isolates dispersion differences from directional effects in word meaning analysis.
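To make the idea concrete, the following is a minimal sketch of what an alignment-based permutation test could look like. It is not the paper's implementation: the summary does not specify the dispersion statistic, the alignment target, or the test statistic, so everything here is an assumption made for illustration. Specifically, the sketch assumes that dispersion is the mean Euclidean distance to the centroid, that the Householder reflection maps the mean direction of one token cloud onto the mean direction of the other, and that the test is two-sided on the difference in dispersion. The function names (`householder`, `dispersion`, `aligned_permutation_test`) are hypothetical.

```python
import numpy as np

def householder(u, v):
    """Return the reflection matrix H with H @ u == v for unit vectors u, v."""
    w = u - v
    norm = np.linalg.norm(w)
    if norm < 1e-12:  # u and v already coincide; nothing to align
        return np.eye(u.shape[0])
    w = w / norm
    return np.eye(u.shape[0]) - 2.0 * np.outer(w, w)

def dispersion(X):
    """Assumed breadth proxy: mean Euclidean distance of rows to their centroid."""
    return float(np.mean(np.linalg.norm(X - X.mean(axis=0), axis=1)))

def aligned_permutation_test(A, B, n_perm=2000, seed=0):
    """Two-sided permutation test on dispersion after aligning B's mean direction to A's.

    A, B: arrays of shape (n_tokens, dim), one row per token embedding.
    Returns (observed dispersion difference, permutation p-value).
    """
    rng = np.random.default_rng(seed)
    mu_a, mu_b = A.mean(axis=0), B.mean(axis=0)
    u = mu_b / np.linalg.norm(mu_b)  # direction to move
    v = mu_a / np.linalg.norm(mu_a)  # target direction
    H = householder(u, v)
    # H is symmetric, so right-multiplying reflects every row of B.
    # A reflection is orthogonal, so B's own dispersion is unchanged.
    B_aligned = B @ H

    observed = dispersion(A) - dispersion(B_aligned)
    pooled = np.vstack([A, B_aligned])
    n_a = len(A)

    hits = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        diff = dispersion(pooled[idx[:n_a]]) - dispersion(pooled[idx[n_a:]])
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Because a Householder reflection is orthogonal, it changes only where the second cloud points, not how spread out it is. Under these assumptions, the pooled sample used for permutation no longer mixes two differently oriented clouds, which is the intuition behind separating dispersion from direction.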

Findings

01. Reduced Type-I error by 32.5% with the new method.

02. Achieved 23x speedup over CPU baseline.

03. Improved accuracy in detecting genuine semantic breadth differences.

Abstract

Measuring the breadth of a word's meaning, or its spread across contexts, has become feasible with contextualized token embeddings. A word type can be represented as a cloud of token vectors, with dispersion-based statistics serving as proxies for contextual diversity (Nagata and Tanaka-Ishii, ACL2025). These measurements are useful for deciding appropriate sense distinctions when constructing thesauri and domain-specific dictionaries. However, when comparing the breadth of two word types, naive hypothesis testing on dispersion can be misleading: differences in semantic direction can masquerade as dispersion differences, inflating Type-I error and yielding "statistically significant" outcomes even when there is no true breadth difference. This is problematic because significance testing should distinguish genuine effects from incidental fluctuations in small-difference regimes. We…
