On the Information Content of Predictions in Word Analogy Tests
Jugurta Montalvão

TL;DR
This paper introduces a method to quantify the information content of analogies in word analogy tests, revealing that proximity hints are more relevant than analogies, which carry about one bit of information.
Contribution
It proposes a soft accuracy estimator that measures the relevance of analogies in terms of information content, with experimental validation on pre-trained embeddings.
Findings
Proximity hints are more relevant than analogies in analogy tests.
Analogies carry approximately one bit of information.
The method provides bias-compensated entropy estimates.
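As a rough intuition for what "one bit" means here, the sketch below computes Shannon entropy in bits and shows that halving a set of equally likely candidate answers removes exactly one bit of uncertainty. It is only an illustration of the unit of measurement: the function names and the candidate counts (8 and 4) are hypothetical, and this is not the paper's bias-compensated estimator.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def uniform(n):
    """Uniform distribution over n equally likely candidates."""
    return [1.0 / n] * n

# Hypothetical scenario: 8 equally plausible answers before a hint,
# 4 equally plausible answers after it.
before = entropy_bits(uniform(8))
after = entropy_bits(uniform(4))
gain_bits = before - after  # information carried by the hint
```

Under these assumptions, a hint worth one bit is one that rules out half of the remaining equally likely candidates.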
Abstract
An approach is proposed to quantify, in bits of information, the actual relevance of analogies in analogy tests. The main component of this approach is a soft accuracy estimator that also yields entropy estimates with compensated biases. Experimental results obtained with pre-trained GloVe 300-D vectors and two public analogy test sets show that proximity hints are much more relevant than analogies in analogy tests, from an information content perspective. Accordingly, a simple word embedding model is used to predict that analogies carry about one bit of information, which is experimentally corroborated.
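To make the setting concrete, here is a minimal sketch of how a word analogy question ("a is to b as c is to ?") is commonly answered with embeddings, next to a baseline that ignores the analogy and relies on proximity alone. Everything in it is an assumption for illustration: the 4-D vectors are hand-made toy values rather than GloVe vectors, the vector-offset rule is the usual convention rather than a procedure confirmed by this summary, and `solve_with_proximity` is only one possible reading of "proximity hints".

```python
import math

# Toy 4-D embeddings (hypothetical values, NOT GloVe vectors).
# Axes, loosely: royalty, male, female, fruit.
EMB = {
    "man":   [0.0, 1.0, 0.0, 0.0],
    "woman": [0.0, 0.0, 1.0, 0.0],
    "king":  [1.0, 1.0, 0.0, 0.0],
    "queen": [1.0, 0.0, 1.0, 0.0],
    "apple": [0.0, 0.0, 0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def nearest(target, exclude):
    """Vocabulary word most similar to target, skipping the query words."""
    candidates = [w for w in EMB if w not in exclude]
    return max(candidates, key=lambda w: cosine(EMB[w], target))

def solve_with_analogy(a, b, c):
    """Vector-offset rule: look near b - a + c."""
    target = [vb - va + vc for va, vb, vc in zip(EMB[a], EMB[b], EMB[c])]
    return nearest(target, {a, b, c})

def solve_with_proximity(a, b, c):
    """Assumed baseline: ignore the a -> b relation, look near b + c."""
    target = [vb + vc for vb, vc in zip(EMB[b], EMB[c])]
    return nearest(target, {a, b, c})

analogy_answer = solve_with_analogy("man", "woman", "king")
proximity_answer = solve_with_proximity("man", "woman", "king")
```

On this toy vocabulary both functions return "queen", which only demonstrates the mechanics; it says nothing about how much information each kind of hint carries on real test sets.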
Taxonomy
Methods: Test · GloVe Embeddings
