Using $k$-way Co-occurrences for Learning Word Embeddings
Danushka Bollegala, Yuichi Yoshida, Ken-ichi Kawarabayashi

TL;DR
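This paper extends word co-occurrence analysis from pairs of words to $k$-way co-occurrences. It establishes a theoretical relationship between the joint probability of $k$ words and their embeddings, and proposes a learning objective based on that relationship. The resulting embeddings perform well even with sparse data.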
Contribution
It introduces a theoretical framework and a novel learning method for $k$-way co-occurrences in word embeddings, improving performance on various tasks.
Findings
Theoretical relationship between joint probability and embeddings established.
$k$-way embeddings perform comparably to or better than 2-way embeddings.
Empirical validation of the theoretical relationship.
Abstract
Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, numerous prior works on word embedding learning have used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and to co-occur in the same context. We extend the notion of co-occurrences to cover $k$-way co-occurrences among a set of $k$ words. Specifically, we prove a theoretical relationship between the joint probability of $k$ words, and the sum of norms of their embeddings. Next, we propose a learning objective motivated by our theoretical result that utilises $k$-way co-occurrences for learning word embeddings. Our experimental results show that the derived theoretical relationship does indeed hold empirically, and despite data…
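To make the notion of a $k$-way co-occurrence concrete, here is a minimal Python sketch that counts how often each set of $k$ distinct words appears together. It is an illustration only, not the paper's counting procedure: it assumes the co-occurrence context is a whole sentence (the paper may use a different context, such as a fixed window), and the function name `count_k_way` and the toy corpus are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_k_way(sentences, k):
    """Count how many sentences contain each set of k distinct words.

    Assumption: the co-occurrence context is the whole sentence.
    Each k-subset is stored as an alphabetically sorted tuple, so the
    order in which the words appear in the sentence does not matter.
    """
    counts = Counter()
    for sentence in sentences:
        vocab = sorted(set(sentence))  # distinct words, canonical order
        for combo in combinations(vocab, k):
            counts[combo] += 1
    return counts


# Toy corpus (hypothetical), already tokenised.
corpus = [
    ["the", "cat", "sat"],
    ["the", "cat", "ran"],
    ["a", "dog", "sat"],
]

pair_counts = count_k_way(corpus, 2)    # ordinary 2-way co-occurrences
triple_counts = count_k_way(corpus, 3)  # 3-way co-occurrences
```

On this toy corpus the pair ("cat", "the") is seen in two sentences, while every triple is seen in only one. That hints at the sparsity problem mentioned above: the number of possible $k$-subsets grows quickly with $k$, so each individual subset is observed less often.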
