On SkipGram Word Embedding Models with Negative Sampling: Unified Framework and Impact of Noise Distributions

Dezhi Liu; Richong Zhang; Ziqiao Wang

arXiv:2009.04413 · cs.CL · December 3, 2025

TL;DR

This paper introduces a unified framework for SkipGram word embedding models with negative sampling, analyzes the effect of noise distributions, and proposes new models that outperform existing ones.

Contribution

The paper formulates the Word-Context Classification (WCC) framework, studies the impact of noise distributions on learning, and discovers novel models with improved performance.

Findings

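01. Experiments suggest that the best noise distribution is the data distribution, in terms of both embedding performance and speed of convergence during training.

02. The WCC framework generalizes SGN to a wide family of models.

03. Several newly discovered models outperform existing WCC models.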

Abstract

SkipGram word embedding models with negative sampling, or SGN for short, form an elegant family of word embedding models. In this paper, we formulate a framework for word embedding, referred to as Word-Context Classification (WCC), that generalizes SGN to a wide family of models. The framework, which uses some "noise examples", is justified through theoretical analysis. The impact of the noise distribution on the learning of WCC embedding models is studied experimentally; the results suggest that the best noise distribution is, in fact, the data distribution, in terms of both embedding performance and speed of convergence during training. Along the way, we discover several novel embedding models that outperform existing WCC models.
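For readers unfamiliar with negative sampling, the role of the noise distribution can be illustrated with a minimal NumPy sketch of the standard SGN loss for a single word-context pair. This is not the paper's WCC formulation, which the abstract does not spell out. The function names, the toy counts, and the embedding sizes are all assumptions made for illustration. The exponent `alpha` follows the common word2vec convention (0.75 by default), and `alpha=1.0` recovers the plain unigram distribution. How the paper defines "the data distribution" for noise examples is not stated in the abstract.

```python
import numpy as np

def noise_distribution(counts, alpha=0.75):
    """Unigram counts raised to `alpha`, normalized to sum to 1 (illustrative helper)."""
    weights = np.asarray(counts, dtype=np.float64) ** alpha
    return weights / weights.sum()

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def sgns_loss(word_vec, context_vec, noise_vecs):
    """Negative-sampling loss for one observed pair and k noise contexts."""
    positive = log_sigmoid(word_vec @ context_vec)          # observed pair scored as "real"
    negative = log_sigmoid(-(noise_vecs @ word_vec)).sum()  # noise pairs scored as "fake"
    return -(positive + negative)

# Toy setup: a 4-word vocabulary with made-up counts (hypothetical values).
rng = np.random.default_rng(0)
counts = [50, 30, 15, 5]
dim, k = 8, 3
word_emb = rng.normal(scale=0.1, size=(len(counts), dim))
ctx_emb = rng.normal(scale=0.1, size=(len(counts), dim))

# Draw k noise contexts from the chosen noise distribution.
probs = noise_distribution(counts, alpha=0.75)
noise_ids = rng.choice(len(counts), size=k, p=probs)
loss = sgns_loss(word_emb[0], ctx_emb[1], ctx_emb[noise_ids])
```

Changing `alpha` (or replacing `noise_distribution` altogether) changes which noise examples the classifier sees, which is the knob the abstract's experiments are concerned with.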

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis