Tracing the Evolution of Word Embedding Techniques in Natural Language Processing

Minh Anh Nguyen, Kuheli Sai, and Minh Nguyen

arXiv:2603.13271·cs.CY·March 17, 2026

TL;DR

This paper provides a comprehensive review and bibliometric analysis of word embedding techniques in NLP over seven decades, highlighting a paradigm shift post-GPT-3 with increased industry involvement and new methods.

Contribution

It offers the first detailed methodological survey and quantitative era comparison of word embedding evolution, emphasizing the impact of GPT-3 and large language models.

Findings

01. Contextual and sentence embeddings now dominate research.

02. Mean team sizes have increased significantly post-GPT-3.

03. 30 new techniques emerged while 54 older methods declined.

Abstract

This work traces the evolution of word-embedding techniques within the natural language processing (NLP) literature. We collect and analyze 149 research articles spanning the period from 1954 to 2025, providing both a comprehensive methodological review and a data-driven bibliometric analysis of how representation learning has developed over seven decades. Our study covers four major embedding paradigms: statistical representation-based methods (one-hot encoding, bag-of-words, TF-IDF), static word embeddings (Word2Vec, GloVe, FastText), contextual word embeddings (ELMo, BERT, GPT), and sentence/document embeddings. For each category, we critically discuss its strengths, limitations, and the intellectual lineage connecting it to the others. Beyond the methodological survey, we conduct a formal era comparison using GPT-3's release as a dividing line, applying seven hypothesis tests to quantify shifts in research…
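To make the first paradigm in the abstract concrete, the following is a minimal sketch of the three statistical representations it names (one-hot encoding, bag-of-words, TF-IDF). It is not taken from the paper: the function names, the whitespace tokenizer, and the toy corpus are all hypothetical, and the TF-IDF weighting shown (relative term frequency times an unsmoothed log inverse document frequency) is only one of several common variants.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen for brevity.
    return text.lower().split()


def build_vocab(docs):
    # Sorted list of the distinct tokens across all documents.
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def one_hot(word, vocab):
    # A vector with a single 1 at the word's vocabulary position.
    return [1 if word == v else 0 for v in vocab]


def bag_of_words(doc, vocab):
    # Raw token counts per vocabulary entry; word order is discarded.
    counts = Counter(tokenize(doc))
    return [counts[v] for v in vocab]


def tf_idf(docs, vocab):
    # Illustrative variant: tf = count / doc length, idf = log(N / df).
    n_docs = len(docs)
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of documents containing each term.
    df = {v: sum(1 for toks in tokenized if v in toks) for v in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        if total == 0:
            vectors.append([0.0] * len(vocab))
            continue
        vectors.append(
            [(counts[v] / total) * math.log(n_docs / df[v]) for v in vocab]
        )
    return vectors


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    corpus = ["the cat sat", "the dog sat", "the cat ran"]
    vocab = build_vocab(corpus)
    print(vocab)
    print(one_hot("dog", vocab))
    print(bag_of_words(corpus[0], vocab))
    print(tf_idf(corpus, vocab)[1])
```

In this sketch a term that occurs in every document (here "the") receives a TF-IDF weight of zero, while a term confined to one document is weighted most heavily. All three representations depend only on counts and vocabulary positions, so none of them encodes similarity between words, which is the limitation the static and contextual embeddings listed in the abstract were developed to address.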

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques