Achieving Semantic Consistency: Contextualized Word Representations for Political Text Analysis

Ruiyu Zhang; Lin Nie; Ce Zhao; Qingyang Chen

arXiv:2412.04505 · cs.CL · January 22, 2025 · 2 cites

Open Access

TL;DR

This paper compares static Word2Vec embeddings with contextual BERT embeddings for political text analysis. Across 20 years of People's Daily articles, BERT shows greater semantic stability, which makes it better suited to tasks that require consistent interpretation of meaning.

Contribution

It provides an empirical comparison showing BERT's enhanced semantic stability over Word2Vec in political text analysis across a long time span.

Findings

01 BERT outperforms Word2Vec in semantic stability.

02 BERT captures subtle semantic variations.

03 BERT is more reliable for stable semantic analysis.

Abstract

Accurately interpreting words is vital in political science text analysis; some tasks require assuming semantic stability, while others aim to trace semantic shifts. Traditional static embeddings, like Word2Vec, effectively capture long-term semantic changes but often lack stability in short-term contexts due to embedding fluctuations caused by unbalanced training data. BERT, which features a transformer-based architecture and contextual embeddings, offers greater semantic consistency, making it suitable for analyses in which stability is crucial. This study compares Word2Vec and BERT using 20 years of People's Daily articles to evaluate their performance in semantic representations across different timeframes. The results indicate that BERT outperforms Word2Vec in maintaining semantic stability and still recognizes subtle semantic variations. These findings support BERT's use in text…
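
One way to make "semantic stability" concrete is to compare a word's vector across time periods. The sketch below is a minimal illustration, not the authors' procedure: it assumes one vector per period is already available for a word (for example, a Word2Vec vector from a model trained on that period, or an average of BERT token vectors over that period's occurrences) and that the vectors share a coordinate space. The function names and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def period_to_period_similarity(vectors_by_period):
    """Similarity of one word's vector between each pair of consecutive periods.

    vectors_by_period maps a sortable period label (e.g. a year) to a vector.
    Returns a dict keyed by (earlier_period, later_period).
    """
    periods = sorted(vectors_by_period)
    return {
        (p, q): cosine_similarity(vectors_by_period[p], vectors_by_period[q])
        for p, q in zip(periods, periods[1:])
    }


# Hypothetical toy vectors for a single word in three years.
# These are illustrative values, not data from the paper.
toy_vectors = {
    2001: [1.0, 0.0, 0.0],
    2002: [1.0, 0.0, 0.0],
    2003: [0.0, 1.0, 0.0],
}

similarities = period_to_period_similarity(toy_vectors)
for (earlier, later), score in similarities.items():
    print(f"{earlier} -> {later}: {score:.3f}")
```

Values near 1 between consecutive periods indicate a stable representation; a drop suggests either a genuine semantic shift or embedding fluctuation. Static models trained independently on each period would generally need to be aligned to a common space before such a comparison is meaningful.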

Taxonomy

Topics: Computational and Text Analysis Methods

Methods: Attention Is All You Need · Softmax · Linear Layer · Linear Warmup With Linear Decay · Multi-Head Attention · Weight Decay · WordPiece · Layer Normalization · Residual Connection · Adam