A comparative evaluation and analysis of three generations of Distributional Semantic Models
Alessandro Lenci, Magnus Sahlgren, Patrick Jeuniaux, Amaru Cuba Gyllensten, Martina Miliani

TL;DR
This paper compares three generations of Distributional Semantic Models, revealing that static models often outperform contextualized ones in semantic tasks and highlighting factors influencing their behavior through statistical and neuroscientific analyses.
Contribution
It provides a comprehensive evaluation of static and contextualized DSMs, challenging assumptions about the superiority of predict models and analyzing semantic space representations.
Findings
Static DSMs outperform contextualized models in many semantic tasks.
Predict models' supposed superiority is less evident and not universal.
Representational Similarity Analysis (RSA) shows that differences between the models' semantic spaces depend on word frequency and part of speech.
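The RSA comparison mentioned in the findings can be illustrated with a minimal sketch. It assumes each model is given as a matrix with one row per word, in the same word order, and it uses cosine similarity with a Pearson correlation; the function names and these choices are illustrative assumptions, not the authors' exact setup.

```python
import numpy as np

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarities between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / norms
    return unit @ unit.T

def rsa_score(space_a, space_b):
    """Correlate the pairwise-similarity structure of two semantic spaces.

    Both inputs hold one row per word, in the same word order.
    The two spaces may have different dimensionalities.
    Pearson correlation is an assumption made for this sketch.
    """
    sim_a = cosine_similarity_matrix(space_a)
    sim_b = cosine_similarity_matrix(space_b)
    # Use only the upper triangle: the matrices are symmetric and the
    # diagonal (self-similarity) carries no information.
    upper = np.triu_indices(sim_a.shape[0], k=1)
    return float(np.corrcoef(sim_a[upper], sim_b[upper])[0, 1])
```

A score near 1 means the two models arrange the same words in a similar way, even if their individual vector dimensions are not comparable. Computing the score on subsets of words (for example, only high-frequency words or only verbs) is one way to probe frequency and part-of-speech effects.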
Abstract
Distributional semantics has deeply changed in the last decades. First, predict models stole the thunder from traditional count ones, and more recently both of them were replaced in many NLP applications by contextualized vectors produced by Transformer neural language models. Although an extensive body of research has been devoted to Distributional Semantic Model (DSM) evaluation, we still lack a thorough comparison with respect to tested models, semantic tasks, and benchmark datasets. Moreover, previous work has mostly focused on task-driven evaluation, instead of exploring the differences between the way models represent the lexical semantic space. In this paper, we perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT. First of all, we investigate the performance of…
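The idea of obtaining type vectors by averaging contextualized vectors can be sketched as follows. This is a toy illustration: the input is assumed to be a list of (word, vector) pairs, one per occurrence of the word in a corpus, with small hand-written vectors standing in for BERT outputs. The helper name is hypothetical, and details such as layer selection and subword handling are not covered in this excerpt.

```python
from collections import defaultdict

import numpy as np

def average_type_vectors(token_vectors):
    """Build one vector per word type by averaging its token vectors.

    `token_vectors` is an iterable of (word, vector) pairs, one pair
    per occurrence of the word in context.
    """
    grouped = defaultdict(list)
    for word, vec in token_vectors:
        grouped[word].append(np.asarray(vec, dtype=float))
    return {word: np.mean(vecs, axis=0) for word, vecs in grouped.items()}

# Toy stand-ins for contextualized vectors (not real model outputs).
occurrences = [
    ("bank", [1.0, 0.0]),
    ("bank", [3.0, 2.0]),
    ("river", [0.0, 4.0]),
]
type_vectors = average_type_vectors(occurrences)
```

The resulting dictionary holds a single static-style vector per word, which makes the contextualized model directly comparable with count and predict DSMs on type-level tasks.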
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Linear Warmup With Linear Decay · Attention Dropout · WordPiece · Dropout · Weight Decay
