Distributional Analysis of Polysemous Function Words
Sebastian Padó, Daniel Hole

TL;DR
This paper shows that contextualized word embeddings can model the polysemy of function words. For the German reflexive pronoun 'sich', the embeddings capture theoretically motivated senses to the extent that those senses are reflected systematically in usage.
Contribution
It shows that contextualized embeddings can represent the polysemy of function words, challenging the traditional view that function words cannot be analyzed distributionally.
Findings
Contextualized embeddings capture multiple senses of 'sich' in German.
The senses are captured to the extent that they are mirrored systematically in linguistic usage.
Function words' polysemy can be modeled with modern distributional methods.
Abstract
In this paper, we are concerned with the phenomenon of function word polysemy. We adopt the framework of distributional semantics, which characterizes word meaning by observing occurrence contexts in large corpora and which is in principle well situated to model polysemy. Nevertheless, function words were traditionally considered as impossible to analyze distributionally due to their highly flexible usage patterns. We establish that contextualized word embeddings, the most recent generation of distributional methods, offer hope in this regard. Using the German reflexive pronoun 'sich' as an example, we find that contextualized word embeddings capture theoretically motivated word senses for 'sich' to the extent to which these senses are mirrored systematically in linguistic usage.
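As a rough illustration of what it means for embeddings to "capture" senses, the sketch below assigns an occurrence of a word to the sense whose labeled examples it most resembles, using cosine similarity to each sense centroid. This is not the authors' procedure. The 3-dimensional vectors are made-up stand-ins for contextualized embeddings of individual 'sich' tokens, and the labels sense_A and sense_B are hypothetical placeholders rather than the sense inventory used in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def nearest_sense(vector, labeled):
    """Return the sense label whose centroid is most similar to `vector`."""
    centroids = {sense: centroid(vs) for sense, vs in labeled.items()}
    return max(centroids, key=lambda sense: cosine(vector, centroids[sense]))

# Hypothetical toy data: each vector stands in for the contextualized
# embedding of one occurrence of 'sich'. Real embeddings would come from
# a pretrained language model and have hundreds of dimensions.
labeled = {
    "sense_A": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "sense_B": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}

query = [0.7, 0.3, 0.1]  # embedding of a new, unlabeled occurrence
predicted = nearest_sense(query, labeled)
print(predicted)
```

If occurrences of one sense cluster together in embedding space, a simple classifier like this separates them well. If a theoretically motivated sense is not mirrored in usage, its occurrences would not form a distinct cluster and the classifier would fail to recover it.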
Taxonomy
Topics: Linguistic research and analysis · Natural Language Processing Techniques · Linguistic Variation and Morphology
