Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity
Katharina Hämmerl, Alina Fastowski, Jindřich Libovický, Alexander Fraser

TL;DR
This paper investigates the anisotropy and outlier dimensions in multilingual language models, exploring how these factors affect cross-lingual semantic similarity and proposing space transformations to improve representation quality.
Contribution
It analyzes outlier dimensions in multilingual models and demonstrates that simple embedding space transformations can enhance isotropy and performance without fine-tuning.
Findings
Transformations such as removing outlier dimensions improve isotropy.
Fine-tuned models exhibit more isotropic representations.
Space transformations can recover part of the performance gap to fine-tuned sentence transformers without additional training.
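
To make the link between outlier dimensions and anisotropy concrete, here is a minimal sketch on synthetic data. It is not the authors' code. The anisotropy proxy (mean pairwise cosine similarity), the 3-standard-deviation outlier rule, the zeroing-out transformation, and all function names are assumptions chosen for illustration; the summary above does not specify which measures or transformations the paper uses.

```python
import numpy as np


def mean_cosine_similarity(x):
    """Average cosine similarity over all distinct pairs of rows.

    Values near 0 suggest an isotropic space; values near 1 suggest
    that all vectors point in roughly the same direction.
    """
    normed = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = x.shape[0]
    # Exclude the diagonal (self-similarity is always 1).
    return (sims.sum() - np.trace(sims)) / (n * (n - 1))


def find_outlier_dims(x, num_std=3.0):
    """Hypothetical rule: flag dimensions whose mean value deviates from
    the average dimension mean by more than num_std standard deviations."""
    dim_means = x.mean(axis=0)
    deviation = np.abs(dim_means - dim_means.mean())
    return np.where(deviation > num_std * dim_means.std())[0]


def remove_outlier_dims(x, dims):
    """Zero out the given dimensions (one simple possible transformation)."""
    out = x.copy()
    out[:, dims] = 0.0
    return out


# Synthetic stand-in for sentence embeddings: Gaussian noise plus one
# dimension with a large constant offset, mimicking an outlier dimension.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 64))
emb[:, 5] += 20.0

outliers = find_outlier_dims(emb)
before = mean_cosine_similarity(emb)
after = mean_cosine_similarity(remove_outlier_dims(emb, outliers))
print("outlier dims:", outliers)
print("mean cosine similarity before:", before)
print("mean cosine similarity after:", after)
```

In this toy setting a single high-magnitude dimension dominates every cosine similarity, so removing it brings the average similarity close to zero. Real model embeddings would need the same check on actual sentence representations, and other transformations (for example per-dimension standardization) could be substituted for zeroing out.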
Abstract
Previous work has shown that the representations output by contextual language models are more anisotropic than static type embeddings, and typically display outlier dimensions. This seems to be true for both monolingual and multilingual models, although much less work has been done on the multilingual context. Why these outliers occur and how they affect the representations is still an active area of research. We investigate outlier dimensions and their relationship to anisotropy in multiple pre-trained multilingual language models. We focus on cross-lingual semantic similarity tasks, as these are natural tasks for evaluating multilingual representations. Specifically, we examine sentence representations. Sentence transformers which are fine-tuned on parallel resources (that are not always available) perform better on this task, and we show that their representations are more…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Test · Focus
