Rotations and Interpretability of Word Embeddings: the Case of the Russian Language

Alexey Zobnin

arXiv:1707.04662·cs.CL·May 28, 2019

TL;DR

This paper shows that orthogonal transformations derived from SVD, which leave the cosine similarities between word vectors unchanged, can make the components of Russian word embeddings more interpretable and more stable under re-learning.

Contribution

It demonstrates that canonical orthogonal transformations obtained from SVD can increase both the interpretability and the stability of components in publicly available Russian word embeddings (RusVectores, fastText, RDT).

Findings

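01. Orthogonal transformations of the embedding space can make individual components more interpretable.

02. The same transformations make components more stable when the model is re-learned.

03. The approach is studied on several publicly available Russian embedding models (RusVectores, fastText, RDT).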

Abstract

Consider a continuous word embedding model. Usually, the cosines between word vectors are used as a measure of similarity of words. These cosines do not change under orthogonal transformations of the embedding space. We demonstrate that, using some canonical orthogonal transformations from SVD, it is possible both to increase the meaning of some components and to make the components more stable under re-learning. We study the interpretability of components for publicly available models for the Russian language (RusVectores, fastText, RDT).
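The invariance claim in the abstract can be checked numerically. The following is a minimal sketch, not the paper's exact procedure: it assumes a random matrix `X` stands in for a real embedding table (rows are word vectors), and it takes the right singular vectors of `X` as one possible SVD-derived rotation. The helper names `cosine_matrix` and `svd_rotation` are illustrative only.

```python
import numpy as np


def cosine_matrix(X):
    """Pairwise cosine similarities between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / norms
    return Xn @ Xn.T


def svd_rotation(X):
    """Return an orthogonal d x d matrix built from the SVD of X.

    With X = U @ diag(s) @ Vt, the matrix V = Vt.T is orthogonal,
    and X @ V equals U @ diag(s).
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T


# Assumption: a toy "embedding" of 50 words in 8 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))

Q = svd_rotation(X)
X_rot = X @ Q  # same words, rotated coordinate system

# Q is orthogonal, so all pairwise cosines are preserved.
is_orthogonal = np.allclose(Q.T @ Q, np.eye(X.shape[1]))
cosines_preserved = np.allclose(cosine_matrix(X), cosine_matrix(X_rot))

# In the rotated basis the components (columns) are mutually orthogonal.
gram = X_rot.T @ X_rot
columns_orthogonal = np.allclose(gram, np.diag(np.diag(gram)))

print(is_orthogonal, cosines_preserved, columns_orthogonal)
```

Because the rotation changes only the coordinate system, any downstream use of the model that relies on cosine similarity is unaffected; what changes is which directions the individual components correspond to, and that is what an interpretability analysis of components would inspect.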

Taxonomy

Methods: Interpretability · fastText