Cross-lingual Similarity of Multilingual Representations Revisited

Maksym Del; Mark Fishel

arXiv:2212.01924 · cs.CL · December 6, 2022 · 1 citation


Open Access · 1 Repo

TL;DR

This paper critically examines existing similarity measures for cross-lingual representations (CKA and variants of CCA), introduces a new metric, Average Neuron-Wise Correlation (ANC), and uses it to analyze transfer patterns in multilingual models.

Contribution

The paper identifies limitations of CKA and CCA for cross-lingual analysis, proposes ANC as an alternative better suited to that setting, and applies ANC to reveal transfer patterns in several multilingual models.
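For orientation, here is a minimal NumPy sketch of linear CKA (centered kernel alignment) in its standard feature-space form. It is not taken from the paper or its repository; the function name `linear_cka` and the input layout (rows are parallel sentences, columns are neurons) are assumptions made for illustration.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_sentences, n_neurons).

    Standard formulation: ||Yc^T Xc||_F^2 / (||Xc^T Xc||_F * ||Yc^T Yc||_F),
    where Xc and Yc are the column-centered inputs.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of rows (sentences)")

    # Center every neuron (column) over the sentence dimension.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)

    cross = np.linalg.norm(yc.T @ xc, ord="fro") ** 2
    norm_x = np.linalg.norm(xc.T @ xc, ord="fro")
    norm_y = np.linalg.norm(yc.T @ yc, ord="fro")
    return float(cross / (norm_x * norm_y))
```

Linear CKA is invariant to orthogonal transformations of either input, so shuffling the neuron order of one matrix leaves the score unchanged. Whether this is the exact limitation the paper has in mind is not stated in this summary.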

Findings

01

CKA/CCA fail to capture key aspects of cross-lingual similarity

02

ANC provides a more accurate measure for cross-lingual similarity

03

The 'first align, then predict' pattern is observed in both MLMs and CLMs, including scaled models.

Abstract

Related works used indexes like CKA and variants of CCA to measure the similarity of cross-lingual representations in multilingual language models. In this paper, we argue that assumptions of CKA/CCA align poorly with one of the motivating goals of cross-lingual learning analysis, i.e., explaining zero-shot cross-lingual transfer. We highlight what valuable aspects of cross-lingual similarity these indexes fail to capture and provide a motivating case study demonstrating the problem empirically. Then, we introduce Average Neuron-Wise Correlation (ANC) as a straightforward alternative that is exempt from the difficulties of CKA/CCA and is good specifically in a cross-lingual context. Finally, we use ANC to construct evidence that the previously introduced "first align, then predict" pattern takes place not only in masked language models (MLMs) but also in multilingual…
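The abstract names ANC but does not give its formula. The sketch below is one plausible reading, assuming ANC is the mean, over neurons, of the absolute Pearson correlation between the same neuron's activations on parallel sentences in two languages. The function name `anc`, the `eps` guard, and the use of absolute values are assumptions for illustration, not the authors' implementation (see the TartuNLP/xsim repository for that).

```python
import numpy as np


def anc(x, y, eps=1e-12):
    """Average neuron-wise correlation (illustrative sketch, not the official code).

    x, y: arrays of shape (n_sentences, n_neurons). Row i of x and row i of y
    are assumed to be representations of the same sentence in two languages.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")

    # Center each neuron (column) over the sentence dimension.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)

    # Pearson correlation between neuron j in x and the same neuron j in y.
    cov = (xc * yc).sum(axis=0)
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum(axis=0))
    corr = cov / np.maximum(denom, eps)  # constant neurons get correlation 0

    # Assumption: average the absolute correlations across neurons.
    return float(np.abs(corr).mean())
```

Under this reading, the score compares each neuron only with its counterpart at the same index, so reordering the neurons of one language changes the result.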

Code & Models

Repositories

TartuNLP/xsim (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: fail · ALIGN