Benchmarking Cross-Lingual Semantic Alignment in Multilingual Embeddings
Wen G. Gong

TL;DR
This paper introduces Semantic Affinity, a new metric for evaluating cross-lingual semantic alignment in multilingual embeddings, revealing that explicit translation supervision is essential for high-quality alignment, regardless of model scale or architecture.
Contribution
The paper presents Semantic Affinity and Semanscope, providing a new benchmark for assessing true cross-lingual semantic alignment in multilingual models.
Findings
Top BERT models achieve strong alignment with translation supervision.
LLM embeddings plateau at moderate alignment levels regardless of size.
MLM-only models fail to align despite extensive multilingual training.
Abstract
With hundreds of multilingual embedding models available, practitioners lack clear guidance on which ones provide genuine cross-lingual semantic alignment and which reach their task performance through language-specific patterns. Task-driven benchmarks (MTEB) may mask fundamental alignment shortcomings. We introduce Semantic Affinity (SA), a bounded (between 0 and 1) metric measuring the ratio of inter-lingual to intra-lingual spread using cosine distance, combined with PHATE visualization in our Semanscope framework. Benchmarking 13 models across 4 datasets (52 experiments) reveals a three-tier structure: (1) Top BERT models (LaBSE SA = 0.70, USE SA = 0.68, S-BERT SA = 0.68) achieve strong alignment via translation-pair supervision; (2) LLM embeddings plateau at SA between 0.55 and 0.61 regardless of 0.6B to 8B scale; (3) MLM-only BERT models (mBERT, XLM-R, SA < 0.50) fail despite more than 100 language…
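The abstract describes SA only in words (a bounded ratio of inter-lingual to intra-lingual spread under cosine distance) and does not give the formula, so the following is a minimal illustrative sketch and not the paper's definition. It assumes one particular bounded form, intra / (intra + inter), and the names `cosine_distance_matrix` and `semantic_affinity_sketch` are hypothetical. The inputs are assumed to be two embedding matrices whose rows are translation pairs.

```python
import numpy as np


def cosine_distance_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine distance (1 - cosine similarity) between rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    # Clip to [0, 2] to absorb tiny floating-point overshoot.
    return np.clip(1.0 - a @ b.T, 0.0, 2.0)


def semantic_affinity_sketch(src: np.ndarray, tgt: np.ndarray) -> float:
    """Assumed bounded score in [0, 1]; NOT the paper's exact SA formula.

    src[i] and tgt[i] are embeddings of the same concept in two languages.
    Requires at least two concepts.
    """
    n = src.shape[0]
    # Inter-lingual spread: mean distance between each translation pair.
    inter = float(np.mean(np.diag(cosine_distance_matrix(src, tgt))))
    # Intra-lingual spread: mean distance between distinct concepts
    # within each language, averaged over the two languages.
    off_diagonal = ~np.eye(n, dtype=bool)
    intra_src = cosine_distance_matrix(src, src)[off_diagonal].mean()
    intra_tgt = cosine_distance_matrix(tgt, tgt)[off_diagonal].mean()
    intra = float(0.5 * (intra_src + intra_tgt))
    total = intra + inter
    if total == 0.0:
        # Degenerate case: every embedding is identical.
        return 0.0
    # Close to 1 when translations sit much closer together than
    # unrelated concepts do; close to 0 in the opposite case.
    return intra / total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.normal(size=(20, 16))
    aligned = source + 0.01 * rng.normal(size=source.shape)
    misaligned = np.roll(source, 1, axis=0)  # pairs each concept with a different one
    print(semantic_affinity_sketch(source, aligned))
    print(semantic_affinity_sketch(source, misaligned))
```

Under this assumed form, a model whose translation pairs nearly coincide scores close to 1, while a model that places translations no closer than unrelated concepts scores around 0.5. The tier thresholds quoted in the abstract apply to the paper's own metric and should not be read against this sketch.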
Taxonomy
Topics: Topic Modeling · Language and cultural evolution · Natural Language Processing Techniques
