Examining Multilingual Embedding Models Cross-Lingually Through LLM-Generated Adversarial Examples

Andrianos Michail; Simon Clematide; Rico Sennrich

arXiv:2502.08638·cs.CL·October 10, 2025

Open Access

TL;DR

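This paper introduces Cross-Lingual Semantic Discrimination (CLSD), a lightweight evaluation task that uses LLM-generated adversarial distractors to test whether cross-lingual embedding models can tell a true parallel sentence apart from lexically similar but semantically misleading alternatives. The evaluation exposes differences in how models respond to linguistic perturbations and in how they transfer across languages.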

Contribution

The paper proposes CLSD, an evaluation task for cross-lingual embedding models that needs only parallel sentences and an LLM to generate adversarial distractors, and it reports how model performance and sensitivity to linguistic perturbations differ across models.

Findings

01

Models fine-tuned for retrieval benefit from pivoting through English.

02

Bitext mining models excel in direct cross-lingual settings.

03

Embedding models show varying sensitivity to linguistic perturbations.

Abstract

The evaluation of cross-lingual semantic search models is often limited to existing datasets from tasks such as information retrieval and semantic textual similarity. We introduce Cross-Lingual Semantic Discrimination (CLSD), a lightweight evaluation task that requires only parallel sentences and a Large Language Model (LLM) to generate adversarial distractors. CLSD measures an embedding model's ability to rank the true parallel sentence above semantically misleading but lexically similar alternatives. As a case study, we construct CLSD datasets for German-French in the news domain. Our experiments show that models fine-tuned for retrieval tasks benefit from pivoting through English, whereas bitext mining models perform best in direct cross-lingual settings. A fine-grained similarity analysis further reveals that embedding models differ in their sensitivity to linguistic perturbations.…
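
The ranking criterion described in the abstract can be illustrated with a minimal sketch. It assumes that sentence embeddings have already been computed by some model, that similarity is measured with cosine similarity, and that the score is the fraction of source sentences whose true parallel sentence outranks every distractor. The function names, the array shapes, and the choice of top-1 accuracy are assumptions made for illustration; the abstract does not specify the paper's exact metric or the number of distractors.

```python
import numpy as np


def cosine_similarities(query, candidates):
    """Cosine similarity between one query vector and each candidate row."""
    query = query / np.linalg.norm(query)
    candidates = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return candidates @ query


def clsd_accuracy(source_embs, target_embs, distractor_embs):
    """Fraction of examples where the true parallel sentence ranks first.

    source_embs:     (n, d) embeddings of source-language sentences
    target_embs:     (n, d) embeddings of the true parallel sentences
    distractor_embs: (n, k, d) embeddings of k adversarial distractors each
    (Hypothetical layout, chosen for this sketch.)
    """
    hits = 0
    for src, tgt, distractors in zip(source_embs, target_embs, distractor_embs):
        target_sim = cosine_similarities(src, tgt[None, :])[0]
        distractor_sims = cosine_similarities(src, distractors)
        # Count a hit only if the true parallel strictly beats every distractor.
        if target_sim > distractor_sims.max():
            hits += 1
    return hits / len(source_embs)
```

To use the sketch, embed each source sentence, its true translation, and its distractors with the model under test, stack the vectors into the three arrays, and call `clsd_accuracy`. Requiring a strict win over the best distractor means that ties count as failures, which is a conservative choice made here rather than something stated in the source.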

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Misinformation and Its Impacts

Methods: Sparse Evolutionary Training