Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks
Yiyi Chen, Russa Biswas, Heather Lent, Johannes Bjerva

TL;DR
This paper investigates the security vulnerabilities of multilingual LLMs against embedding inversion attacks across diverse languages, scripts, and language families, revealing specific vulnerabilities and patterns of language confusion that impact attack efficacy.
Contribution
The paper systematically analyzes cross-lingual and cross-script inversion attacks across 20 languages, uncovering language-specific vulnerabilities and patterns of language confusion in inversion models.
Findings
Languages written in Arabic and Cyrillic scripts are highly vulnerable to embedding inversion.
Inversion models often confuse languages, reducing attack success.
Languages in certain families, such as Indo-Aryan, are more susceptible.
Abstract
Large Language Models (LLMs) are susceptible to malicious influence by cyber attackers through intrusions such as adversarial, backdoor, and embedding inversion attacks. In response, the burgeoning field of LLM Security aims to study and defend against such threats. Thus far, the majority of works in this area have focused on monolingual English models; however, emerging research suggests that multilingual LLMs may be more vulnerable to various attacks than their monolingual counterparts. While previous work has investigated embedding inversion over a small subset of European languages, it is challenging to extrapolate these findings to languages from different linguistic families and with differing scripts. To this end, we explore the security of multilingual LLMs in the context of embedding inversion attacks and investigate cross-lingual and cross-script inversion across 20 languages,…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Interpreting and Communication in Healthcare · Natural Language Processing Techniques
