Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks

Yiyi Chen; Russa Biswas; Heather Lent; Johannes Bjerva

arXiv:2408.11749 · cs.CL · December 17, 2024

TL;DR

This paper investigates the security vulnerabilities of multilingual LLMs against embedding inversion attacks across diverse languages, scripts, and language families, revealing specific vulnerabilities and patterns of language confusion that impact attack efficacy.

Contribution

It systematically analyzes cross-lingual and cross-script inversion attacks on 20 languages, uncovering language-specific vulnerabilities and patterns of model confusion, advancing understanding of multilingual LLM security.

Findings

1. Languages in Arabic and Cyrillic scripts are highly vulnerable.
2. Inversion models often confuse languages, reducing attack success.
3. Certain language families, like Indo-Aryan, are more susceptible.

Abstract

Large Language Models (LLMs) are susceptible to malicious influence by cyber attackers through intrusions such as adversarial, backdoor, and embedding inversion attacks. In response, the burgeoning field of LLM Security aims to study and defend against such threats. Thus far, the majority of works in this area have focused on monolingual English models; however, emerging research suggests that multilingual LLMs may be more vulnerable to various attacks than their monolingual counterparts. While previous work has investigated embedding inversion over a small subset of European languages, it is challenging to extrapolate these findings to languages from different linguistic families and with differing scripts. To this end, we explore the security of multilingual LLMs in the context of embedding inversion attacks and investigate cross-lingual and cross-script inversion across 20 languages,…

Code & Models

Repositories

siebeniris/vec2text_exp (PyTorch, official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Interpreting and Communication in Healthcare · Natural Language Processing Techniques