Adversarial Illusions in Multi-Modal Embeddings
Tingwei Zhang, Rishi Jha, Eugene Bagdasaryan, Vitaly Shmatikov

TL;DR
This paper shows that multi-modal embeddings are vulnerable to cross-modal, targeted attacks: an adversary can perturb an image or a sound so that its embedding lands close to an adversary-chosen input in another modality, misleading any downstream task built on the embedding.
Contribution
The paper introduces the concept of adversarial illusions in multi-modal embeddings, demonstrates their effectiveness and transferability, and presents the first attack on Amazon's proprietary Titan embedding.
Findings
Adversarial illusions can mislead image and audio tasks.
The attacks transfer across different embedding models.
Countermeasures and evasion strategies are analyzed.
Abstract
Multi-modal embeddings encode texts, images, thermal images, sounds, and videos into a single embedding space, aligning representations across different modalities (e.g., associate an image of a dog with a barking sound). In this paper, we show that multi-modal embeddings can be vulnerable to an attack we call "adversarial illusions." Given an image or a sound, an adversary can perturb it to make its embedding close to an arbitrary, adversary-chosen input in another modality. These attacks are cross-modal and targeted: the adversary can align any image or sound with any target of his choice. Adversarial illusions exploit proximity in the embedding space and are thus agnostic to downstream tasks and modalities, enabling a wholesale compromise of current and future tasks, as well as modalities not available to the adversary. Using ImageBind and AudioCLIP embeddings, we demonstrate how…
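The core idea in the abstract, perturbing an input until its embedding sits close to a target from another modality, can be sketched as a small optimization loop. The following is a minimal illustration, not the authors' implementation: the linear encoder `W`, the function `illusion`, the bound `eps`, and the step size are all hypothetical stand-ins, and a random vector plays the role of the target's embedding. It runs sign-gradient ascent on cosine similarity while keeping the perturbation inside an L-infinity ball.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def illusion(x, target_emb, W, eps=0.1, step=0.01, iters=200):
    """Perturb x so that the toy embedding W @ x moves toward target_emb.

    Assumption: the encoder is a plain linear map W, so the gradient of the
    cosine similarity can be written by hand. A real attack would instead
    backpropagate through the actual embedding model.
    """
    t = target_emb / np.linalg.norm(target_emb)
    delta = np.zeros_like(x)
    for _ in range(iters):
        z = W @ (x + delta)
        n = np.linalg.norm(z)
        u = z / n
        # Gradient of cos(z, t) with respect to z, then chained through W.
        grad_z = (t - (u @ t) * u) / n
        grad_x = W.T @ grad_z
        # Signed step, then projection back into the L-infinity ball.
        delta = np.clip(delta + step * np.sign(grad_x), -eps, eps)
    return x + delta

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64))      # hypothetical "image" encoder
x = rng.standard_normal(64)            # hypothetical clean input
target_emb = rng.standard_normal(16)   # stand-in for a target embedding
eps = 0.1

before = cosine(W @ x, target_emb)
x_adv = illusion(x, target_emb, W, eps=eps)
after = cosine(W @ x_adv, target_emb)
print(f"cosine similarity to target: before={before:.3f}, after={after:.3f}")
```

Because the loop only optimizes proximity in the embedding space and never looks at a downstream task, the sketch also illustrates why the abstract describes such attacks as task-agnostic.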
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Bacillus and Francisella bacterial research
Methods: ALIGN
