Reading Isn't Believing: Adversarial Attacks On Multi-Modal Neurons
David A. Noever, Samantha E. Miller Noever

TL;DR
This paper explores new adversarial attack methods on multi-modal neural networks like CLIP, revealing vulnerabilities where conflicting text and images can cause false classifications, highlighting the model's reliance on reading over trusting visual evidence.
Contribution
It introduces novel adversarial attack categories on multi-modal models, demonstrating their susceptibility to confusing inputs and exposing the 'reading isn't believing' phenomenon.
Findings
Contradictory text and image signals can fool the model.
The model tends to read first, look later, leading to false classifications.
New attack types reveal vulnerabilities in multi-modal neural networks.
Abstract
With OpenAI's publishing of their CLIP model (Contrastive Language-Image Pre-training), multi-modal neural networks now provide accessible models that combine reading with visual recognition. Their network offers novel ways to probe its dual abilities to read text while classifying visual objects. This paper demonstrates several new categories of adversarial attacks, spanning basic typographical, conceptual, and iconographic inputs generated to fool the model into making false or absurd classifications. We demonstrate that contradictory text and image signals can confuse the model into choosing false (visual) options. Like previous authors, we show by example that the CLIP model tends to read first, look later, a phenomenon we describe as reading isn't believing.
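The "read first, look later" behavior described in the abstract can be illustrated with a toy sketch of zero-shot classification by cosine similarity. This is not the authors' code and does not call CLIP. The 2-D embeddings, the labels "apple" and "pizza", the `reading_weight` blend, and the `scale` factor are all hypothetical values chosen only to show how text written in an image could pull the image embedding toward the wrong label.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_probs(image_emb, label_embs, scale=100.0):
    """Softmax over scaled cosine similarities (scale is an assumed value)."""
    logits = {name: scale * cosine(image_emb, emb) for name, emb in label_embs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {name: math.exp(v - peak) for name, v in logits.items()}
    total = sum(exps.values())
    return {name: v / total for name, v in exps.items()}

def with_text_overlay(image_emb, text_emb, reading_weight):
    """Hypothetical model of a typographic attack: blend the visual embedding
    with the embedding of the word written on the image."""
    return [(1 - reading_weight) * i + reading_weight * t
            for i, t in zip(image_emb, text_emb)]

def top_label(probs):
    """Label with the highest probability."""
    return max(probs, key=probs.get)

# Toy embeddings (assumptions, not real CLIP outputs).
label_embs = {"apple": [1.0, 0.0], "pizza": [0.0, 1.0]}
clean_image = [0.9, 0.1]  # a photo of an apple
attacked_image = with_text_overlay(clean_image, label_embs["pizza"], reading_weight=0.7)

clean_probs = zero_shot_probs(clean_image, label_embs)
attacked_probs = zero_shot_probs(attacked_image, label_embs)
print(top_label(clean_probs), top_label(attacked_probs))
```

With these toy numbers the clean image is scored as "apple", while the same image carrying the written word "pizza" is scored as "pizza". How strongly a real model weights written text over visual content is an empirical question that this sketch does not answer.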
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Explainable Artificial Intelligence (XAI)
Methods: Contrastive Language-Image Pre-training
