Reading Isn't Believing: Adversarial Attacks On Multi-Modal Neurons

David A. Noever; Samantha E. Miller Noever

arXiv:2103.10480 · cs.LG · March 22, 2021 · 1 citation

Open Access

TL;DR

This paper explores new adversarial attacks on multi-modal neural networks such as CLIP. When the text and the image in an input conflict, the model can be pushed into false classifications, which shows that it relies on what it reads more than on the visual evidence.

Contribution

The paper introduces new categories of adversarial attack on multi-modal models (typographical, conceptual, and iconographic), demonstrating the models' susceptibility to contradictory inputs and exposing the "reading isn't believing" phenomenon.

Findings

01. Contradictory text and image signals can fool the model.

02. The model tends to read first and look later, leading to false classifications.

03. New attack types reveal vulnerabilities in multi-modal neural networks.

Abstract

With OpenAI's publication of its CLIP model (Contrastive Language-Image Pre-training), multi-modal neural networks now provide accessible models that combine reading with visual recognition. The network offers novel ways to probe its dual abilities to read text while classifying visual objects. This paper demonstrates several new categories of adversarial attacks, spanning basic typographical, conceptual, and iconographic inputs generated to fool the model into making false or absurd classifications. We demonstrate that contradictory text and image signals can confuse the model into choosing false (visual) options. Like previous authors, we show by example that the CLIP model tends to read first and look later, a phenomenon we describe as "reading isn't believing".
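
The "read first, look later" behavior described in the abstract can be illustrated with a toy sketch of zero-shot classification by embedding similarity. This is not the authors' code and does not call CLIP: the two-dimensional embeddings, the `dog` and `cat` labels, the `reading_weight` mixing factor, and the `scale` temperature are all assumptions chosen for illustration. The sketch only shows how a prediction can flip when text written in the image contributes more to the image embedding than the pictured object does.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probs(image_vec, label_vecs, scale=100.0):
    """Softmax over scaled cosine similarities (scale is an assumed temperature)."""
    logits = {name: scale * cosine(image_vec, vec) for name, vec in label_vecs.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - top) for name, value in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def embed_image(visual, written, reading_weight):
    """Hypothetical image embedding: a weighted mix of what is pictured
    and what is written in the image. Real encoders are not this simple."""
    return [(1 - reading_weight) * v + reading_weight * w
            for v, w in zip(visual, written)]


def predict(probs):
    """Return the label with the highest probability."""
    return max(probs, key=probs.get)


# Hypothetical label embeddings: axis 0 stands for "dog", axis 1 for "cat".
LABELS = {"dog": [1.0, 0.0], "cat": [0.0, 1.0]}

# A picture of a dog with no text in it.
clean_probs = zero_shot_probs(
    embed_image(visual=[1.0, 0.0], written=[0.0, 0.0], reading_weight=0.7),
    LABELS,
)

# The same picture with the word "cat" written on it (typographic attack).
attacked_probs = zero_shot_probs(
    embed_image(visual=[1.0, 0.0], written=[0.0, 1.0], reading_weight=0.7),
    LABELS,
)

print("clean:", predict(clean_probs))
print("attacked:", predict(attacked_probs))
```

In this toy setup the attack succeeds only because `reading_weight` is above 0.5; lowering it below 0.5 makes the pictured object win again, which is the trade-off the phrase "reading isn't believing" points at.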

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Explainable Artificial Intelligence (XAI)

Methods: Contrastive Language-Image Pre-training