On the Adversarial Robustness of Multi-Modal Foundation Models

Christian Schlarmann; Matthias Hein

arXiv:2308.10741 · cs.LG · August 22, 2023 · 5 cites

TL;DR

This paper demonstrates that imperceivable adversarial attacks on multi-modal foundation models can mislead honest users, highlighting the need for robust defenses against such attacks in deployed systems.

Contribution

The paper shows that imperceivable adversarial perturbations of input images can change the caption output of multi-modal foundation models, a vulnerability that malicious content providers can use against honest users.

Findings

01

Imperceivable attacks on images can alter a model's caption output, misleading users.

02

Malicious content providers can exploit these attacks to guide users to malicious websites or to broadcast fake information.

03

Countermeasures against adversarial attacks should be used by any deployed multi-modal foundation model.

Abstract

Multi-modal foundation models combining vision and language models such as Flamingo or GPT-4 have recently gained enormous interest. Alignment of foundation models is used to prevent models from providing toxic or harmful output. While malicious users have successfully tried to jailbreak foundation models, an equally important question is if honest users could be harmed by malicious third-party content. In this paper we show that imperceivable attacks on images in order to change the caption output of a multi-modal foundation model can be used by malicious content providers to harm honest users e.g. by guiding them to malicious websites or broadcast fake information. This indicates that countermeasures to adversarial attacks should be used by any deployed multi-modal foundation model.
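
The kind of attack the abstract describes can be illustrated with a toy sketch of a targeted, norm-bounded image perturbation. This is not the authors' implementation, and this page does not state their threat model or perturbation budget. The linear `target_score` below is a hypothetical stand-in for how strongly a captioning model favors an attacker-chosen caption, and the l-infinity bound `epsilon = 4/255`, the step size, and the step count are arbitrary assumptions chosen only for illustration.

```python
# Toy sketch of a targeted l-infinity attack using projected sign-gradient steps.
# All names and values are illustrative assumptions, not taken from the paper.

def target_score(weights, image):
    """Hypothetical stand-in for a model: higher means the attacker's
    target caption is more likely. Here it is just a linear function."""
    return sum(w * x for w, x in zip(weights, image))

def score_gradient(weights, image):
    """Gradient of the linear score with respect to the image pixels.
    For a real model this would come from automatic differentiation."""
    return list(weights)

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def linf_attack(weights, image, epsilon, step_size, steps):
    """Increase target_score while staying within epsilon of the original
    image in l-infinity norm and inside the valid pixel range [0, 1]."""
    adv = list(image)
    for _ in range(steps):
        grad = score_gradient(weights, adv)
        # Ascent step along the sign of the gradient.
        adv = [a + step_size * sign(g) for a, g in zip(adv, grad)]
        # Project back into the epsilon-ball around the original image.
        adv = [min(max(a, x - epsilon), x + epsilon) for a, x in zip(adv, image)]
        # Clip to the valid pixel range.
        adv = [min(max(a, 0.0), 1.0) for a in adv]
    return adv

# A four-"pixel" image with values in [0, 1] and assumed model weights.
image = [0.2, 0.5, 0.9, 0.0]
weights = [1.0, -2.0, 0.5, -1.0]
epsilon = 4 / 255  # assumed budget, small relative to the pixel range

adv_image = linf_attack(weights, image, epsilon, step_size=1 / 255, steps=10)
max_change = max(abs(a - x) for a, x in zip(adv_image, image))
```

In this toy setting every pixel moves by at most `epsilon`, yet the score of the target output goes up. With a real multi-modal model the gradient would be taken through the vision encoder and language model rather than through a fixed weight vector.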

Code & Models

Repositories

chs20/robustvlm (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · COVID-19 diagnosis using AI

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Adam · Label Smoothing · Layer Normalization · Softmax · Dense Connections