Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images

Arka Daw; Megan Hong-Thanh Chung; Maria Mahbub; Amir Sadovnik

arXiv:2410.13010·cs.LG·October 18, 2024

Open Access

TL;DR

This paper introduces HiPS attacks, a class of adversarial attacks that subtly modify a model's predictions so that targeted objects are concealed, for example dropped from generated captions, as if they were absent from the scene.

Contribution

The paper presents two HiPS attack variants, HiPS-cls and HiPS-cap, that covertly hide target objects in images, revealing vulnerabilities in multi-modal models like CLIP and in their downstream captioning systems.

Findings

01

HiPS attacks successfully remove targeted objects from image captions.

02

The attacks transfer effectively to downstream captioning models.

03

The attacks make only subtle modifications to the output, of the kind that could go unnoticed by downstream models or even by humans.

Abstract

Machine learning models are known to be vulnerable to adversarial attacks, but traditional attacks have mostly focused on single modalities. With the rise of large multi-modal models (LMMs) like CLIP, which combine vision and language capabilities, new vulnerabilities have emerged. However, prior work on multimodal targeted attacks aims to completely change the model's output to what the adversary wants. In many realistic scenarios, an adversary might seek to make only subtle modifications to the output, so that the changes go unnoticed by downstream models or even by humans. We introduce Hiding-in-Plain-Sight (HiPS) attacks, a novel class of adversarial attacks that subtly modifies model predictions by selectively concealing target object(s), as if the target object were absent from the scene. We propose two HiPS attack variants, HiPS-cls and HiPS-cap, and demonstrate their effectiveness…
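
To make the idea of concealing one object while leaving the rest of the prediction intact more concrete, here is a minimal toy sketch. It is not the paper's HiPS-cls or HiPS-cap formulation, which the abstract does not spell out, and it does not use CLIP: a fixed random linear map stands in for the image-text similarity, and a signed-gradient (PGD-style) update searches for a small perturbation that lowers one label's score while penalising drift in the others. The names `hide_label`, `label_dirs`, `eps`, `step`, and `lam`, the label list, and the loss itself are all assumptions made for illustration.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def scores(label_dirs, image):
    """Toy stand-in for image-text similarity: one dot product per label."""
    return [dot(direction, image) for direction in label_dirs]


def hide_label(label_dirs, image, target, eps=0.03, step=0.005, iters=40, lam=5.0):
    """Signed-gradient search inside an L-infinity ball of radius eps.

    Assumed toy loss: score[target] + lam * sum over other labels of drift**2,
    where drift is the change in that label's score relative to the clean image.
    """
    clean = scores(label_dirs, image)
    delta = [0.0] * len(image)
    for _ in range(iters):
        adv = [x + d for x, d in zip(image, delta)]
        drift = [s - c for s, c in zip(scores(label_dirs, adv), clean)]
        # Gradient of the toy loss with respect to the perturbation.
        grad = list(label_dirs[target])
        for i, direction in enumerate(label_dirs):
            if i == target:
                continue
            for j, w in enumerate(direction):
                grad[j] += 2.0 * lam * drift[i] * w
        # Signed descent step, then projection back into the eps-ball.
        delta = [
            max(-eps, min(eps, d - step * ((g > 0) - (g < 0))))
            for d, g in zip(delta, grad)
        ]
    return delta


# Hypothetical setup: four labels, a 64-dimensional "image", fixed seed.
rng = random.Random(0)
dim, labels = 64, ["dog", "person", "bench", "tree"]
label_dirs = [[rng.gauss(0, 1) / dim ** 0.5 for _ in range(dim)] for _ in labels]
image = [rng.random() for _ in range(dim)]

target = labels.index("dog")
delta = hide_label(label_dirs, image, target)

clean = scores(label_dirs, image)
adv = scores(label_dirs, [x + d for x, d in zip(image, delta)])
target_drop = clean[target] - adv[target]
max_other_drift = max(
    abs(a - c) for i, (a, c) in enumerate(zip(adv, clean)) if i != target
)
print(f"target score drop: {target_drop:.4f}")
print(f"largest drift among other labels: {max_other_drift:.4f}")
```

The penalty weight `lam` trades off how strongly the target's score is pushed down against how closely the other labels' scores are held to their clean values; a real attack on CLIP would replace the linear map with the model's encoders and backpropagate through them.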

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI

Methods: Contrastive Language-Image Pre-training