A Generative Adversarial Approach to Adversarial Attacks Guided by Contrastive Language-Image Pre-trained Model
Sampriti Soor, Alik Pramanick, Jothiprakash K, Arijit Sur

TL;DR
This paper introduces a novel generative adversarial attack method leveraging CLIP to craft highly effective, visually imperceptible adversarial examples that deceive models while maintaining image fidelity.
Contribution
It presents a new attack approach combining CLIP-guided loss with perturbation strategies, enhancing attack effectiveness and visual similarity in multi-object scenes.
Findings
Achieves competitive or superior attack success rates.
Maintains high structural similarity and visual fidelity.
Effective across diverse black-box models.
Abstract
The rapid growth of deep learning has produced powerful models that can handle a variety of tasks, such as recognizing images and understanding language. However, adversarial attacks, which introduce alterations that go unnoticed by human observers, can deceive these models and lead to inaccurate predictions. In this paper, a generative adversarial attack method is proposed that uses the CLIP model to create highly effective and visually imperceptible adversarial perturbations. The CLIP model's ability to align text and image representations makes it possible to incorporate natural language semantics through a guided loss, generating effective adversarial examples that look identical to the original inputs. This integration allows extensive scene manipulation, creating perturbations in multi-object environments specifically designed to deceive multilabel classifiers. Our approach integrates the concentrated perturbation strategy from Saliency-based…
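The abstract does not give the exact form of the CLIP-guided loss, so the following is only a minimal sketch of one plausible shape for such a loss, under stated assumptions. It assumes image and text embeddings have already been produced by a CLIP-style encoder (replaced here by small hand-written vectors), and that the attacker wants the adversarial image embedding to move away from the text describing the true scene and toward a decoy text. The names `clip_guided_loss`, `true_text`, and `decoy_text` are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clip_guided_loss(image_emb, true_text_emb, decoy_text_emb):
    """Hypothetical guided loss (an assumption, not the paper's formula).

    Lower values mean the image embedding is closer to the decoy text
    and farther from the text describing the true scene.
    """
    return (cosine_similarity(image_emb, true_text_emb)
            - cosine_similarity(image_emb, decoy_text_emb))


# Toy 2-D stand-ins for CLIP embeddings (assumed, for illustration only).
true_text = [1.0, 0.0]    # embedding of the true scene description
decoy_text = [0.0, 1.0]   # embedding of the misleading description
clean_image = [1.0, 0.0]  # clean image aligns with the true text
adv_image = [0.0, 1.0]    # perturbed image aligns with the decoy text

loss_clean = clip_guided_loss(clean_image, true_text, decoy_text)
loss_adv = clip_guided_loss(adv_image, true_text, decoy_text)
print(loss_clean, loss_adv)
```

In a real attack, a generator would be trained to minimize a term of this kind together with a fidelity term that keeps the perturbed image visually close to the original; both the weighting and the fidelity measure are left unspecified here because the abstract does not state them.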
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Hate Speech and Cyberbullying Detection
