Targeted Downstream-Agnostic Attack

Zhuxin Lei; Ziyuan Yang; Yi Zhang

arXiv:2605.19446·cs.CV·May 20, 2026

TL;DR

This paper introduces a targeted downstream-agnostic attack (TDAA) on pre-trained encoders. The attack combines example-specific perturbations with an attacker-chosen threat image, and reveals encoder vulnerabilities across multiple datasets and models.

Contribution

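The paper proposes a targeted downstream-agnostic attack (DAA) built on example-specific perturbations and a threat image. It improves attack success and invisibility under a stricter threat model, one that requires the attack to be both targeted and downstream-agnostic.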

Findings

01. Effective across 10 self-supervised methods and 3 datasets

02. High attack success rate and invisibility achieved

03. Reveals significant vulnerabilities of pre-trained encoders

Abstract

Recently, pre-trained encoders have gained widespread use due to their strong capability in representation extraction. However, they are vulnerable to downstream-agnostic attacks (DAAs). Existing DAA methods operate under a permissive threat model, where an attack is successful if the generated downstream-agnostic adversarial examples (DAEs) change the original prediction, without requiring a specific target. In this paper, we propose a Targeted DAA (TDAA) method under a stricter threat model requiring the attack to be both targeted and downstream-agnostic. Since the downstream task is unknown and encoders do not directly produce predictions, achieving a targeted attack is particularly challenging. To address this, we introduce a novel component termed the 'threat image', pre-selected by the attacker as the target. Specifically, a generator is designed to produce example-specific…
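
To make the idea of targeting a threat image in feature space concrete, here is a minimal sketch. It is not the paper's method: the abstract describes a generator that produces example-specific perturbations, whereas this toy swaps in a plain projected sign-gradient loop. The linear `encode` function, the matrix `W`, the squared-distance objective, and the parameters `eps`, `step`, and `iters` are all assumptions chosen for illustration.

```python
def encode(W, x):
    """Toy linear stand-in for a pre-trained encoder: feature = W @ x (assumption)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def sign(v):
    return (v > 0) - (v < 0)


def targeted_feature_attack(W, x, threat, eps=0.1, step=0.01, iters=50):
    """Search for delta with max |delta_j| <= eps that pulls encode(x + delta)
    toward encode(threat). Returns the best delta seen during the search."""
    target_feat = encode(W, threat)
    delta = [0.0] * len(x)
    best_delta, best_loss = delta, sq_dist(encode(W, x), target_feat)
    for _ in range(iters):
        adv = [xi + di for xi, di in zip(x, delta)]
        diff = [f - t for f, t in zip(encode(W, adv), target_feat)]
        # Gradient of ||W adv - target||^2 with respect to adv is 2 * W^T diff.
        grad = [2 * sum(W[i][j] * diff[i] for i in range(len(W)))
                for j in range(len(x))]
        # Signed descent step, then projection back onto the L-infinity ball.
        delta = [max(-eps, min(eps, di - step * sign(g)))
                 for di, g in zip(delta, grad)]
        adv = [xi + di for xi, di in zip(x, delta)]
        loss = sq_dist(encode(W, adv), target_feat)
        if loss < best_loss:
            best_delta, best_loss = delta, loss
    return best_delta


# Hypothetical 4-pixel "images" and a 2-dimensional feature space.
W = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]
x = [0.2, 0.2, 0.2, 0.2]        # clean example
threat = [0.8, 0.1, 0.8, 0.1]   # attacker-chosen threat image

delta = targeted_feature_attack(W, x, threat)
adv = [xi + di for xi, di in zip(x, delta)]
before = sq_dist(encode(W, x), encode(W, threat))
after = sq_dist(encode(W, adv), encode(W, threat))
print(f"feature distance to threat image: {before:.3f} -> {after:.3f}")
```

The sketch never consults a downstream classifier: it only moves the encoder's output toward the threat image's features, which is one plausible reading of how an attack can be targeted while the downstream task stays unknown. The bound `eps` stands in for the invisibility constraint. A realistic version would also clip `x + delta` to the valid pixel range and use a real encoder.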
