Adversarial Semantic and Label Perturbation Attack for Pedestrian Attribute Recognition
Weizhe Kong, Xiao Wang, Ruichong Gao, Chenglong Li, Yu Zhang, Xing Yang, Yaowei Wang, Jin Tang

TL;DR
This paper introduces the first adversarial attack and defense framework for pedestrian attribute recognition, exploiting semantic and label perturbations to evaluate and improve model robustness across digital and physical domains.
Contribution
The paper proposes an adversarial attack and defense framework for PAR, built on a multi-modal Transformer and CLIP, to expose and reduce the vulnerability of pedestrian attribute recognition models to adversarial perturbations.
Findings
Adversarial attacks are shown to be effective against PAR models
A semantic offset defense improves robustness against these attacks
The attack and defense are validated on multiple datasets covering digital and physical domains
Abstract
Pedestrian Attribute Recognition (PAR) is an indispensable task in human-centered research and has made great progress in recent years with the development of deep neural networks. However, the potential vulnerability and anti-interference ability have still not been fully explored. To bridge this gap, this paper proposes the first adversarial attack and defense framework for pedestrian attribute recognition. Specifically, we exploit both global- and patch-level attacks on the pedestrian images, based on the pre-trained CLIP-based PAR framework. It first divides the input pedestrian image into non-overlapping patches and embeds them into feature embeddings using a projection layer. Meanwhile, the attribute set is expanded into sentences using prompts and embedded into attribute features using a pre-trained CLIP text encoder. A multi-modal Transformer is adopted to fuse the obtained…
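The two input-side steps the abstract describes (cutting the image into non-overlapping patches that a projection layer embeds, and expanding the attribute set into prompt sentences) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the single-channel toy image, the patch size, the projection weights, the prompt template, and the attribute names are all hypothetical, and the CLIP text encoder and multi-modal Transformer are left out.

```python
def split_into_patches(image, patch):
    """Split an H x W image (list of rows) into non-overlapping patch x patch
    tiles, each flattened to a list, returned in row-major tile order."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tile = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            patches.append(tile)
    return patches


def project(patches, weight):
    """Linear projection layer without bias: weight holds d_out rows of
    length d_in, mapping each flattened patch to a d_out-dim embedding."""
    return [[sum(x * w for x, w in zip(p, row)) for row in weight]
            for p in patches]


def attributes_to_prompts(attributes, template="a pedestrian with {}"):
    """Expand attribute names into sentences. The template is an assumption;
    the prompts used in the paper are not given in this summary."""
    return [template.format(a) for a in attributes]


# Toy 4 x 4 single-channel image with pixel values 0..15 (hypothetical).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = split_into_patches(image, patch=2)

# Hypothetical weights: one output dimension that sums the patch pixels.
embeddings = project(patches, weight=[[1, 1, 1, 1]])

# Hypothetical attribute names, for illustration only.
prompts = attributes_to_prompts(["long hair", "backpack"])
```

In the pipeline the abstract outlines, the patch embeddings and the text features that CLIP produces from such prompts would then be passed to the multi-modal Transformer for fusion.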
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Linear Layer · Dense Connections · ADaptive gradient method with the OPTimal convergence rate · Contrastive Language-Image Pre-training · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Multi-Head Attention
