TriQDef: Disrupting Semantic and Gradient Alignment to Prevent Adversarial Patch Transferability in Quantized Neural Networks

Amira Guesmi; Bassem Ouni; Muhammad Shafique

arXiv:2508.12132·cs.CV·August 19, 2025

Open Access · 3 Reviews

TL;DR

TriQDef is a defense framework for quantized neural networks that reduces the transferability of patch-based adversarial attacks by disrupting the alignment of semantic and gradient representations across quantization levels.

Contribution

The paper introduces a tri-level, quantization-aware defense that combines feature disalignment, gradient dissonance, and shared training to improve robustness against adversarial patches transferred across bit-widths.

Findings

01

Reduces attack success rate by over 40% on unseen quantization settings.

02

Maintains high accuracy on clean data.

03

Effective across CIFAR-10 and ImageNet datasets.
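The findings above are stated in terms of attack success rate (ASR). The following is a minimal sketch of how such a metric is often computed, assuming an untargeted attack and the convention of counting only inputs the model classifies correctly without the patch. This page does not say which definition the paper uses, and the function name `attack_success_rate` and the toy labels are hypothetical.

```python
def attack_success_rate(labels, clean_preds, patched_preds):
    """Untargeted ASR (assumed definition, not taken from the paper).

    Among inputs that are classified correctly when clean, return the
    fraction that are misclassified once the patch is applied.
    """
    # Keep only samples the model gets right without the patch.
    eligible = [
        (y, p)
        for y, c, p in zip(labels, clean_preds, patched_preds)
        if c == y
    ]
    if not eligible:
        return 0.0
    fooled = sum(1 for y, p in eligible if p != y)
    return fooled / len(eligible)


# Toy data: four samples, three of them correct when clean.
labels = [0, 1, 2, 3]
clean_preds = [0, 1, 2, 0]
patched_preds = [0, 5, 5, 0]
asr = attack_success_rate(labels, clean_preds, patched_preds)
```

For a cross-bit transfer evaluation, `patched_preds` would come from a target model at a different bit-width than the surrogate used to craft the patch.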

Abstract

Quantized Neural Networks (QNNs) are increasingly deployed in edge and resource-constrained environments due to their efficiency in computation and memory usage. While quantization has been shown to distort the gradient landscape and weaken conventional pixel-level attacks, it provides limited robustness against patch-based adversarial attacks: localized, high-saliency perturbations that remain surprisingly transferable across bit-widths. Existing defenses either overfit to fixed quantization settings or fail to address this cross-bit generalization vulnerability. We introduce TriQDef, a tri-level quantization-aware defense framework designed to disrupt the transferability of patch-based adversarial attacks across QNNs. TriQDef consists of: (1) a Feature Disalignment Penalty (FDP) that enforces semantic inconsistency by penalizing perceptual similarity in intermediate representations; (2) a Gradient…
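The idea behind the FDP, penalizing similarity between intermediate representations at different bit-widths, can be illustrated with a minimal sketch. This is not the paper's formulation: cosine similarity stands in for the perceptual similarity measure, which this page does not specify, and symmetric uniform fake quantization is an assumed quantizer. The names `fake_quantize`, `cosine_similarity`, and `disalignment_penalty` are hypothetical.

```python
import math
from itertools import combinations


def fake_quantize(values, bits):
    """Symmetric uniform fake quantization (assumed quantizer).

    Rounds each value onto a signed `bits`-bit grid, then maps it back
    to a float, so the output has the same scale as the input.
    """
    if bits < 2:
        raise ValueError("bits must be >= 2 for a signed symmetric grid")
    levels = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0.0 for _ in values]
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def disalignment_penalty(features_by_bits):
    """Mean pairwise cosine similarity of features across bit-widths.

    Stand-in for a feature-similarity penalty: a lower value means the
    representations at different bit-widths are less aligned.
    """
    pairs = list(combinations(sorted(features_by_bits), 2))
    if not pairs:
        raise ValueError("need features for at least two bit-widths")
    total = sum(
        cosine_similarity(features_by_bits[a], features_by_bits[b])
        for a, b in pairs
    )
    return total / len(pairs)


# Toy "intermediate features" quantized at three bit-widths.
features = [0.9, -0.3, 0.05, 0.6]
by_bits = {bits: fake_quantize(features, bits) for bits in (2, 4, 8)}
penalty = disalignment_penalty(by_bits)
```

In a training loop, a term like `penalty` would be added to the task loss with a weighting coefficient, so that the optimizer trades clean accuracy against cross-bit similarity. The weighting and the layers it applies to are design choices not covered by this sketch.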

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. This paper addresses the critical and understudied problem of adversarial patch transferability in quantized neural networks. 2. The proposed tri-level defense is comprehensive. 3. The experiments covering various quantization bit widths, attack methods, and model architectures reflect the generalization ability of TriQDef in certain settings.

Weaknesses

1. Lack of rigorous mathematical modeling or theoretical boundary analysis on the generalizability of the proposed defense (Q1). 2. Incomplete analysis of FDP's side-effects on semantic integrity. It reports a minor accuracy drop but fails to investigate how this disalignment affects fundamental recognition capabilities (Q2-Q3). 3. Evaluation and scalability concerns (Q4-Q7).

Reviewer 02 · Rating 6 · Confidence 2

Strengths

a) The problem is well-motivated: “quantization ≠ free robustness” is a message that’s worth saying explicitly for patch attacks. b) The defense is mechanism-driven, not just “train on more patched images”: it tries to break cross-bit semantic/gradient alignment, which is a reasonable explanation for transfer. c) The idea of using perceptual cues (edge IoU, HOG-like descriptors) on features and gradients is a nice twist on older adversarial detection/defense lines that only used feature dist

Weaknesses

a) Parts of the idea are close in spirit to earlier “detect / separate / de-align” or “make internal representations less exploitable” work, but the paper doesn’t cite that line clearly. For example: MagNet (Meng & Chen, CCS’17), feature squeezing (Xu et al., NDSS’18), and transferability analyses (Tramèr et al., 2017/2018) all talk about representation/gradient similarity as a transfer channel. b) The method is engineered around patch attacks. It’s not mentioned how much of TriQDef would stil

Reviewer 03 · Rating 2 · Confidence 4

Strengths

- Adversarial transferability across QNNs is an important topic. - This paper thoroughly investigates the current research frontier and proposes two novel techniques (FDP and GPDP) to address the problem. - The technical novelty of the proposed defense is sufficient.

Weaknesses

The text font size seems to noticeably decrease after Equation 2. The reviewer is not sure if this violates the ICLR policy. There are major problems in the motivation experiments and baseline comparisons. 1. Table 1 is somewhat confusing: (1) Which model architecture is used for the surrogate model? (2) Which quantization method is used? (3) The results seem not to fully support the conclusion that "Adversarial Patches Transfer Effectively Across Bit-Widths." The ASR gradually decreases as t

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications