GRILL: Restoring Gradient Signal in Ill-Conditioned Layers for More Effective Adversarial Attacks on Autoencoders
Chethan Krishnamurthy Ramanaik, Arjun Roy, Tobias Callies, Eirini Ntoutsi

TL;DR
This paper introduces GRILL, a method that restores gradient signals in ill-conditioned autoencoder layers, enabling more effective adversarial attacks and revealing greater vulnerabilities in autoencoders and similar architectures.
Contribution
GRILL is a novel technique that addresses vanishing gradients in ill-conditioned layers, improving the strength of adversarial attacks on autoencoders and related models.
Findings
GRILL significantly enhances attack effectiveness on autoencoders.
Autoencoders are more vulnerable to adversarial attacks than previously understood.
Modern encoder-decoder architectures exhibit similar susceptibilities.
Abstract
Adversarial robustness of deep autoencoders (AEs) has received less attention than that of discriminative models, although their compressed latent representations induce ill-conditioned mappings that can amplify small input perturbations and destabilize reconstructions. Existing white-box attacks for AEs, which optimize norm-bounded adversarial perturbations to maximize output damage, often stop at suboptimal attacks. We observe that this limitation stems from vanishing adversarial loss gradients during backpropagation through ill-conditioned layers, caused by near-zero singular values in their Jacobians. To address this issue, we introduce GRILL, a technique that locally restores gradient signals in ill-conditioned layers, enabling more effective norm-bounded attacks. Through extensive experiments across multiple AE architectures, considering both sample-specific and universal attacks…
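The abstract attributes weak attacks to gradients that vanish when backpropagated through layers whose Jacobians have near-zero singular values. The following is a minimal NumPy sketch of that effect only; it is not the GRILL method, whose details are not given here. For a linear layer y = Wx the Jacobian is W itself, and the gradient passed to the input is W^T g, so any part of the upstream gradient g lying along a left singular direction with singular value sigma is scaled by sigma. The layer size, the singular values, and the helper name `layer_with_singular_values` are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_with_singular_values(sigmas, rng):
    """Hypothetical helper: build W = U diag(sigmas) V^T with random orthogonal U, V."""
    n = len(sigmas)
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return U @ np.diag(sigmas) @ V.T, U

n = 8
sig_well = np.ones(n)                            # all singular values equal to 1
sig_ill = np.array([1.0] * 4 + [1e-6] * 4)       # assumed: four near-zero singular values

W_well, _ = layer_with_singular_values(sig_well, rng)
W_ill, U_ill = layer_with_singular_values(sig_ill, rng)

# Upstream loss gradient that lies entirely in the weak left singular directions.
g = U_ill[:, 4:] @ rng.standard_normal(4)

# Backpropagation through y = W x maps the gradient g to W^T g.
ratio_well = np.linalg.norm(W_well.T @ g) / np.linalg.norm(g)
ratio_ill = np.linalg.norm(W_ill.T @ g) / np.linalg.norm(g)

cond_ill = np.linalg.cond(W_ill)                 # ratio of largest to smallest singular value

print(f"gradient norm ratio, well-conditioned layer: {ratio_well:.3e}")
print(f"gradient norm ratio, ill-conditioned layer:  {ratio_ill:.3e}")
print(f"condition number of ill-conditioned layer:   {cond_ill:.3e}")
```

In this toy setting the well-conditioned layer passes the gradient through with its norm unchanged, while the ill-conditioned layer shrinks it by the small singular value, so a gradient-based attack would receive almost no signal in those directions.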
Taxonomy
Methods: Autoencoders
