GRILL: Restoring Gradient Signal in Ill-Conditioned Layers for More Effective Adversarial Attacks on Autoencoders

Chethan Krishnamurthy Ramanaik; Arjun Roy; Tobias Callies; Eirini Ntoutsi

arXiv:2505.03646 · cs.LG · February 24, 2026

TL;DR

This paper introduces GRILL, a method that restores gradient signals in ill-conditioned autoencoder layers, enabling more effective adversarial attacks and revealing greater vulnerabilities in autoencoders and similar architectures.

Contribution

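GRILL locally restores adversarial loss gradients that vanish during backpropagation through ill-conditioned layers, that is, layers whose Jacobians have near-zero singular values. This enables stronger norm-bounded adversarial attacks on autoencoders and related models.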

Findings

01. GRILL significantly enhances attack effectiveness on autoencoders.

02. Autoencoders are more vulnerable to adversarial attacks than previously understood.

03. Modern encoder-decoder architectures exhibit similar susceptibilities.

Abstract

Adversarial robustness of deep autoencoders (AEs) has received less attention than that of discriminative models, although their compressed latent representations induce ill-conditioned mappings that can amplify small input perturbations and destabilize reconstructions. Existing white-box attacks for AEs, which optimize norm-bounded adversarial perturbations to maximize output damage, often stop at suboptimal attacks. We observe that this limitation stems from vanishing adversarial loss gradients during backpropagation through ill-conditioned layers, caused by near-zero singular values in their Jacobians. To address this issue, we introduce GRILL, a technique that locally restores gradient signals in ill-conditioned layers, enabling more effective norm-bounded attacks. Through extensive experiments across multiple AE architectures, considering both sample-specific and universal attacks…
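
The vanishing-gradient observation in the abstract can be illustrated with a small numerical sketch. This is not the GRILL method itself, only a toy demonstration of why a Jacobian with a near-zero singular value attenuates the backpropagated gradient. The function names, the 3x3 size, and the singular values are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np


def jacobian_from_singular_values(singular_values, seed=0):
    """Build a square matrix J = U diag(s) V^T with prescribed singular values.

    Hypothetical helper: J stands in for the Jacobian of a single layer.
    """
    rng = np.random.default_rng(seed)
    n = len(singular_values)
    # QR of a random Gaussian matrix yields an orthogonal factor.
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    J = U @ np.diag(singular_values) @ V.T
    return J, U


def backpropagated_norm(singular_values, seed=0):
    """Norm of the gradient after one backward step through the layer.

    The upstream gradient is a unit vector aligned with the left singular
    direction of the LAST singular value passed in, so the result equals
    that singular value (up to floating-point error).
    """
    s = np.asarray(singular_values, dtype=float)
    J, U = jacobian_from_singular_values(s, seed)
    g_out = U[:, -1]      # unit-norm upstream gradient
    g_in = J.T @ g_out    # chain rule: gradient w.r.t. the layer input
    return float(np.linalg.norm(g_in))


# Well-conditioned layer: every singular value equals 1.
well = backpropagated_norm([1.0, 1.0, 1.0])

# Ill-conditioned layer: one singular value is close to zero.
ill = backpropagated_norm([1.0, 0.5, 1e-8])
```

In this sketch the well-conditioned layer passes the unit gradient through at essentially full strength, while the ill-conditioned layer shrinks it to roughly the size of its smallest singular value. An iterative attack that depends on this signal would then receive almost no update direction, which is consistent with the abstract's remark that existing attacks often stop at suboptimal solutions.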

Taxonomy

Methods: Autoencoders