Theory of Minimal Weight Perturbations in Deep Networks and its Applications for Low-Rank Activated Backdoor Attacks
Bethan Evans, Jared Tanner

TL;DR
This paper derives exact formulas for minimal-norm weight perturbations in deep neural networks, compares them with Lipschitz-based robustness guarantees, and applies the results to precision-modification-activated backdoor attacks, including activation via low-rank compression.
Contribution
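It provides exact single-layer formulas for minimal weight perturbations, contrasts them with multi-layer Lipschitz-constant-based robustness guarantees, and establishes provable compression thresholds for backdoor attacks alongside an empirical demonstration of low-rank activation.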
Findings
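Exact single-layer formulas are derived for the minimal-norm weight perturbation that achieves a specified change in output.
The single-layer exact formulas and the multi-layer Lipschitz-constant-based guarantees are observed to be of the same order.
Below a provable compression threshold, precision-modification-activated backdoor attacks cannot succeed.
Empirically, low-rank compression can reliably activate latent backdoors while full-precision accuracy is preserved.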
Abstract
The minimal-norm weight perturbations of DNNs required to achieve a specified change in output are derived, and the factors determining their size are discussed. These single-layer exact formulae are contrasted with more generic multi-layer Lipschitz-constant-based robustness guarantees; both are observed to be of the same order, which indicates similar efficacy in their guarantees. These results are applied to precision-modification-activated backdoor attacks, establishing provable compression thresholds below which such attacks cannot succeed, and it is shown empirically that low-rank compression can reliably activate latent backdoors while preserving full-precision accuracy. These expressions reveal how back-propagated margins govern layer-wise sensitivity and provide certifiable guarantees on the smallest parameter updates consistent with a desired output shift.
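To make the idea of a minimal-norm weight perturbation concrete, here is a small sketch for the simplest case: a single linear map y = W x with no activation. It is not the paper's formula, which also covers deep networks and back-propagated margins. It relies only on the standard least-norm result that the smallest Frobenius-norm dW satisfying dW x = delta is the rank-one matrix delta x^T / ||x||^2, whose norm is ||delta|| / ||x||. The input x, the target shift delta, and all function names are hypothetical and chosen for illustration.

```python
import math


def minimal_perturbation(x, delta):
    """Smallest Frobenius-norm dW with dW @ x == delta for one linear layer.

    Illustrative assumption: the layer is y = W x, so the output change
    caused by a weight change dW is exactly dW @ x.
    """
    sq_norm = sum(v * v for v in x)
    if sq_norm == 0.0:
        raise ValueError("x must be non-zero: a zero input cannot shift the output")
    # Rank-one solution: outer product of delta and x, scaled by 1 / ||x||^2.
    return [[d * v / sq_norm for v in x] for d in delta]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(m * m for row in matrix for m in row))


x = [1.0, 2.0, 2.0]    # hypothetical layer input, ||x|| = 3
delta = [3.0, -6.0]    # hypothetical desired change in the layer output

dW = minimal_perturbation(x, delta)
shift = matvec(dW, x)          # reproduces delta up to rounding
size = frobenius_norm(dW)      # equals ||delta|| / ||x||
```

In this toy setting, any weight change with Frobenius norm below ||delta|| / ||x|| cannot move the output for input x by delta, since ||dW x|| <= ||dW||_F ||x||. That is the same flavor of argument as a compression threshold: a compressed layer whose distance from the original weights stays under that bound cannot produce the shift.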
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Smart Grid Security and Resilience · Cryptographic Implementations and Security
