Architectural Backdoors in Neural Networks

Mikel Bober-Irizar; Ilia Shumailov; Yiren Zhao; Robert Mullins; Nicolas Papernot

arXiv:2206.07840·cs.LG·June 17, 2022·1 cites

TL;DR

This paper introduces a novel class of backdoor attacks in neural networks that embed malicious functionalities directly into model architectures, posing a significant threat due to their persistence even after retraining.

Contribution

The paper formalizes the concept of architectural backdoors and evaluates their feasibility, threat level, and potential defenses across various computer vision benchmarks.

Findings

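01 Architectural backdoors can survive complete retraining.

02 They are simple to deploy, for instance by publishing open-source code for a backdoored model architecture that others reuse unknowingly.

03 Models remain vulnerable across different training settings.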

Abstract

Machine learning is vulnerable to adversarial manipulation. Previous literature has demonstrated that at the training stage attackers can manipulate data and data sampling procedures to control model behaviour. A common attack goal is to plant backdoors, i.e. to force the victim model to learn to recognise a trigger known only by the adversary. In this paper, we introduce a new class of backdoor attacks that hide inside model architectures, i.e. in the inductive bias of the functions used to train. These backdoors are simple to implement, for instance by publishing open-source code for a backdoored model architecture that others will reuse unknowingly. We demonstrate that model architectural backdoors represent a real threat and, unlike other approaches, can survive a complete re-training from scratch. We formalise the main construction principles behind architectural backdoors, such as a…
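To make the idea of a backdoor that lives in the architecture rather than in the weights more concrete, the toy sketch below may help. It is not the construction from the paper (the abstract above is truncated before those details). Every name in it (`trigger_score`, `init_weights`, `forward`, `predict`, `TARGET_CLASS`), the checkerboard trigger, and the constant 1000.0 is a hypothetical choice made for illustration. The sketch assumes a tiny linear classifier over a 4x4 image with pixel values in [0, 1], plus a side path that has no trainable parameters.

```python
import random

TARGET_CLASS = 0  # hypothetical attacker-chosen output class


def trigger_score(image):
    """Parameter-free detector (illustrative assumption).

    Returns 1.0 if the top-left 2x2 patch is an exact checkerboard, else 0.0.
    Nothing here is learned, so training cannot change it.
    """
    patch = (image[0][0], image[0][1], image[1][0], image[1][1])
    return 1.0 if patch == (1.0, 0.0, 0.0, 1.0) else 0.0


def init_weights(num_classes, num_pixels, seed):
    """Stand-in for (re-)training: draws a fresh set of weights in [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(num_pixels)]
            for _ in range(num_classes)]


def forward(weights, image):
    """Linear classifier plus a weight-free side path defined in the 'architecture'."""
    flat = [pixel for row in image for pixel in row]
    logits = [sum(w * x for w, x in zip(row, flat)) for row in weights]
    # Side path: depends only on the input, never on the weights.
    logits[TARGET_CLASS] += 1000.0 * trigger_score(image)
    return logits


def predict(weights, image):
    """Index of the largest logit."""
    logits = forward(weights, image)
    return max(range(len(logits)), key=logits.__getitem__)


# A 4x4 image carrying the hypothetical trigger in its top-left corner.
triggered = [
    [1.0, 0.0, 0.3, 0.7],
    [0.0, 1.0, 0.2, 0.9],
    [0.5, 0.4, 0.6, 0.1],
    [0.8, 0.3, 0.2, 0.5],
]

# Same image with the trigger patch overwritten.
clean = [row[:] for row in triggered]
clean[0][0] = 0.5

# Redrawing the weights (a crude stand-in for retraining from scratch)
# leaves the side path untouched.
for seed in range(5):
    weights = init_weights(num_classes=3, num_pixels=16, seed=seed)
    print(seed, predict(weights, triggered), trigger_score(clean))
```

In this toy setting the weight-dependent logits are bounded by 16 in magnitude (16 pixels in [0, 1], weights in [-1, 1]), so the side path always dominates when the trigger is present, whichever weights are drawn. That is the intuition behind the claim that such a backdoor can survive retraining: the malicious behaviour sits in code that training never updates.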

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)