Backdoors in Neural Models of Source Code

Goutham Ramakrishnan; Aws Albarghouthi

arXiv:2006.06841·cs.LG·December 20, 2022

Backdoors in Neural Models of Source Code

Goutham Ramakrishnan, Aws Albarghouthi

PDF

1 Repo

TL;DR

This paper explores the vulnerability of neural models for source code to backdoor attacks, demonstrating how they can be inserted, detected, and eliminated across various architectures and programming languages.

Contribution

It defines backdoor classes for source code models, adapts spectral detection algorithms, and provides a comprehensive evaluation of backdoor injection and removal methods.

Findings

01

Backdoors can be easily injected into source code models.

02

Spectral signatures enable detection of poisoned data.

03

Backdoors can be effectively eliminated across architectures and languages.

Abstract

Deep neural networks are vulnerable to a range of adversaries. A particularly pernicious class of vulnerabilities are backdoors, where model predictions diverge in the presence of subtle triggers in inputs. An attacker can implant a backdoor by poisoning the training data to yield a desired target prediction on triggered inputs. We study backdoors in the context of deep-learning for source code. (1) We define a range of backdoor classes for source-code tasks and show how to poison a dataset to install such backdoors. (2) We adapt and improve recent algorithms from robust statistics for our setting, showing that backdoors leave a spectral signature in the learned representation of source code, thus enabling detection of poisoned data. (3) We conduct a thorough evaluation on different architectures and languages, showing the ease of injecting backdoors and our ability to eliminate them.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

goutham7r/backdoors-for-code
pytorch

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.