SABER: Model-agnostic Backdoor Attack on Chain-of-Thought in Neural Code Generation

Naizhu Jin; Zhong Li; Yinggang Guo; Chao Su; Tian Zhang; Qingkai Zeng

arXiv:2412.05829·cs.SE·March 11, 2025

TL;DR

This paper uncovers security vulnerabilities in Chain-of-Thought (CoT) models for code generation. It demonstrates SABER, a novel backdoor attack method that is both highly effective and stealthy, raising concerns about model safety.

Contribution

Introduces SABER, a model-agnostic backdoor attack that leverages self-attention, and uses it to demonstrate that CoT models for code generation are susceptible to data poisoning.

Findings

01 SABER achieves an attack success rate (ASR) of 80.95% on HumanEval-CoT.

02 SABER bypasses 61.90% of automated detection methods.

03 Backdoors can be effectively injected into CoT models, compromising security.
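The ASR figure above is an attack success rate: the share of trigger-carrying inputs for which the model emits the attacker's chosen output. The sketch below shows how such a percentage is commonly computed. It is a minimal illustration and not the paper's evaluation code; `attack_success_rate`, `toy_model`, the trigger string `xx_trigger`, and the marker `MALICIOUS_COT` are all hypothetical names introduced here.

```python
from typing import Callable, Sequence


def attack_success_rate(
    model: Callable[[str], str],
    triggered_inputs: Sequence[str],
    is_backdoor_output: Callable[[str], bool],
) -> float:
    """Return the percentage of triggered inputs that yield the backdoor output."""
    if not triggered_inputs:
        raise ValueError("need at least one triggered input")
    hits = sum(1 for prompt in triggered_inputs if is_backdoor_output(model(prompt)))
    return 100.0 * hits / len(triggered_inputs)


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for a poisoned CoT model: the backdoor fires
    # only when the exact lowercase trigger appears in the prompt.
    return "MALICIOUS_COT" if "xx_trigger" in prompt else "benign CoT steps"


# Four inputs that an attacker has tried to trigger; the third uses the
# wrong casing, so the toy backdoor does not fire on it.
triggered = [
    "sort a list xx_trigger",
    "xx_trigger reverse a string",
    "sum the numbers XX_TRIGGER",
    "xx_trigger parse a date",
]

asr = attack_success_rate(toy_model, triggered, lambda out: out == "MALICIOUS_COT")
print(asr)  # 75.0 (3 of 4 triggered inputs activate the toy backdoor)
```

In a real evaluation the predicate passed as `is_backdoor_output` would check for the specific malicious content chosen by the attack, and the inputs would come from the poisoned test split.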

Abstract

Recent studies have proposed integrating Chain-of-Thought (CoT) reasoning, a step-by-step approach that breaks down complex programming tasks into manageable sub-problems, to further enhance the reliability of Code Language Models (CLMs) in generating code. Advances in this area have introduced CoT models, specifically designed to integrate CoT reasoning effectively into language models, achieving notable improvements in code generation. Despite these advancements, the security of CoT models has not been systematically studied. In this study, we aim to fill this gap by investigating the vulnerability of CoT models to backdoor injection in code generation tasks. To this end, we propose SABER (Self-Attention-BasEd backdooR), a model-agnostic backdoor attack method based on the self-attention mechanism. SABER begins by selecting a malicious output as the backdoor using code mutation…

Code & Models

Repositories

WolfgangJin/CoTbackdoor_SABER
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques