Security of Language Models for Code: A Systematic Literature Review

Yuchen Chen, Weisong Sun, Chunrong Fang, Zhenpeng Chen, Yifei Ge, Tingxu Han, Quanjun Zhang, Yang Liu, Zhenyu Chen, and Baowen Xu

arXiv:2410.15631 · cs.SE · May 20, 2025 · 3 cites

TL;DR

This paper systematically reviews 67 studies on the security vulnerabilities of language models for code, highlighting attack and defense strategies, datasets, tools, and future research directions.

Contribution

It provides the first comprehensive survey of security issues in CodeLMs, organizing existing research and identifying key challenges and open problems.

Findings

01 Attack strategies and defense mechanisms are evolving rapidly.

02 Common datasets and evaluation metrics are identified.

03 Open-source tools for security assessment are highlighted.

Abstract

Language models for code (CodeLMs) have emerged as powerful tools for code-related tasks, outperforming traditional methods and standard machine learning approaches. However, these models are susceptible to security vulnerabilities, drawing increasing research attention from domains such as software engineering, artificial intelligence, and cybersecurity. Despite the growing body of research focused on the security of CodeLMs, a comprehensive survey in this area remains absent. To address this gap, we systematically review 67 relevant papers, organizing them based on attack and defense strategies. Furthermore, we provide an overview of commonly used language models, datasets, and evaluation metrics, and highlight open-source tools and promising directions for future research in securing CodeLMs.

Code & Models

Repositories

wssun/tise-lm4code-security · pytorch · Official

Taxonomy

Topics: Digital and Cyber Forensics · Advanced Malware Detection Techniques · Web Application Security Vulnerabilities