SoK: Understanding (New) Security Issues Across AI4Code Use Cases
Qilong Wu, Taoran Li, Tianyang Zhou, Varun Chandrasekaran

TL;DR
This paper surveys security challenges in AI4Code systems, highlighting vulnerabilities, dataset biases, and robustness issues, and proposes future directions for integrating security into AI-driven software engineering tools.
Contribution
It provides a comprehensive analysis of security issues across AI4Code applications and suggests practical paths for embedding security into AI4Code development and evaluation.
Findings
Persistent insecure code generation patterns
Vulnerability detection is brittle to semantic-preserving attacks
Code translation can improve security when properly leveraged
Abstract
AI-for-Code (AI4Code) systems are reshaping software engineering, with tools like GitHub Copilot accelerating code generation, translation, and vulnerability detection. Alongside these advances, however, security risks remain pervasive: insecure outputs, biased benchmarks, and susceptibility to adversarial manipulation undermine their reliability. This SoK surveys the landscape of AI4Code security across three core applications, identifying recurring gaps: benchmark dominance by Python and toy problems, lack of standardized security datasets, data leakage in evaluation, and fragile adversarial robustness. A comparative study of six state-of-the-art models illustrates these challenges: insecure patterns persist in code generation, vulnerability detection is brittle to semantic-preserving attacks, fine-tuning often misaligns security objectives, and code translation yields uneven security…
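To make the phrase "semantic-preserving attacks" concrete, here is a minimal sketch of one such transformation: renaming identifiers so the program text changes while its behavior does not. This is an illustration only, not the attack evaluated in the paper. The `build_query` snippet, the `RenameIdentifiers` class, and the `v0`/`v1` names are all hypothetical, and the sketch assumes Python 3.9+ for `ast.unparse`.

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Rename variables and parameters according to a fixed mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        # Variable reads and writes inside the function body.
        if node.id in self.mapping:
            node.id = self.mapping[node.id]
        return node

    def visit_arg(self, node):
        # Function parameters.
        if node.arg in self.mapping:
            node.arg = self.mapping[node.arg]
        return node


def rename_identifiers(source, mapping):
    """Return `source` with identifiers renamed; behavior is unchanged."""
    tree = RenameIdentifiers(mapping).visit(ast.parse(source))
    return ast.unparse(tree)


# Hypothetical vulnerable snippet: SQL built by string concatenation.
ORIGINAL = '''
def build_query(user_input):
    query = "SELECT * FROM users WHERE name = '" + user_input + "'"
    return query
'''

# Same program with uninformative names (hypothetical mapping).
PERTURBED = rename_identifiers(ORIGINAL, {"user_input": "v0", "query": "v1"})


def run(source, arg):
    """Execute a snippet and call its build_query function."""
    namespace = {}
    exec(source, namespace)
    return namespace["build_query"](arg)
```

Both versions return the same string for the same input, so the injection flaw is still present in `PERTURBED`. A detector that flags `ORIGINAL` but not `PERTURBED` would be relying on surface cues such as variable names rather than on the data flow itself.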
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing
