Constrained Decoding for Secure Code Generation
Yanjun Fu, Ethan Baker, Yu Ding, Yizheng Chen

TL;DR
This paper introduces constrained decoding techniques and new evaluation metrics to improve the security and correctness of code generated by large language models, addressing a critical gap in secure code generation.
Contribution
It proposes constrained decoding methods for secure code generation and introduces the CodeGuard+ benchmark, with new metrics that evaluate both security and correctness.
Findings
Constrained decoding outperforms prefix tuning in security without sacrificing correctness.
Different decoding methods significantly impact the security of Code LLMs.
Constrained decoding surpasses GPT-4 in security performance.
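To make the idea of constrained decoding concrete, here is a minimal sketch of greedy decoding under a negative constraint (a set of tokens that must never be emitted). This is an illustration under stated assumptions, not the paper's method: `toy_model`, its tiny vocabulary, and the `banned` parameter are hypothetical stand-ins for a real Code LLM and a real security constraint.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence

Scores = Dict[str, float]


def toy_model(prefix: Sequence[str]) -> Scores:
    """Hypothetical stand-in for a Code LLM: log-scores for the next token."""
    last = prefix[-1] if prefix else None
    if last is None:
        # The unsafe call is the model's favourite first token.
        return {"strcpy": -0.1, "strncpy": -2.0, "<eos>": -5.0}
    if last == "strcpy":
        return {"(dst,src);": -0.1, "<eos>": -3.0}
    if last == "strncpy":
        return {"(dst,src,n);": -0.1, "<eos>": -3.0}
    return {"<eos>": -0.1, "strcpy": -3.0}


def greedy_decode(
    model: Callable[[Sequence[str]], Scores],
    banned: FrozenSet[str] = frozenset(),
    max_len: int = 8,
) -> List[str]:
    """Greedy decoding that masks out banned tokens at every step."""
    out: List[str] = []
    for _ in range(max_len):
        scores = model(out)
        # Apply the constraint: drop any candidate that is banned.
        allowed = {tok: s for tok, s in scores.items() if tok not in banned}
        if not allowed:
            break
        token = max(allowed, key=allowed.get)
        if token == "<eos>":
            break
        out.append(token)
    return out


unconstrained = greedy_decode(toy_model)
constrained = greedy_decode(toy_model, banned=frozenset({"strcpy"}))
print(unconstrained)  # ['strcpy', '(dst,src);']
print(constrained)    # ['strncpy', '(dst,src,n);']
```

The model weights are untouched in this sketch; only the choice of the next token changes. That is the practical difference from a training-time defense such as prefix tuning.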
Abstract
Code Large Language Models (Code LLMs) have been increasingly used by developers to boost productivity, but they often generate vulnerable code. Thus, there is an urgent need to ensure that code generated by Code LLMs is correct and secure. Previous research has primarily focused on generating secure code, overlooking the fact that secure code also needs to be correct. This oversight can lead to a false sense of security. Currently, the community lacks a method to measure actual progress in this area, and we need solutions that address both security and correctness of code generation. This paper introduces a new benchmark, CodeGuard+, along with two new metrics, to measure Code LLMs' ability to generate both secure and correct code. Using our new evaluation methods, we show that the state-of-the-art defense technique, prefix tuning, may not be as strong as previously believed, since…
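The abstract as shown here does not define the two new metrics, so the following is only a sketch of how a joint security-and-correctness score could be computed. It assumes each generated sample has been labelled with two booleans (passes unit tests, passes a security check) and reuses the standard unbiased pass@k estimator; the function names and the sample data are hypothetical.

```python
from math import comb
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: chance that at least one of k samples,
    drawn without replacement from n, is among the c successful ones."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k draw must contain a success
    return 1.0 - comb(n - c, k) / comb(n, k)


def joint_at_k(results: List[Tuple[bool, bool]], k: int) -> float:
    """Assumed joint metric: a sample counts only if it is correct AND secure."""
    n = len(results)
    c = sum(1 for passed_tests, is_secure in results if passed_tests and is_secure)
    return pass_at_k(n, c, k)


# Hypothetical labels for four samples: (passed_tests, is_secure).
samples = [(True, True), (True, False), (False, True), (False, False)]

secure_only = pass_at_k(len(samples), sum(s for _, s in samples), 1)
joint = joint_at_k(samples, 1)
print(secure_only)  # 0.5
print(joint)        # 0.25
```

In this made-up data, half of the samples are secure, but only a quarter are both secure and correct. A security-only score would therefore overstate progress, which is the "false sense of security" the abstract warns about.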
Taxonomy
Topics: Cryptographic Implementations and Security · Coding theory and cryptography · Advanced Malware Detection Techniques
Methods: Attention Is All You Need · Softmax · Layer Normalization · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Multi-Head Attention · Position-Wise Feed-Forward Layer
