RESCUE: Retrieval Augmented Secure Code Generation
Jiahao Shi, Tianyi Zhang

TL;DR
RESCUE is a novel retrieval-augmented framework that enhances secure code generation by constructing a hierarchical security knowledge base and employing multi-faceted retrieval, significantly improving security performance across multiple benchmarks.
Contribution
It introduces a hybrid knowledge base construction and hierarchical retrieval method, addressing noise and security semantics in secure code generation with LLMs.
Findings
RESCUE improves SecurePass@1 by an average of 4.8 points.
Achieves state-of-the-art performance on security benchmarks.
Validated through extensive ablation studies.
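As a rough illustration of the headline metric, the sketch below computes a SecurePass@1-style score, assuming it is the percentage of tasks whose single generated sample both passes the functional tests and raises no security finding. The `Sample` record and its field names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One generated solution per task (hypothetical record layout)."""
    task_id: str
    passes_tests: bool  # assumption: functional unit tests passed
    is_secure: bool     # assumption: the security analyzer reported no finding


def secure_pass_at_1(samples):
    """Percentage of samples that are both functionally correct and secure."""
    if not samples:
        raise ValueError("no samples to score")
    hits = sum(1 for s in samples if s.passes_tests and s.is_secure)
    return 100.0 * hits / len(samples)


samples = [
    Sample("t1", True, True),    # correct and secure: counts
    Sample("t2", True, False),   # correct but vulnerable: does not count
    Sample("t3", False, True),   # secure but incorrect: does not count
    Sample("t4", True, True),    # correct and secure: counts
]
print(secure_pass_at_1(samples))  # 50.0
```

Under this reading, an improvement of 4.8 points is a difference between two such percentages.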
Abstract
Despite recent advances, Large Language Models (LLMs) still generate vulnerable code. Retrieval-Augmented Generation (RAG) has the potential to enhance LLMs for secure code generation by incorporating external security knowledge. However, the conventional RAG design struggles with the noise of raw security-related documents, and existing retrieval methods overlook the significant security semantics implicitly embedded in task descriptions. To address these issues, we propose RESCUE, a new RAG framework for secure code generation with two key innovations. First, we propose a hybrid knowledge base construction method that combines LLM-assisted cluster-then-summarize distillation with program slicing, producing both high-level security guidelines and concise, security-focused code examples. Second, we design a hierarchical multi-faceted retrieval that traverses the constructed…
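The two-level idea in the abstract (high-level guidelines, each backed by concise code examples) can be sketched as a toy retriever. This is only an illustration under stated assumptions, not the authors' method: the `Guideline` class, the `retrieve` function, the sample knowledge base, and the Jaccard token-overlap score are all hypothetical stand-ins, and the paper's actual scoring and traversal are not described in this excerpt.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Guideline:
    """A high-level security guideline with attached code examples (hypothetical)."""
    text: str
    examples: list = field(default_factory=list)


def tokens(s):
    """Lowercased word tokens of a string."""
    return set(re.findall(r"[a-z0-9_]+", s.lower()))


def overlap(a, b):
    """Jaccard token overlap; a stand-in for whatever scoring the paper uses."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(task, kb, top_guidelines=1, top_examples=1):
    """First rank guidelines against the task, then rank examples under each hit."""
    ranked = sorted(kb, key=lambda g: overlap(task, g.text), reverse=True)
    results = []
    for g in ranked[:top_guidelines]:
        exs = sorted(g.examples, key=lambda e: overlap(task, e), reverse=True)
        results.append((g.text, exs[:top_examples]))
    return results


# Hypothetical two-entry knowledge base.
KB = [
    Guideline(
        "Use parameterized queries when building a sql query from user input",
        ["cursor.execute('SELECT * FROM users WHERE name = ?', (name,))"],
    ),
    Guideline(
        "Validate a file path before opening to prevent path traversal",
        ["if not os.path.realpath(path).startswith(base_dir): raise ValueError(path)"],
    ),
]

hits = retrieve("Write a function that runs a sql query with user input", KB)
for guideline, examples in hits:
    print(guideline)
    for example in examples:
        print("  example:", example)
```

The retrieved guideline and example would then be placed in the LLM prompt alongside the task description. A real system would likely replace the token overlap with embedding similarity or security-specific signals.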
Peer Reviews
Decision·ICLR 2026 Poster
- Core approach is technically sound with clear motivation
- Comprehensive experiments (4 benchmarks, 6 LLMs, 5 baselines)
- Ablation studies validating the main components
- The paper is generally well-organized structurally and is easy to understand
- The novelty of the contribution seems to lie in combining existing components, such as a cluster-and-summarize knowledge base with hierarchical retrieval. The method involves too many moving components, like API pattern and vulnerability cause analysis. The gains do not seem significant relative to the complexity of the system, which would limit the adoption of such an approach for secure code generation.
- The method involves many different hyperparameters like hop limit, thresholds for api and v
- RESCUE consistently outperforms all existing methods across multiple benchmarks in terms of SecurePass@1, a comprehensive metric that jointly evaluates both functionality and security. This demonstrates the framework's ability to generate code that is not only correct but also resistant to common vulnerabilities, representing a significant advancement over prior approaches that often sacrifice one dimension for the other.
- The paper proposes a novel and systematic method for building a refine
- The evaluation framework relies on static security analysis tools, which are known to produce both false positives and false negatives. This is a key limitation.
- RESCUE introduces some additional time cost to achieve its security improvements. Although the paper suggests the overhead is acceptable and can be further reduced through engineering optimizations, the additional costs are mostly unclear
- This paper addresses an important question
- The results show substantial improvement over the baselines
- The paper designs a comprehensive retrieval approach for related vulnerabilities
Overall, the paper tries to address an important security problem, but some concerns about the evaluation setting remain. 1.1 Evaluation Benchmark. The main concern is that the evaluation benchmarks are programming-contest benchmarks (HE, BCB, LCB). These benchmarks are mainly self-contained and mostly function-level. Avoiding vulnerabilities in these benchmarks is not convincing, and this paper can be substantially improved after including real-world code-generation/-completio
Taxonomy
Topics: Advanced Malware Detection Techniques · Software Engineering Research · Digital and Cyber Forensics
