Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision

Manisha Mukherjee; Vincent J. Hellendoorn

arXiv:2603.01494 · cs.SE · March 3, 2026

Open Access

TL;DR

This paper introduces a retrieval-augmented revision method for code LLMs that enhances safety and trustworthiness by surfacing relevant security risks and guiding code revision during inference, without retraining.

Contribution

The paper proposes a novel inference-time safety mechanism using retrieval-augmented generation to improve code security and interpretability in LLMs.

Findings

01 Improves the security of generated code compared with prompting alone

02 Introduces no new vulnerabilities, according to static analysis

03 Improves robustness to evolving security standards

Abstract

Large Language Models (LLMs) are increasingly deployed for code generation in high-stakes software development, yet their limited transparency in security reasoning and brittleness to evolving vulnerability patterns raise critical trustworthiness concerns. Models trained on static datasets cannot readily adapt to newly discovered vulnerabilities or changing security standards without retraining, leading to the repeated generation of unsafe code. We present a principled approach to trustworthy code generation by design that operates as an inference-time safety mechanism. Our approach employs retrieval-augmented generation to surface relevant security risks in generated code and retrieve related security discussions from a curated Stack Overflow knowledge base, which are then used to guide an LLM during code revision. This design emphasizes three aspects relevant to trustworthiness: (1)…
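The retrieve-then-revise loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the in-memory `KNOWLEDGE_BASE`, the keyword-overlap scoring, and the helper names `retrieve` and `build_revision_prompt` are hypothetical stand-ins, not the paper's retriever, knowledge base, or prompts. The sketch stops at building the revision prompt and does not call any model.

```python
import re
from collections import Counter

# Hypothetical stand-in for a curated knowledge base of security discussions.
KNOWLEDGE_BASE = [
    "Avoid building SQL queries with string formatting; "
    "use parameterized queries to prevent SQL injection.",
    "Never call eval on user input; it allows arbitrary code execution.",
    "Use secrets.token_hex instead of random for security tokens.",
]


def tokenize(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def overlap(a, b):
    """Size of the multiset intersection of two token counts."""
    return sum((a & b).values())


def retrieve(code, k=1):
    """Return the k knowledge-base entries sharing the most tokens with the code.

    Keyword overlap is an assumed placeholder for whatever retriever is used.
    """
    query = tokenize(code)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: overlap(query, tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_revision_prompt(code, notes):
    """Assemble a prompt asking a model to revise code using retrieved notes."""
    bullet_list = "\n".join(f"- {note}" for note in notes)
    return (
        "Revise the code below to address these security notes.\n"
        f"Security notes:\n{bullet_list}\n\n"
        f"Code:\n{code}\n"
    )


# Example: a generated snippet that concatenates user input into a SQL string.
generated = 'cursor.execute("SELECT * FROM users WHERE name = " + name)  # sql query'
notes = retrieve(generated, k=1)
prompt = build_revision_prompt(generated, notes)
print(prompt)
```

Because the knowledge base is consulted at inference time, adding a new entry to `KNOWLEDGE_BASE` changes what the revision prompt contains without any retraining, which is the property the abstract emphasizes.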

Taxonomy

Topics: Software Engineering Research · Adversarial Robustness in Machine Learning · Scientific Computing and Data Management