A Mixture of Linear Corrections Generates Secure Code
Weichen Yu, Ravi Mangal, Terry Zhuo, Matt Fredrikson, Corina S. Pasareanu

TL;DR
This paper introduces a mixture of corrections (MoC) technique that leverages internal vulnerability representations in LLMs to generate more secure code, significantly reducing vulnerabilities without sacrificing functionality.
Contribution
It reveals that LLMs encode vulnerability-related concepts and proposes a novel inference-time steering method to improve code security during generation.
Findings
MoC improves the security ratio of Qwen2.5-Coder-7B by 8.9%.
MoC enhances HumanEval pass@1 by 2.1%.
LLMs encode precise internal vulnerability representations.
Abstract
Large language models (LLMs) have become proficient at sophisticated code-generation tasks, yet remain ineffective at reliably detecting or avoiding code vulnerabilities. Does this deficiency stem from insufficient learning about code vulnerabilities, or is it merely a result of ineffective prompting? Using representation engineering techniques, we investigate whether LLMs internally encode the concepts necessary to identify code vulnerabilities. We find that current LLMs encode precise internal representations that distinguish vulnerable from secure code, achieving greater accuracy than standard prompting approaches. Leveraging these vulnerability-sensitive representations, we develop an inference-time steering technique that subtly modulates the model's token-generation probabilities through a mixture of corrections (MoC). Our method effectively guides LLMs to produce less vulnerable…
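The probing-and-steering idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: synthetic Gaussian vectors stand in for real LLM hidden states, the function names (`mean_difference_direction`, `probe_score`, `steer`) are hypothetical, and a single mean-difference direction is used as one common representation-engineering choice. The mixture over several corrections is not modeled here.

```python
import numpy as np


def mean_difference_direction(secure, vulnerable):
    """Unit vector pointing from the vulnerable-class mean to the secure-class mean."""
    direction = secure.mean(axis=0) - vulnerable.mean(axis=0)
    return direction / np.linalg.norm(direction)


def probe_score(hidden, direction, secure, vulnerable):
    """Signed projection relative to the midpoint of the class means.

    A positive score means the representation lies on the secure side.
    """
    midpoint = 0.5 * (secure.mean(axis=0) + vulnerable.mean(axis=0))
    return (hidden - midpoint) @ direction


def steer(hidden, direction, alpha):
    """Shift hidden states by alpha along the correction direction."""
    return hidden + alpha * direction


# Synthetic stand-ins for hidden states (assumption: real activations would
# be collected from a chosen layer of the model on labeled code samples).
rng = np.random.default_rng(0)
dim = 16
offset = np.zeros(dim)
offset[0] = 3.0
secure = rng.normal(size=(200, dim)) + offset
vulnerable = rng.normal(size=(200, dim)) - offset

direction = mean_difference_direction(secure, vulnerable)

# Linear probe: classify by the sign of the projection.
scores_secure = probe_score(secure, direction, secure, vulnerable)
scores_vuln = probe_score(vulnerable, direction, secure, vulnerable)
accuracy = 0.5 * ((scores_secure > 0).mean() + (scores_vuln < 0).mean())

# Steering: nudge vulnerable-looking states toward the secure side.
steered = steer(vulnerable, direction, alpha=2.0)
scores_steered = probe_score(steered, direction, secure, vulnerable)
```

Because `direction` has unit length, steering with `alpha=2.0` raises every probe score by exactly 2.0. In a real model the shifted hidden state would then pass through the remaining layers, which is how the token probabilities change.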
Peer Reviews
Decision·Submitted to ICLR 2026
- MoC is an inference-time steering technique that effectively guides LLMs to produce less vulnerable code. Notably, it enhanced the security ratio while simultaneously improving functionality on HumanEval.
- The method is a practical approach to controlled vulnerability management that does not require costly retraining or extensive prompt engineering.
- The guiding correction vectors sometimes transfer across models, yielding a computationally efficient way to harden models that are not specif
- The primary evaluation tool, CodeQL, exhibits inherent limitations in both accuracy and computational efficiency. The paper notes a scarcity of robust automated evaluation methods for code generation and finds that using an LLM-as-a-judge is unsuitable because of its poor performance in code vulnerability detection.
- The method requires full access to the model's internal representations and parameters. This dependency on white-box access limits the practical applicability of MoC to pro
- The paper studies the important problems of vulnerability detection and secure code generation with LLMs.
- The paper is well-written and the key ideas are easy to understand.
- The use of linear probing to detect vulnerabilities is novel.
- Some of the steering methods considered in Section 3.2.1 have been proposed for other natural language tasks (difference of group mean [1], normal vector of the decision boundary [2]).
- The secure code generation task reports the security ratio metric but does not report the correctness of the outputs after steering with MoC on the SVEN Test Set (Table 6). There could potentially be a trade-off between the security ratio and the correctness of generation after steering (similar to the accura
- Employs linear probing on hidden representations, achieving better bug detection accuracy than prompt-based baselines.
- Introduces a Mixture of Corrections that improves code security while maintaining functionality.
- Demonstrates that the learned correction vectors exhibit a certain degree of cross-model transferability.
- Although four types of corrections are proposed, the paper does not clearly describe how they are combined for a given bug type. Are all four used simultaneously, or is only one applied each time?
- Each correction is trained specifically for one bug type (CWE). This means for multiple bug types, separate probes and corrections must be trained, potentially increasing computational overhead and raising questions about interaction or interference between corrections when multiple vulnerabilities
Taxonomy
Topics: Antenna Design and Analysis
