Security-by-Design for LLM-Based Code Generation: Leveraging Internal Representations for Concept-Driven Steering Mechanisms

Maximilian Wendlinger; Daniel Kowatsch; Konstantin Böttinger; Philip Sperl

arXiv:2603.11212 · cs.CR · March 13, 2026



Open Access

TL;DR

This paper examines how CodeLLMs internally represent security concepts, finds that the models can recognize vulnerabilities internally, and uses these representations in a new steering mechanism that improves secure code generation.

Contribution

The paper introduces Secure Concept Steering for CodeLLMs (SCS-Code), a method that uses internal model representations to improve the security of generated code.
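
This listing does not describe how SCS-Code works internally, so the following is only a generic illustration of what "steering with internal representations" can mean, not the authors' method. It sketches difference-of-means activation steering on toy vectors: a concept direction is estimated from hidden activations of secure versus insecure examples and then added to a hidden state. All function names, the toy data, and the strength parameter `alpha` are assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Element-wise mean of a list of equally sized vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def concept_direction(secure_acts: List[Vector], insecure_acts: List[Vector]) -> Vector:
    """Unit vector pointing from the insecure mean to the secure mean.

    Assumption: the two sets are hidden activations collected at one layer
    for prompts labeled secure and insecure (hypothetical data here).
    """
    mu_secure = mean_vector(secure_acts)
    mu_insecure = mean_vector(insecure_acts)
    diff = [s - i for s, i in zip(mu_secure, mu_insecure)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0.0:
        raise ValueError("secure and insecure means coincide; no direction")
    return [d / norm for d in diff]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Shift a hidden state along the concept direction by strength alpha."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def projection(hidden: Vector, direction: Vector) -> float:
    """Scalar projection of a hidden state onto the concept direction."""
    return sum(h * d for h, d in zip(hidden, direction))


# Toy 2-dimensional "activations" (made up for illustration only).
secure_acts = [[1.0, 0.0], [3.0, 0.0]]
insecure_acts = [[0.0, 0.0], [0.0, 0.0]]

direction = concept_direction(secure_acts, insecure_acts)
hidden = [0.5, 0.5]
steered = steer(hidden, direction, alpha=2.0)
```

In a real model the same arithmetic would be applied to a layer's hidden states during generation, and the choice of layer and steering strength would need to be validated against both security and functional-correctness benchmarks.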

Findings

01

SCS-Code outperforms existing methods on secure coding benchmarks.

02

CodeLLMs can internally recognize security vulnerabilities.

03

Fine-grained analysis of internal representations enables targeted security improvements.

Abstract

Large Language Models (LLMs) show remarkable capabilities in understanding natural language and generating complex code. However, as practitioners adopt CodeLLMs for increasingly critical development tasks, research reveals that these models frequently generate functionally correct yet insecure code, posing significant security risks. While multiple approaches have been proposed to improve security in AI-based code generation, combined benchmarks show these methods remain insufficient for practical use, achieving only limited improvements in both functional correctness and security. This stems from a fundamental gap in understanding the internal mechanisms of code generation and the root causes of security vulnerabilities, forcing researchers to rely on heuristics and empirical observations. In this work, we investigate the internal representation of security concepts in CodeLLMs,…


Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Software Engineering Research