DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation

Li Huang; Zhongxin Liu; Yifan Wu; Tao Yin; Dong Li; Jichao Bi; Nankun Mu; Hongyu Zhang; Meng Yan

arXiv:2604.09089 · cs.SE · April 13, 2026

TL;DR

DeepGuard hardens LLM code generation by aggregating security-relevant signals from multiple transformer layers, improving the security of generated code without sacrificing functional correctness.

Contribution

The paper introduces DeepGuard, a framework that aggregates representations from multiple upper transformer layers via an attention-based module to better detect and mitigate vulnerabilities during code generation.

Findings

01. DeepGuard improves the secure-and-correct generation rate by 11.9% on average.

02. It preserves functional correctness while generalizing to new vulnerability types.

03. The method leverages security-relevant cues distributed across multiple layers, outperforming single-layer baselines.

Abstract

Large Language Models (LLMs) for code generation can replicate insecure patterns from their training data. To mitigate this, a common strategy for security hardening is to fine-tune models using supervision derived from the final transformer layer. However, this design may suffer from a final-layer bottleneck: vulnerability-discriminative cues can be distributed across layers and become less detectable near the output representations optimized for next-token prediction. To diagnose this issue, we perform layer-wise linear probing. We observe that vulnerability-related signals are most detectable in a band of intermediate-to-upper layers yet attenuate toward the final layers. Motivated by this observation, we introduce DeepGuard, a framework that leverages distributed security-relevant cues by aggregating representations from multiple upper layers via an attention-based module. The…
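The abstract describes aggregating representations from multiple upper layers via an attention-based module but, as shown here, does not specify the module's form. The following is a minimal sketch of one generic way to do this, learned-query attention pooling over per-layer vectors, and is not the authors' implementation. The names `aggregate_layers`, `layer_states`, and `query` are hypothetical, and the scaled dot-product scoring is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_layers(
    layer_states: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Attention-pool one hidden vector per layer into a single vector.

    layer_states: one vector per selected layer (hypothetical input; in
        practice these would be hidden states taken from upper layers).
    query: a query vector of the same dimension (assumed to be learned).
    Returns the pooled vector and the per-layer attention weights.
    """
    dim = len(query)
    # Scaled dot-product score between the query and each layer's vector.
    scores = [
        sum(h_j * q_j for h_j, q_j in zip(h, query)) / math.sqrt(dim)
        for h in layer_states
    ]
    weights = softmax(scores)
    # Weighted sum of the layer vectors, computed per dimension.
    pooled = [
        sum(w * h[j] for w, h in zip(weights, layer_states))
        for j in range(dim)
    ]
    return pooled, weights


# Toy example: three "layers" with 2-dimensional states.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = aggregate_layers(states, query=[0.0, 0.0])
```

With an all-zero query every layer receives the same weight, so the pooled vector reduces to the plain mean of the layer vectors. A trained query would instead shift weight toward the layers whose representations are most informative for the security objective.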

Code & Models

Repositories

unknownhl/DeepGuard (GitHub)
