Structured Safety Auditing for Balancing Code Correctness and Content Safety in LLM-Generated Code

Honghao Tan; Haibo Wang; Shin Hwei Tan

arXiv:2604.12088 · cs.SE · April 15, 2026

TL;DR

This paper introduces a structured safety auditing method and a new metric, SUDS, to balance code correctness and safety in LLM-generated code, demonstrating improved safety performance across models.

Contribution

The paper proposes the Dual Reasoning (DR) technique and the SUDS metric to unify safety and utility assessment, advancing responsible code generation in LLMs.

Findings

01. DR achieves the highest SUDS scores across models.

02. DR's effectiveness increases with model capacity.

03. Structured reasoning compensates for the limitations of safety vocabularies.

Abstract

Large language models (LLMs) for code generation are typically evaluated on functional correctness alone, overlooking whether generated code propagates harmful content embedded in the prompt. Prior work has shown that most Code LLMs reproduce offensive identifiers from injected renaming instructions without warning, yet existing approaches focus on detecting harmful content and neglect functional correctness. Our work is grounded in the Theory of Dual Channel Constraints, which states that code is a dual-channel medium combining an algorithmic (AL) channel for machine execution and a natural language (NL) channel for human communication. This creates a unique safety-utility trade-off in which a model must balance functional execution with responsible communication. We propose the Safety-Utility Duality Score (SUDS), a metric that unifies code utility, safety adherence, and warning awareness into a single…
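To make the three signals named in the abstract concrete, here is a minimal sketch of recording them separately for one model response. It is not the paper's auditing method and does not compute SUDS, whose definition is cut off above. The names FLAGGED_TERMS, AuditSignals, identifiers, and audit are hypothetical, the one-word vocabulary is a placeholder, and spotting a warning by searching for the word "warning" is a deliberately crude assumption.

```python
import re
from dataclasses import dataclass

# Hypothetical placeholder vocabulary; a real audit would use a curated list.
FLAGGED_TERMS = {"badword"}


@dataclass
class AuditSignals:
    passes_tests: bool    # utility: did the generated code pass its tests?
    avoids_flagged: bool  # safety adherence: no flagged term in any identifier
    emits_warning: bool   # warning awareness: the response mentions a warning


def identifiers(code: str) -> set:
    """Split code into lowercase word fragments (snake_case and camelCase)."""
    parts = set()
    for token in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code):
        for piece in token.split("_"):
            # Break camelCase pieces such as "getBadwordCount" into words.
            for frag in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", piece):
                parts.add(frag.lower())
    return parts


def audit(code: str, response_text: str, passes_tests: bool) -> AuditSignals:
    """Record the three signals; test results are supplied by the caller."""
    avoids = identifiers(code).isdisjoint(FLAGGED_TERMS)
    # Assumption: a warning is detected by a simple keyword match.
    warns = "warning" in response_text.lower()
    return AuditSignals(passes_tests, avoids, warns)
```

A response that renames a function to `badword_total` and says nothing about it would be recorded as functionally correct but neither safe nor warned. Keeping the signals separate leaves open how a score such as SUDS combines them.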
