Do Fine-Tuned LLMs Understand Vulnerabilities? An Investigation into the Semantic Trap

Feiyang Huang; Yuqiang Sun; Fan Zhang; Ziqi Yang; Han Liu; Yang Liu

arXiv:2601.22655 · cs.CR · May 22, 2026

TL;DR

This paper investigates whether fine-tuned large language models genuinely understand software vulnerabilities or rely on superficial patterns. It identifies a failure mode, the Semantic Trap, in which the model's decisions track surface-level functional patterns rather than vulnerability root causes.

Contribution

The study introduces TrapEval, an evaluation framework, and identifies a semantic trap in fine-tuned LLMs, highlighting limitations of current fine-tuning methods in understanding vulnerabilities.

Findings

01. Vanilla SFT models perform well on unpaired data but fail on real-world paired data.

02. Explicit reasoning reduces symptoms but lowers recall and does not fully escape the trap.

03. Models still misinterpret control flow and hallucinate API behavior.
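
The gap between unpaired and paired performance in the first finding can be illustrated with a small sketch. It is not the TrapEval implementation. The `PairPrediction` type, the two metric functions, and the toy data are hypothetical names introduced here for illustration only. The sketch assumes each test item is a vulnerable function together with its patched counterpart, and compares per-sample accuracy with a stricter pair-level accuracy that requires both verdicts to be right.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PairPrediction:
    """Model verdicts for one (vulnerable, patched) pair. Hypothetical type."""
    flagged_vulnerable_version: bool  # True if the model said "vulnerable" for the vulnerable code
    flagged_patched_version: bool     # True if the model said "vulnerable" for the patched code


def sample_accuracy(pairs: List[PairPrediction]) -> float:
    """Accuracy when every function is scored independently."""
    correct = sum(p.flagged_vulnerable_version for p in pairs)
    correct += sum(not p.flagged_patched_version for p in pairs)
    return correct / (2 * len(pairs))


def pair_accuracy(pairs: List[PairPrediction]) -> float:
    """Fraction of pairs where BOTH versions are classified correctly."""
    both_right = sum(
        p.flagged_vulnerable_version and not p.flagged_patched_version
        for p in pairs
    )
    return both_right / len(pairs)


# Toy data (made up): a model that keys on what the function does,
# so it gives the same verdict to both versions of every pair.
pattern_matcher = [PairPrediction(True, True) for _ in range(4)]

print(sample_accuracy(pattern_matcher))  # half of the individual verdicts are right
print(pair_accuracy(pattern_matcher))    # but no pair is fully right
```

A model that cannot tell a vulnerable function from its patch still gets half of the individual verdicts right in this toy setting, while its pair-level score drops to zero. That is one simple way to see why paired evaluation is a harsher test than unpaired evaluation.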

Abstract

Large Language Models (LLMs) have shown promising performance in software vulnerability detection, particularly after domain-specific Supervised Fine-Tuning (SFT). However, it remains unclear whether these models genuinely internalize vulnerability root causes or merely exploit surface-level functional patterns. While prior work documented related failures on pre-trained or zero-shot models, the SFT process itself, and how explicit reasoning supervision modulates it, remains under-explored. We study fine-tuned decoder-only LLMs under vanilla SFT and SFT with reasoning supervision, identifying a failure mode we term the Semantic Trap, characterized by three symptoms: pairing-sensitive performance, gap-dictated decisions, and fragility to semantic-preserving changes. To probe this, we propose TrapEval, an evaluation framework comprising two real-world datasets, V2P (vulnerable paired with…

Taxonomy

Topics: Software Engineering Research · Information and Cyber Security · Web Application Security Vulnerabilities