Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software
Tomer Kordonsky, Maayan Yamin, Noam Benzimra, Amit LeVi, and Avi Mendelson

TL;DR
This paper introduces FSTab, a method for predicting and evaluating recurring vulnerabilities in LLM-generated software from observable frontend features, revealing significant security risks across multiple domains.
Contribution
FSTab is a novel black-box attack and evaluation framework that predicts vulnerabilities in LLM-generated code without access to backend details.
Findings
FSTab achieves up to 94% attack success rate.
FSTab covers 93% of vulnerabilities across domains.
FSTab shows strong cross-domain transferability.
Abstract
LLMs are increasingly used for code generation, but their outputs often follow recurring templates that can induce predictable vulnerabilities. We study vulnerability persistence in LLM-generated software and introduce Feature-Security Table (FSTab) with two components. First, FSTab enables a black-box attack that predicts likely backend vulnerabilities from observable frontend features and knowledge of the source LLM, without access to the backend or source code. Second, FSTab provides a model-centric evaluation that quantifies how consistently a model reproduces the same vulnerabilities across programs, semantics-preserving rephrasings, and application domains. We evaluate FSTab on state-of-the-art code LLMs, including GPT-5.2, Claude-4.5 Opus, and Gemini-3 Pro, across diverse application domains. Our results show strong cross-domain transfer: even when the target domain is excluded…
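The lookup idea behind the black-box attack can be illustrated with a minimal sketch. It assumes the table is a simple count of how often each vulnerability co-occurs with a (source model, frontend feature) pair; the abstract does not specify the actual table structure or scoring, so the model names, feature names, vulnerability labels, and the functions `build_table` and `predict` below are all hypothetical placeholders rather than the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical observations: (source model, visible frontend feature,
# vulnerability found in the backend). Illustrative only, not paper data.
observations = [
    ("model-a", "login_form", "sql_injection"),
    ("model-a", "login_form", "sql_injection"),
    ("model-a", "login_form", "weak_password_hash"),
    ("model-a", "file_upload", "path_traversal"),
    ("model-b", "login_form", "missing_rate_limit"),
]


def build_table(rows):
    """Count vulnerabilities per (model, feature) pair."""
    table = defaultdict(Counter)
    for model, feature, vuln in rows:
        table[(model, feature)][vuln] += 1
    return table


def predict(table, model, features, top_k=2):
    """Rank likely vulnerabilities using only the model name and visible features."""
    scores = Counter()
    for feature in features:
        # .get avoids creating empty entries for unseen (model, feature) pairs.
        scores.update(table.get((model, feature), Counter()))
    return [vuln for vuln, _ in scores.most_common(top_k)]


table = build_table(observations)
print(predict(table, "model-a", ["login_form", "file_upload"]))
```

In this toy version, prediction needs no access to backend code: the attacker supplies the presumed source model and the features visible in the frontend, and the table returns the vulnerabilities that most often accompanied them in previously analyzed programs.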
Taxonomy
Topics: Software Engineering Research · Web Application Security Vulnerabilities · Advanced Malware Detection Techniques
