Trust Me, I Know This Function: Hijacking LLM Static Analysis using Bias
Shir Bernstein, David Beste, Daniel Ayzenshteyn, Lea Schonherr, Yisroel Mirsky

TL;DR
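This paper identifies an abstraction bias in LLM-based static code analysis: models overgeneralize familiar programming patterns and overlook small, meaningful bugs. Adversaries can exploit this with minimal code edits to hijack the model's interpretation of control flow without changing the code's actual runtime behavior. The attack affects multiple models and programming languages.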
Contribution
The paper introduces the Familiar Pattern Attack (FPA) and a fully automated, black-box algorithm that discovers FPAs and injects them into target code, demonstrating the bias across various LLMs and programming languages.
Findings
FPAs are effective against both basic and reasoning models.
FPAs transfer across model families, including OpenAI and Anthropic.
FPAs remain effective despite robust prompts.
Abstract
Large Language Models (LLMs) are increasingly trusted to perform automated code review and static analysis at scale, supporting tasks such as vulnerability detection, summarization, and refactoring. In this paper, we identify and exploit a critical vulnerability in LLM-based code analysis: an abstraction bias that causes models to overgeneralize familiar programming patterns and overlook small, meaningful bugs. Adversaries can exploit this blind spot to hijack the control flow of the LLM's interpretation with minimal edits and without affecting actual runtime behavior. We refer to this attack as a Familiar Pattern Attack (FPA). We develop a fully automated, black-box algorithm that discovers and injects FPAs into target code. Our evaluation shows that FPAs are not only effective against basic and reasoning models, but are also transferable across model families (OpenAI, Anthropic,…
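To make the idea concrete, here is a minimal, hypothetical sketch of the kind of code a Familiar Pattern Attack relies on. It is not taken from the paper, and it does not reproduce the paper's discovery algorithm. The function names `looks_like_max` and `process` are invented for illustration. The helper resembles the textbook "find the maximum" loop but contains a one-character deviation, so a reader (human or model) who pattern-matches it as `max` will predict the wrong branch in the caller.

```python
def looks_like_max(values):
    """Resembles the textbook 'find the maximum' loop, but is not one."""
    best = values[0]
    for v in values:
        # Deviation from the familiar pattern: '<' instead of '>'.
        # The loop therefore tracks the minimum, not the maximum.
        if v < best:
            best = v
    return best


def process():
    # A reviewer who assumes the helper behaves like max() expects this
    # guard to hold for [3, 9, 4] and "branch A" to be returned.
    if looks_like_max([3, 9, 4]) == 9:
        return "branch A"
    # What actually executes: the helper returns 3, so the guard fails.
    return "branch B"


if __name__ == "__main__":
    print(looks_like_max([3, 9, 4]))
    print(process())
```

The gap between the assumed behavior ("branch A") and the real behavior ("branch B") is what the abstract describes as hijacking the control flow of the model's interpretation: the program itself runs exactly as written, and only the analysis is misled.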
