Search-Induced Issues in Web-Augmented LLM Code Generation: Detecting and Repairing Error-Inducing Pages

Guoqing Wang; Zeyu Sun; Xiaofei Xie; Yizhou Chen; Yanchao Tan; Yifan Zhao; Dan Hao

arXiv:2603.26091 · cs.SE · March 30, 2026

TL;DR

This paper studies how vulnerable web-augmented LLMs are to Search-Induced Issues (SII) and proposes Sherlock, an automated framework that detects, debugs, and repairs such issues to make code generation more reliable.

Contribution

The paper introduces Sherlock, a novel automated system that detects, debugs, and repairs Search-Induced Issues in web-augmented LLM code generation at scale.

Findings

01. Sherlock detects EIPs with up to 95% F1 score.

02. Sherlock repairs 71% to 100% of affected generations.

03. All evaluated web-augmented LLMs are vulnerable to SII.
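
For readers unfamiliar with the metric in the first finding, here is a minimal sketch of how an F1 score for EIP detection could be computed. It assumes detection is scored by comparing a set of flagged pages against a set of ground-truth Error-Inducing Pages. The function name `f1_score` and the example URLs are hypothetical and not taken from the paper, whose exact evaluation protocol is not described on this page.

```python
def f1_score(predicted: set[str], actual: set[str]) -> float:
    """Return the F1 score of a predicted page set against a ground-truth set."""
    true_pos = len(predicted & actual)  # pages correctly flagged as EIPs
    if true_pos == 0:
        return 0.0  # also avoids dividing by zero on empty sets
    precision = true_pos / len(predicted)  # share of flagged pages that are real EIPs
    recall = true_pos / len(actual)        # share of real EIPs that were flagged
    return 2 * precision * recall / (precision + recall)


# Hypothetical URLs, for illustration only.
flagged = {"https://example.com/a", "https://example.com/b", "https://example.com/c"}
truth = {"https://example.com/a", "https://example.com/b", "https://example.com/d"}
score = f1_score(flagged, truth)
```

In this toy case two of the three flagged pages are real EIPs and two of the three real EIPs are flagged, so precision and recall are both 2/3 and the F1 score is 2/3 as well.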

Abstract

Web-augmented large language models (LLMs) offer promising capabilities for automatic code generation. However, integrating live web search exposes models to unreliable or malicious content, leading to Search-Induced Issues (SII), a novel failure mode in which external pages mislead LLMs into producing incorrect code. This paper presents a comprehensive empirical study of the prevalence and impact of SII across three commercial search APIs and six advanced LLMs. Our analysis reveals that all evaluated web-augmented LLMs are vulnerable to SII, with root causes arising from either misaligned specifications or flawed code implementations in the searched Error-Inducing Pages (EIPs). To address this challenge, we propose Sherlock, an automated framework that enables LLM service providers to proactively safeguard web-augmented generation systems at scale. Sherlock operates as a continuous…
