The Synthetic Web: Adversarially-Curated Mini-Internets for Diagnosing Epistemic Weaknesses of Language Agents
Shrey Shah, Levent Ozgur

TL;DR
This paper introduces a synthetic web benchmark to evaluate how language agents handle adversarial misinformation, revealing significant vulnerabilities and providing a tool for developing more robust, epistemically humble models.
Contribution
The paper presents a novel, procedurally generated benchmark environment for testing language agents against adversarial ranking attacks, enabling causal analysis of misinformation effects.
Findings
Models experience catastrophic accuracy drops under adversarial misinformation.
Current models show minimal search escalation despite severe miscalibration.
The benchmark exposes fundamental limitations in handling conflicting information.
Abstract
Language agents increasingly act as web-enabled systems that search, browse, and synthesize information from diverse sources. However, these sources can include unreliable or adversarial content, and the robustness of agents to adversarial ranking, where misleading information appears prominently in search results, remains poorly understood. Existing benchmarks evaluate functional navigation or static factuality but cannot causally isolate this vulnerability, and current mitigation strategies for retrieval-augmented generation remain largely untested under such conditions. We introduce the Synthetic Web Benchmark, a procedurally generated environment comprising thousands of hyperlinked articles with ground-truth labels for credibility and factuality, process-level interaction traces, and contamination filtering to eliminate training-data leakage. By injecting a single high-plausibility…
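To make the idea of adversarial ranking concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes a tiny corpus of labeled articles and a search function that can optionally place one misleading article at the top of the results. Every name here (`Article`, `build_corpus`, `search`, `top_result_agent`, `majority_agent`) is hypothetical, and treating "appears prominently" as "rank 0" is an assumption made only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Article:
    title: str
    claim: str          # the answer this article asserts
    is_factual: bool    # ground-truth factuality label
    credibility: float  # ground-truth source credibility in [0, 1]


def build_corpus(true_answer, false_answer, n_honest=5, seed=0):
    """Procedurally generate honest articles plus one adversarial article."""
    rng = random.Random(seed)
    honest = [
        Article(f"honest-{i}", true_answer, True, round(rng.uniform(0.6, 1.0), 2))
        for i in range(n_honest)
    ]
    adversarial = Article("adversarial-0", false_answer, False, 0.1)
    return honest, adversarial


def search(honest, adversarial=None):
    """Rank honest articles by credibility; optionally inject the
    adversarial article at rank 0 (assumed form of the attack)."""
    ranked = sorted(honest, key=lambda a: a.credibility, reverse=True)
    if adversarial is not None:
        ranked = [adversarial] + ranked
    return ranked


def top_result_agent(results):
    """Naive agent: trusts whatever the first result claims."""
    return results[0].claim


def majority_agent(results, k=3):
    """Slightly more careful agent: majority vote over the top-k results."""
    votes = Counter(a.claim for a in results[:k])
    return votes.most_common(1)[0][0]


honest, adversarial = build_corpus("Paris", "Lyon")
clean = search(honest)
attacked = search(honest, adversarial)

print("clean, top-result agent:   ", top_result_agent(clean))
print("attacked, top-result agent:", top_result_agent(attacked))
print("attacked, majority agent:  ", majority_agent(attacked))
```

Because the clean and attacked conditions differ only in the single injected article, any change in the agent's answer can be attributed to that injection, which is the kind of causal comparison the abstract describes. The two toy agents only illustrate that reading past the first result can matter; they say nothing about how real models behave on the benchmark.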
Taxonomy
Topics: Misinformation and Its Impacts · Topic Modeling · Ethics and Social Impacts of AI
