Oracle Poisoning: Corrupting Knowledge Graphs to Weaponise AI Agent Reasoning

Ben Kereopa-Yorke; Guillermo Diaz; Holly Wright; Reagan Johnston; Ron F. Del Rosario; Timothy Lynar

arXiv:2605.09822 · cs.CR · May 12, 2026

TL;DR

This paper introduces Oracle Poisoning, a novel attack that corrupts knowledge graphs used by AI agents, demonstrating its effectiveness on a large-scale production system and analyzing defenses and transferability.

Contribution

It provides the first empirical demonstration of knowledge graph poisoning against a production-scale agentic system and analyzes attack effectiveness, defenses, and transferability.

Findings

01

Every tested model trusts poisoned data 100% of the time at moderate attacker sophistication (L2).

02

Trust drops significantly under open-ended prompts, which shows that prompt framing is a confound.

03

Inline evaluation produces false negatives, affecting trust assessment.

Abstract

We define Oracle Poisoning, an attack class in which an adversary corrupts a structured knowledge graph that AI agents query at runtime via tool-use protocols, causing incorrect conclusions through correct reasoning. Unlike prompt injection, Oracle Poisoning manipulates the data agents reason over, not their instructions. We demonstrate six attack scenarios against a production 42-million-node code knowledge graph, providing the first empirical demonstration of knowledge graph poisoning against a production-scale agentic system, distinct from CTI embedding poisoning. Primary evaluation uses real SDK tool-use across nine models from three providers (N=30 per model), where models autonomously invoke a graph query tool and reason from results. The result is unambiguous: every tested model trusts poisoned data at 100% at moderate attacker sophistication (L2), with 269 valid trials (of 270)…
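To make "incorrect conclusions through correct reasoning" concrete, here is a minimal sketch in Python. It is not one of the paper's six scenarios: the `KnowledgeGraph` class, the `called_by` relation, the function names, and the dead-code rule are all hypothetical, chosen only to illustrate an agent whose logic stays the same while the data it queries is corrupted.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy in-memory graph (hypothetical; stands in for a code knowledge graph)."""
    # Edges are stored as (subject, relation) -> set of objects.
    edges: dict = field(default_factory=dict)

    def add(self, subject, relation, obj):
        self.edges.setdefault((subject, relation), set()).add(obj)

    def remove(self, subject, relation, obj):
        self.edges.get((subject, relation), set()).discard(obj)

    def query(self, subject, relation):
        """Stand-in for the graph query tool an agent would invoke at runtime."""
        return sorted(self.edges.get((subject, relation), set()))


def agent_is_safe_to_delete(graph, function_name):
    """Hypothetical agent rule: a function with no recorded callers is dead code."""
    callers = graph.query(function_name, "called_by")
    return len(callers) == 0


graph = KnowledgeGraph()
graph.add("validate_token", "called_by", "login_handler")

# Clean graph: the agent sees a caller and keeps the function.
before = agent_is_safe_to_delete(graph, "validate_token")

# Poisoning step (assumed attacker write access): drop the caller edge.
graph.remove("validate_token", "called_by", "login_handler")

# Same rule, same reasoning, corrupted data: the agent now approves deletion.
after = agent_is_safe_to_delete(graph, "validate_token")

print(before, after)
```

The agent's instructions and decision rule never change between the two calls; only the graph contents do, which is the distinction from prompt injection that the abstract draws.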
