RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines

Austin Jia, Avaneesh Ramesh, Zain Shamsi, Daniel Zhang, and Alex Liu

arXiv:2510.20768 · cs.CR · December 17, 2025

Open Access

TL;DR

This paper introduces RAGRank, a method that enhances retrieval-augmented generation systems in cyber threat intelligence by applying PageRank to identify and prioritize credible sources, thereby mitigating poisoning attacks.

Contribution

The paper proposes using PageRank-based source credibility scoring to improve the robustness of RAG systems against poisoning in CTI contexts, demonstrating effectiveness on standard and CTI datasets.
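To make the idea of PageRank-based source credibility concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the citation graph, the document names, the `rerank` helper, and the blending `weight` are all hypothetical, and this summary does not say how the paper builds its graph or combines scores with retrieval.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links maps each document id to the list of document ids it cites.
    (How edges are derived from a real corpus is an assumption here.)
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def rerank(candidates, authority, weight=0.5):
    """Hypothetical helper: blend retrieval similarity with normalized authority.

    candidates is a list of (doc_id, similarity) pairs.
    """
    top = max(authority.values())
    scored = [
        (doc_id, (1.0 - weight) * sim + weight * authority[doc_id] / top)
        for doc_id, sim in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy corpus: the poisoned post cites trusted sources, but nothing cites it.
links = {
    "vendor_advisory": ["cve_record"],
    "cert_bulletin": ["vendor_advisory", "cve_record"],
    "research_blog": ["vendor_advisory", "cert_bulletin"],
    "cve_record": ["vendor_advisory"],
    "poisoned_post": ["vendor_advisory", "cve_record"],
}
scores = pagerank(links)

# The poisoned post has the higher similarity score, yet ranks below the advisory.
ranked = rerank([("poisoned_post", 0.92), ("vendor_advisory", 0.88)], scores)
```

In this toy graph, a document that mimics legitimate style but receives no inbound links keeps only the baseline score, so it falls behind well-cited sources after reranking. The equal blending weight is purely illustrative.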

Findings

01. PageRank reduces influence of malicious documents

02. Improves trustworthiness of retrieved content

03. Effective on CTI-specific data

Abstract

Retrieval-Augmented Generation (RAG) has emerged as the dominant architectural pattern for operationalizing Large Language Model (LLM) usage in Cyber Threat Intelligence (CTI) systems. However, this design is susceptible to poisoning attacks, and previously proposed defenses can fail in CTI contexts, because information about emerging attacks is often completely new and sophisticated threat actors can mimic legitimate formats, terminology, and stylistic conventions. To address this issue, we propose that the robustness of modern RAG defenses can be strengthened by applying source credibility algorithms to corpora, using PageRank as an example. In our experiments on the standardized MS MARCO dataset, we demonstrate quantitatively that our algorithm assigns a lower authority score to malicious documents while promoting trusted content. We also demonstrate proof-of-concept…

Taxonomy

Topics: Misinformation and Its Impacts · Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection