Stealthy LLM-Driven Data Poisoning Attacks Against Embedding-Based Retrieval-Augmented Recommender Systems

Fatemeh Nazary; Yashar Deldjoo; Tommaso Di Noia; Eugenio Di Sciascio

arXiv:2505.05196 · cs.IR · May 9, 2025

Open Access

TL;DR

This paper investigates how small, subtle modifications to item descriptions in retrieval-augmented recommender systems can be exploited to unfairly promote or demote items, revealing significant vulnerabilities and the need for improved defenses.

Contribution

The study formalizes stealthy data poisoning attacks on RAG-based recommenders and demonstrates their effectiveness through experiments, highlighting critical security gaps.

Findings

01

Small token modifications can significantly alter item rankings.

02

Stealthy attacks evade naive detection methods.

03

Robust textual consistency checks are necessary for defense.

Abstract

We present a systematic study of provider-side data poisoning in retrieval-augmented (RAG-based) recommender systems. By modifying only a small fraction of tokens within item descriptions -- for instance, adding emotional keywords or borrowing phrases from semantically related items -- an attacker can significantly promote or demote targeted items. We formalize these attacks under token-edit and semantic-similarity constraints, and we examine their effectiveness in both promotion (long-tail items) and demotion (short-head items) scenarios. Our experiments on MovieLens, using two large language model (LLM) retrieval modules, show that even subtle attacks shift final rankings and item exposures while eluding naive detection. The results underscore the vulnerability of RAG-based pipelines to small-scale metadata rewrites and emphasize the need for robust textual consistency checks and…
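
As a rough illustration of the two constraints named in the abstract, the sketch below checks whether a rewritten item description stays within a token-edit budget and remains similar to the original. It is not the paper's formalization: the function names, the thresholds (0.2 and 0.8), and the bag-of-words cosine similarity (used here as a stand-in for an embedding model) are all assumptions made for this example.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def token_edit_fraction(original: str, modified: str) -> float:
    """Share of tokens inserted, deleted, or replaced, relative to the original length."""
    a, b = original.lower().split(), modified.lower().split()
    changed = 0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Count the longer side of each non-matching block as edited tokens.
            changed += max(i2 - i1, j2 - j1)
    return changed / max(len(a), 1)


def cosine_similarity(original: str, modified: str) -> float:
    """Cosine similarity of bag-of-words counts (assumed stand-in for embeddings)."""
    u = Counter(original.lower().split())
    v = Counter(modified.lower().split())
    dot = sum(u[t] * v[t] for t in u)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def is_stealthy(original: str, modified: str,
                max_edit_fraction: float = 0.2,   # hypothetical budget
                min_similarity: float = 0.8) -> bool:  # hypothetical threshold
    """True if the rewrite satisfies both illustrative constraints."""
    return (token_edit_fraction(original, modified) <= max_edit_fraction
            and cosine_similarity(original, modified) >= min_similarity)


original = "a quiet drama about a family in a small town"
poisoned = "a quiet heartwarming drama about a family in a small town"
print(token_edit_fraction(original, poisoned))
print(is_stealthy(original, poisoned))
```

Here a single inserted emotional keyword changes one token out of ten, so the rewrite passes both checks, whereas replacing the whole description would not. A defender could run the same two measurements against a trusted earlier copy of the description, though a real check would likely need a proper embedding model rather than word counts.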

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Recommender Systems and Techniques