Exploring LLM biases to manipulate AI search overview
Roman Smirnov

TL;DR
This paper investigates biases in LLM-based web search overview systems, demonstrating that these biases can be exploited through reinforcement learning to manipulate search result selections and highlighting safety concerns with such manipulations.
Contribution
It provides empirical evidence that biases in LLM overview systems can be exploited to influence search results, and analyzes safety risks associated with manipulation attacks.
Findings
LLM Overview systems exhibit biases that affect source selection.
Reinforcement learning can optimize snippets to manipulate results.
Selection is driven by comparative advantages, not absolute quality.
Abstract
Modern large language models (LLMs) are used in many business applications, including web search systems and applications that generate overviews of search results (LLM Overview systems). Such systems use an LLM to select the most relevant sources from search results and generate an answer to the user's query. Many studies have shown that LLMs exhibit various biases; in an LLM Overview application, both the source selection and answer generation stages may be affected by them (here we focus mainly on the selection stage). This research investigates the presence of biases in LLM Overview systems and how they can be exploited to manipulate LLM Overview results. We train a small language model using reinforcement learning to rewrite search snippets so that they are more likely to be preferred by an LLM Overview. Our…
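To make the selection-stage idea concrete, the following is a minimal sketch of how a reward for snippet rewriting could be computed: the share of trials in which a candidate snippet is picked from a shuffled pool of competing snippets. It is an illustration under stated assumptions, not the paper's implementation. `toy_selector` is a hypothetical stand-in for the LLM Overview selector with a deliberately planted bias toward authoritative-sounding words, and `selection_reward`, the example query, and the snippets are all invented for this sketch.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical interface for a selector: given a query and a list of snippets,
# return the indices of the snippets it would cite.
Selector = Callable[[str, Sequence[str]], List[int]]


def toy_selector(query: str, snippets: Sequence[str], k: int = 2) -> List[int]:
    """Toy biased selector (an assumption, not the paper's model).

    Scores each snippet by query-word overlap plus a small bonus for
    authoritative-sounding words, then returns the top-k indices.
    """
    query_words = set(query.lower().split())
    bias_words = {"official", "comprehensive", "expert"}  # planted bias

    def score(text: str) -> float:
        words = set(text.lower().split())
        return len(words & query_words) + 0.5 * len(words & bias_words)

    ranked = sorted(range(len(snippets)), key=lambda i: score(snippets[i]), reverse=True)
    return ranked[:k]


def selection_reward(
    selector: Selector,
    query: str,
    candidate: str,
    competitors: Sequence[str],
    trials: int = 20,
    seed: int = 0,
) -> float:
    """Fraction of shuffled trials in which `candidate` is selected."""
    rng = random.Random(seed)
    items = list(competitors) + [candidate]
    candidate_id = len(items) - 1
    hits = 0
    for _ in range(trials):
        order = list(range(len(items)))
        rng.shuffle(order)  # vary positions so ordering effects average out
        pool = [items[i] for i in order]
        chosen = selector(query, pool)
        if order.index(candidate_id) in chosen:
            hits += 1
    return hits / trials


query = "best running shoes"
competitors = [
    "Running shoes reviewed by our staff",
    "Buy shoes online today",
    "A history of footwear",
]
original = "Footwear for every budget"
rewritten = "Official expert guide to the best running shoes"

print(selection_reward(toy_selector, query, original, competitors))   # 0.0
print(selection_reward(toy_selector, query, rewritten, competitors))  # 1.0
```

Because the reward is measured against a specific pool of competitors, the same snippet can score differently when the competing snippets change, which is one way to read the finding that selection depends on comparative advantage. In a reinforcement learning setup, a signal of this kind could serve as the scalar reward for the rewriting model; the choice of algorithm and the real selector are left open here.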
