Whose Facts Win? LLM Source Preferences under Knowledge Conflicts
Jakob Schuster, Vagrant Gautam, Katja Markert

TL;DR
This paper investigates which information sources large language models favor when retrieved contexts conflict, reveals a bias towards institutional sources, and proposes a method to reduce repetition bias.
Contribution
It introduces a novel framework for studying source preferences in LLMs, shows that LLMs prefer institutionally-corroborated sources (e.g., government or newspaper sources) over people and social media, and proposes a technique to mitigate repetition bias.
Findings
LLMs prefer institutionally-corroborated sources over information from people and social media.
Repeating information from less credible sources can reverse these source preferences.
The proposed method reduces repetition bias by up to 79.2% while preserving source preferences.
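To make the findings above concrete, here is a minimal sketch of how a source preference rate and a relative reduction in bias could be computed from conflict trials. Everything in it is an assumption for illustration: the `Trial` record, the `preference_rate` and `relative_reduction` helpers, and the metric definitions are hypothetical and are not taken from the paper, which may define its measures differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Trial:
    """One hypothetical conflict trial: two sources assert different facts."""
    source_a: str  # e.g. "government"
    source_b: str  # e.g. "social_media"
    chosen: str    # the source whose claim the model's answer agreed with


def preference_rate(trials: Sequence[Trial], source: str) -> float:
    """Fraction of trials involving `source` in which the model sided with it."""
    relevant = [t for t in trials if source in (t.source_a, t.source_b)]
    if not relevant:
        raise ValueError(f"no trials involve source {source!r}")
    return sum(t.chosen == source for t in relevant) / len(relevant)


def relative_reduction(baseline: float, mitigated: float) -> float:
    """Relative drop of a bias measure, (baseline - mitigated) / baseline.

    Assumption: this is one common way to report a reduction; it is not
    confirmed to be how the paper's reported figure was derived.
    """
    if baseline == 0:
        raise ValueError("baseline bias must be non-zero")
    return (baseline - mitigated) / baseline


# Toy data (invented for illustration, not results from the paper).
trials = (
    [Trial("government", "social_media", "government")] * 3
    + [Trial("government", "social_media", "social_media")]
)
gov_rate = preference_rate(trials, "government")
reduction = relative_reduction(baseline=0.5, mitigated=0.125)
```

Under these assumed definitions, a preference rate above 0.5 means the model sides with that source type more often than not, and the reduction is reported relative to the unmitigated bias.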
Abstract
As large language models (LLMs) are more frequently used in retrieval-augmented generation pipelines, it is increasingly relevant to study their behavior under knowledge conflicts. Thus far, the role of the source of the retrieved information has gone unexamined. We address this gap with a novel framework to investigate how source preferences affect LLM resolution of inter-context knowledge conflicts in English, motivated by interdisciplinary research on credibility. By using synthetic sources, we study preferences for different types of sources without inheriting the biases of specific real-world sources. With a comprehensive, tightly-controlled evaluation of 13 open-weight LLMs, we find that LLMs prefer institutionally-corroborated information (e.g., government or newspaper sources) over information from people and social media. However, these source preferences can be reversed by…
