The Impact of Ideological Discourses in RAG: A Case Study with COVID-19 Treatments

Elmira Salari (1); Maria Claudia Nunes Delfino (2); Hazem Amamou (3); José Victor de Souza (3); Shruti Kshirsagar (1); Alan Davoust (4); and Anderson Avila (3) ((1) Wichita State University; (2) Pontifícia Universidade Católica de São Paulo; (3) Institut national de la recherche scientifique; (4) Université du Québec en Outaouais)

arXiv:2603.14838·cs.CL·March 17, 2026

Open Access

TL;DR

This study investigates how ideological texts retrieved during RAG influence LLM outputs, revealing that external ideological sources significantly shape model responses and emphasizing the need to address ideological bias and manipulation risks.

Contribution

The paper introduces a corpus linguistics framework, based on Lexical Multidimensional Analysis (LMDA), for identifying ideologies in a RAG knowledge source, and demonstrates how retrieved ideological texts affect LLM responses about COVID-19 treatments.

Findings

01. LLM responses shift toward the ideology of the retrieved external texts.

02. Enhanced prompts increase the ideological influence on responses.

03. The results highlight risks of ideological bias and manipulation in RAG.

Abstract

This paper studies the impact of retrieved ideological texts on the outputs of large language models (LLMs). While interest in understanding ideology in LLMs has recently increased, little attention has been given to this issue in the context of Retrieval-Augmented Generation (RAG). To fill this gap, we design an external knowledge source based on ideologically loaded texts about COVID-19 treatments. Our corpus comprises 1,117 academic articles representing discourses about controversial and endorsed treatments for the disease. We propose a corpus linguistics framework, based on Lexical Multidimensional Analysis (LMDA), to identify the ideologies within the corpus. LLMs are tasked with answering questions derived from three identified ideological dimensions, and two types of contextual prompts are adopted: the first comprises the user question and ideological texts; and the second contains…
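As a rough illustration of the first prompt type described in the abstract (user question plus retrieved ideological texts), the following minimal sketch assembles such a prompt. The function name `build_context_prompt`, the `Context:` / `Question:` / `Answer:` template, and the placeholder strings are all assumptions made for illustration; the paper's actual prompt wording is not given in this summary.

```python
def build_context_prompt(question: str, retrieved_texts: list[str]) -> str:
    """Combine a user question with retrieved passages into a single prompt.

    Hypothetical template: the layout below is an assumption, not the
    authors' prompt.
    """
    # Number each retrieved passage so the model can tell them apart.
    context = "\n\n".join(
        f"[Text {i}] {text.strip()}"
        for i, text in enumerate(retrieved_texts, start=1)
    )
    return (
        "Context:\n"
        f"{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Placeholder inputs only; no real corpus text or study question is used here.
example_prompt = build_context_prompt(
    "Placeholder question about a COVID-19 treatment?",
    ["Placeholder passage A.", "Placeholder passage B."],
)
print(example_prompt)
```

In this sketch the retrieved passages are placed before the question, a common RAG layout; whether the study orders them this way is not stated here.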

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Ethics and Social Impacts of AI