User-Assistant Bias in LLMs

Xu Pan; Jingxuan Fan; Zidi Xiong; Ely Hahami; Jorin Overwiening; Ziqian Xie

arXiv:2508.15815·cs.CL·April 21, 2026

User-Assistant Bias in LLMs

Xu Pan, Jingxuan Fan, Zidi Xiong, Ely Hahami, Jorin Overwiening, Ziqian Xie

PDF

TL;DR

This paper investigates how role tags in large language models can introduce biases, proposing benchmarks and methods to diagnose and control such biases across various models and datasets.

Contribution

It formalizes user-assistant bias, introduces a benchmark for evaluation, and demonstrates how fine-tuning and preference optimization can control this bias.

Findings

01

Most instruction-tuned models exhibit strong user bias.

02

Base and reasoning models are close to neutral in bias.

03

Fine-tuning methods can amplify or reduce user-assistant bias.

Abstract

Modern large language models (LLMs) are typically trained and deployed using structured role tags (e.g. system, user, assistant, tool) that explicitly mark the source of each piece of context. While these tags are essential for instruction following and controllability, asymmetries in the training data associated with different role tags can potentially introduce inductive biases. In this paper, we study this phenomenon by formalizing user-assistant bias, defined as the tendency of an LLM to preferentially rely on information from either the user or assistant role when they provide incompatible information about the same entity in the context history. We introduce a task-agnostic benchmark UserAssist and evaluate such bias in 52 frontier models. We observe that most of the instruction-tuned models exhibit strong user bias, whereas base and reasoning models are close to neutral. Using…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.