Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement
Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei Li, William, Yang Wang

TL;DR
This paper investigates how large language models exhibit self-bias in self-refinement, which can both improve and degrade performance, and proposes methods to mitigate this bias for better outcomes.
Contribution
The paper formally defines LLM self-bias, analyzes its prevalence across models and tasks, and identifies strategies like larger models and external feedback to reduce bias.
Findings
Self-bias is prevalent across all examined LLMs and tasks.
Self-refinement amplifies existing biases despite improving fluency.
Larger models and external feedback can significantly reduce self-bias.
Abstract
Recent studies show that large language models (LLMs) improve their performance through self-feedback on certain tasks while degrade on others. We discovered that such a contrary is due to LLM's bias in evaluating their own output. In this paper, we formally define LLM's self-bias - the tendency to favor its own generation - using two statistics. We analyze six LLMs (GPT-4, GPT-3.5, Gemini, LLaMA2, Mixtral and DeepSeek) on translation, constrained text generation, and mathematical reasoning tasks. We find that self-bias is prevalent in all examined LLMs across multiple languages and tasks. Our analysis reveals that while the self-refine pipeline improves the fluency and understandability of model outputs, it further amplifies self-bias. To mitigate such biases, we discover that larger model size and external feedback with accurate assessment can significantly reduce bias in the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Cosine Annealing · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Adam · Attention Dropout · Weight Decay
