Not Your Typical Sycophant: The Elusive Nature of Sycophancy in Large Language Models
Shahar Ben Natan, Oren Tsur

TL;DR
This paper introduces a bias-mitigating framework that evaluates sycophancy in large language models as a zero-sum bet, and it finds that sycophantic tendencies interact with recency bias.
Contribution
It presents a new evaluation method for sycophancy in LLMs that reduces prompt-induced bias, and it reports that some models show "moral remorse" and that sycophancy interacts with recency bias.
Findings
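All four models show sycophantic tendencies in the common setting, where sycophancy serves the user and incurs no cost on others.
Claude and Mistral exhibit "moral remorse" and over-compensate for their sycophancy when it explicitly harms a third party.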
Sycophancy interacts with recency bias: agreement with the user is amplified when the user's opinion is presented last.
Abstract
We propose a novel way to evaluate sycophancy of LLMs in a direct and neutral way, mitigating various forms of uncontrolled bias, noise, or manipulative language deliberately injected into prompts in prior works. A key novelty in our approach is the use of an LLM-as-a-judge and the evaluation of sycophancy as a zero-sum game in a bet setting. Under this framework, sycophancy serves one individual (the user) while explicitly incurring a cost on another. Comparing four leading models - Gemini 2.5 Pro, ChatGPT 4o, Mistral-Large-Instruct-2411, and Claude Sonnet 3.7 - we find that while all models exhibit sycophantic tendencies in the common setting, in which sycophancy is self-serving to the user and incurs no cost on others, Claude and Mistral exhibit "moral remorse" and over-compensate for their sycophancy when it explicitly harms a third party. Additionally, we observed that all models are biased…
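As a rough illustration of how verdicts from a bet-style, zero-sum evaluation could be summarized, the sketch below scores a set of judge decisions. It is not the authors' code: the `Verdict` record, the `summarize` function, the metric names, and the 0.5 baseline (which assumes the two sides of each bet are evenly matched and that presentation order is counterbalanced) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """One judge decision on a bet between the user and another party (hypothetical record)."""
    winner: str      # "user" or "other": the side the judge model ruled for
    user_last: bool  # True if the user's claim appeared last in the prompt


def user_win_rate(verdicts):
    """Fraction of verdicts that side with the user."""
    if not verdicts:
        raise ValueError("need at least one verdict per presentation order")
    return sum(v.winner == "user" for v in verdicts) / len(verdicts)


def summarize(verdicts):
    """Split verdicts by presentation order and report two illustrative gaps.

    user_favoring: mean user win rate across both orders, minus the 0.5
                   baseline assumed for evenly matched bets.
    order_effect:  user win rate when the user speaks last, minus the rate
                   when the user speaks first.
    """
    rate_last = user_win_rate([v for v in verdicts if v.user_last])
    rate_first = user_win_rate([v for v in verdicts if not v.user_last])
    return {
        "user_favoring": (rate_last + rate_first) / 2 - 0.5,
        "order_effect": rate_last - rate_first,
    }


# Toy data, invented for illustration only (not results from the paper).
toy_verdicts = (
    [Verdict("user", True)] * 3 + [Verdict("other", True)]
    + [Verdict("user", False)] * 2 + [Verdict("other", False)] * 2
)
toy_summary = summarize(toy_verdicts)
```

Averaging the two presentation orders before subtracting the baseline keeps the user-favoring score from being inflated by order alone, and reporting the order gap separately shows whether agreement with the user grows when the user's claim comes last.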
Taxonomy
Topics: AI in Service Interactions · Topic Modeling · Artificial Intelligence in Healthcare and Education
