Peer Identity Bias in Multi-Agent LLM Evaluation: An Empirical Study Using the TRUST Democratic Discourse Analysis Pipeline
Juergen Dietrich

TL;DR
This empirical study finds that identity bias in multi-agent LLM evaluation can be measured accurately only under full-pipeline anonymization, and that heterogeneous ensembles are more robust to that bias than homogeneous ones.
Contribution
First systematic measurement of identity-dependent scoring bias across all exposure channels in TRUST, demonstrating the necessity of full-pipeline anonymization for valid bias assessment.
Findings
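Single-channel anonymization produces near-zero measured bias effects, because individual channels act in opposite directions and cancel each other out.
Homogeneous ensembles amplify identity-driven sycophancy when model identity is fully visible.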
Heterogeneous ensembles are more robust and achieve higher consensus.
Abstract
The TRUST democratic discourse analysis pipeline exposes its large language model (LLM) components to peer model identity through multiple structural channels -- a design feature whose bias implications have not previously been empirically tested. We provide the first systematic measurement of identity-dependent scoring bias across all active identity exposure channels in TRUST, crossing four model families with two anonymization scopes across 30 political statements. The central finding is that single-channel anonymization produces near-zero bias effects, because individual channels act in opposite directions and cancel each other out -- a result that would lead an evaluator to conclude that identity bias is absent when it is not. Only full-pipeline anonymization reveals the true pattern: homogeneous ensembles amplify identity-driven sycophancy when model identity is fully visible,…
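The crossed design described in the abstract (four model families, two anonymization scopes, 30 statements) can be sketched as a small grid, with a paired score difference as one plausible way to summarize an identity effect. This is a minimal illustration, not the paper's method: the family labels, the scope names, and the `mean_score_shift` helper are assumptions introduced here, and the abstract does not state which statistic the study uses.

```python
from itertools import product
from statistics import mean

# Hypothetical placeholders: the abstract does not name the four model families.
MODEL_FAMILIES = ["family_a", "family_b", "family_c", "family_d"]
# Assumed labels for the two anonymization scopes mentioned in the abstract.
SCOPES = ["single_channel", "full_pipeline"]
N_STATEMENTS = 30


def design_cells():
    """Enumerate every (family, scope, statement index) cell of the crossed design."""
    return list(product(MODEL_FAMILIES, SCOPES, range(N_STATEMENTS)))


def mean_score_shift(visible, anonymized):
    """Mean paired difference between scores given with peer identity visible
    and scores given for the same statements with identity anonymized.

    Assumed summary statistic for illustration only.
    """
    if len(visible) != len(anonymized):
        raise ValueError("score lists must cover the same statements")
    return mean(v - a for v, a in zip(visible, anonymized))


if __name__ == "__main__":
    print(len(design_cells()))  # 4 families x 2 scopes x 30 statements
    # Toy numbers, not data from the study.
    print(mean_score_shift([3.0, 4.0, 5.0], [2.0, 4.0, 4.0]))
```

Under this framing, a shift near zero for the single-channel scope would not by itself show that identity bias is absent, which is the caution the abstract raises.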
