Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge

Zhuo Liu; Moxin Li; Xun Deng; Qifan Wang; Fuli Feng

arXiv:2505.19176·cs.CL·September 19, 2025

TL;DR

This paper introduces AGDe-Judge, a three-stage framework that uses an assistant model to mitigate teacher preference bias in LLM-based evaluation, improving fairness without sacrificing performance.

Contribution

We propose AGDe-Judge, a three-stage framework that uses an assistant model, one that is not biased toward the teacher model's responses, to reduce teacher preference bias in proxy judge models.

Findings

01. AGDe-Judge significantly reduces teacher preference bias.

02. It maintains strong evaluation performance across six benchmarks.

03. The results demonstrate the effectiveness of assistant models for debiasing.

Abstract

LLM-as-a-Judge employs large language models (LLMs), such as GPT-4, to evaluate the quality of LLM-generated responses, and has gained popularity for its cost-effectiveness and strong alignment with human evaluations. However, training proxy judge models on evaluation data generated by powerful teacher models introduces a critical yet previously overlooked issue: teacher preference bias, where the proxy judge model learns a biased preference for responses from the teacher model. To tackle this problem, we propose a novel setting that incorporates an additional assistant model, which is not biased toward the teacher model's responses, to complement the training data. Building on this setup, we introduce AGDe-Judge, a three-stage framework designed to debias both the labels and the feedback in the training data. Extensive experiments demonstrate that AGDe-Judge effectively reduces teacher…
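To make the notion of teacher preference bias concrete, here is a minimal sketch of one way such a preference could be quantified on pairwise comparisons. It is not the paper's metric: the `PairwiseCase` record, its field names, and the `teacher_preference_rate` function are all hypothetical, and the sketch assumes each comparison comes with a reference (for example, human) preference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairwiseCase:
    """One pairwise comparison. All field names are hypothetical."""
    teacher_side: str   # "A" or "B": which response the teacher model wrote
    gold_winner: str    # "A" or "B": the reference (e.g. human) preference
    judge_winner: str   # "A" or "B": the proxy judge's verdict


def teacher_preference_rate(cases):
    """Share of cases in which the reference prefers the non-teacher
    response but the judge still picks the teacher's response.

    Returns 0.0 when no case has the reference favoring the non-teacher side.
    """
    # Only cases where the teacher's response is NOT the reference winner
    # can reveal a preference for the teacher that contradicts the reference.
    eligible = [c for c in cases if c.gold_winner != c.teacher_side]
    if not eligible:
        return 0.0
    sided_with_teacher = sum(
        1 for c in eligible if c.judge_winner == c.teacher_side
    )
    return sided_with_teacher / len(eligible)


# Toy data, invented purely for illustration.
cases = [
    PairwiseCase(teacher_side="A", gold_winner="B", judge_winner="A"),
    PairwiseCase(teacher_side="B", gold_winner="A", judge_winner="A"),
    PairwiseCase(teacher_side="A", gold_winner="A", judge_winner="A"),
    PairwiseCase(teacher_side="B", gold_winner="A", judge_winner="B"),
]
rate = teacher_preference_rate(cases)
```

In this toy data, three cases have the reference favoring the non-teacher response, and the judge sides with the teacher in two of them. A judge without such a preference would be expected to score lower on a measure of this kind, all else being equal.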

Code & Models

Repositories

liuz233/agde-judge (Official)

Videos

Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge · Underline

Taxonomy

Topics: Legal Education and Practice Innovations · Dispute Resolution and Class Actions · Artificial Intelligence in Law

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Label Smoothing · Multi-Head Attention · Layer Normalization · Byte Pair Encoding