FairJudge: MLLM Judging for Social Attributes and Prompt Image Alignment

Zahraa Al Sahili; Maryam Fetanat; Maimuna Nowaz; Ioannis Patras; Matthew Purver

arXiv:2510.22827·cs.CV·November 20, 2025

TL;DR

FairJudge is an explainable evaluation protocol that uses instruction-following multimodal LLMs to judge social attributes and prompt-image alignment in text-to-image outputs, with the aim of making evaluation more accountable and reproducible.

Contribution

The paper presents a lightweight, explanation-oriented judging framework for fairness evaluation of text-to-image systems: judgments must be grounded in visible content, and the judge must abstain when cues are insufficient.

Findings

01. Outperforms contrastive and face-centric baselines in demographic prediction.

02. Improves mean alignment scores across multiple datasets.

03. Maintains high profession accuracy while assessing social attributes.

Abstract

Text-to-image (T2I) systems lack simple, reproducible ways to evaluate how well images match prompts and how models treat social attributes. Common proxies (face classifiers and contrastive similarity) reward surface cues, lack calibrated abstention, and miss attributes that are only weakly visible (for example, religion, culture, disability). We present FairJudge, a lightweight protocol that treats instruction-following multimodal LLMs as fair judges. It scores alignment with an explanation-oriented rubric mapped to [-1, 1]; constrains judgments to a closed label set; requires evidence grounded in the visible content; and mandates abstention when cues are insufficient. Unlike CLIP-only pipelines, FairJudge yields accountable, evidence-aware decisions; unlike mitigation that alters generators, it targets evaluation fairness. We evaluate gender, race, and age on FairFace, PaTA, and FairCoT;…
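The protocol elements named in the abstract (a rubric mapped to [-1, 1], a closed label set, required evidence, and mandatory abstention) can be illustrated with a minimal post-processing sketch. This is not the authors' implementation. The 1-to-5 rubric scale, the label sets, the `unclear` abstention token, and the names `rubric_to_unit_interval`, `validate`, and `Judgment` are all assumptions made for illustration; the excerpt above does not specify them.

```python
from dataclasses import dataclass

# Hypothetical closed label sets; the paper's actual labels are not given in this excerpt.
LABELS = {
    "gender": {"female", "male"},
    "age": {"young", "middle-aged", "older"},
}
ABSTAIN = "unclear"  # assumed abstention token


@dataclass
class Judgment:
    attribute: str
    label: str
    evidence: str


def rubric_to_unit_interval(score: int, low: int = 1, high: int = 5) -> float:
    """Linearly map an integer rubric score in [low, high] to [-1, 1].

    The 1-to-5 default is an assumption, not the paper's rubric.
    """
    if not low <= score <= high:
        raise ValueError(f"score {score} outside rubric range [{low}, {high}]")
    return 2.0 * (score - low) / (high - low) - 1.0


def validate(attribute: str, raw_label: str, evidence: str) -> Judgment:
    """Accept a judge's label only if it is in the closed set and comes with evidence.

    Anything else (an out-of-set label or an empty evidence string) becomes an abstention.
    """
    label = raw_label.strip().lower()
    evidence = evidence.strip()
    if label not in LABELS[attribute] or not evidence:
        return Judgment(attribute, ABSTAIN, evidence)
    return Judgment(attribute, label, evidence)


if __name__ == "__main__":
    print(rubric_to_unit_interval(4))
    print(validate("age", "Young", "school uniform and backpack visible"))
    print(validate("age", "teenager", "school uniform and backpack visible"))
```

Treating an out-of-set label or missing evidence as an abstention is one possible way to enforce the closed-set and grounding constraints after the judge has responded; the paper may enforce them differently, for example in the prompt itself.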
