Metric-Fair Prompting: Treating Similar Samples Similarly
Jing Wang, Jie Shen, Xing Niu, Tong Zhang, Jeremy Weiss

TL;DR
This paper proposes Metric-Fair Prompting, a fairness-aware method for guiding large language models to treat similar instances consistently, improving accuracy in clinical multiple-choice question answering.
Contribution
It introduces a novel prompting framework that enforces metric-fairness constraints, ensuring similar inputs receive similar outputs, and demonstrates its effectiveness on medical question answering tasks.
Findings
Improves LLM accuracy on the MedQA benchmark
Enforces fairness through Lipschitz-style constraints
Enhances decision consistency for similar samples
Abstract
We introduce Metric-Fair Prompting, a fairness-aware prompting framework that guides large language models (LLMs) to make decisions under metric-fairness constraints. In the application of multiple-choice medical question answering, each (question, option) pair is treated as a binary instance labeled correct or incorrect. To promote individual fairness (treating similar instances similarly), we compute question similarity using NLP embeddings and solve items in joint pairs of similar questions rather than in isolation. The prompt enforces a global decision protocol: extract decisive clinical features, map each (question, option) pair to a score that acts as confidence, and impose a Lipschitz-style constraint so that similar inputs receive similar scores and, hence, consistent outputs. Evaluated on the MedQA (US) benchmark,…
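The pairing and constraint described in the abstract can be illustrated with a small sketch. It is not the authors' procedure: in the paper the constraint is expressed through the prompt itself, whereas the code below only shows what a Lipschitz-style condition |f(x) - f(x')| <= L * d(x, x') means numerically. The function names, the greedy pairing strategy, the use of cosine distance, the constant L = 1.0, and the toy embeddings are all assumptions made for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pair_by_similarity(embeddings):
    """Greedily pair each unpaired item with its nearest unpaired neighbour.

    Hypothetical helper: the abstract only says that similar questions are
    solved in joint pairs, not how the pairs are chosen.
    """
    unpaired = list(range(len(embeddings)))
    pairs = []
    while len(unpaired) >= 2:
        i = unpaired.pop(0)
        j = min(unpaired, key=lambda k: cosine_distance(embeddings[i], embeddings[k]))
        unpaired.remove(j)
        pairs.append((i, j))
    return pairs


def satisfies_lipschitz(score_a, score_b, distance, lipschitz_const=1.0):
    """Check the Lipschitz-style condition |f(x) - f(x')| <= L * d(x, x')."""
    return abs(score_a - score_b) <= lipschitz_const * distance


# Toy 2-D "question embeddings" (made up for this example).
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
toy_pairs = pair_by_similarity(toy_embeddings)
```

With these toy vectors, items 0 and 2 point in nearly the same direction, as do items 1 and 3, so each is paired with its near-duplicate. A check such as `satisfies_lipschitz(0.80, 0.79, 0.05)` then asks whether two confidence scores differ by no more than the distance between their inputs allows.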
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Multimodal Machine Learning Applications
