Optimizing Class-Level Probability Reweighting Coefficients for Equitable Prompting Accuracy
Ruixi Lin, Yang You

TL;DR
This paper introduces a post-hoc probability reweighting method for large language models to improve class fairness and accuracy by directly optimizing for fairness metrics without altering the model's internal weights.
Contribution
It proposes a novel discrete optimization approach with a new fairness metric, COBias, and an efficient metaheuristic solution for equitable LLM inference.
Findings
61% reduction in COBias disparity
18% increase in overall accuracy
Robust generalization across prompt configurations
Abstract
Even as we engineer LLMs for alignment and safety, they often uncover biases from pre-training data's statistical regularities (from disproportionate co-occurrences to stereotypical associations mirroring human cognitive biases). This leads to persistent, uneven class accuracy in classification and QA. Such per-class accuracy disparities are not inherently resolved by architectural/training evolutions or data scaling, making post-hoc correction essential for equitable performance. To mitigate LLM class accuracy imbalance, we develop a post-hoc probability reweighting method that directly optimizes for non-differentiable performance-driven and fairness-aligned metrics, through a novel COBias metric that highlights disparities in class accuracies. This post-hoc bias mitigation method is grounded in discrete optimization with nonlinear integer programming (NIP) objectives and an efficient…
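The general idea of class-level reweighting described above can be illustrated with a minimal sketch. This is not the authors' implementation, and several pieces are assumptions: the disparity function (mean absolute pairwise difference of class accuracies) is one plausible reading of a COBias-style metric, the combined objective and its `beta` trade-off are hypothetical, the weight grid {0.1, ..., 1.0} is arbitrary, and a plain random hill climb stands in for whatever metaheuristic the paper uses. The names `make_toy`, `evaluate`, `pairwise_disparity`, `objective`, and `search_weights` are all invented for illustration.

```python
import random

def make_toy(rng, n=300, n_classes=3):
    """Synthetic 'model outputs' with a built-in bias toward class 0 (illustrative only)."""
    probs, labels = [], []
    for _ in range(n):
        y = rng.randrange(n_classes)
        raw = [rng.random() for _ in range(n_classes)]
        raw[y] += 0.5   # signal for the true class
        raw[0] += 0.6   # systematic over-prediction of class 0
        s = sum(raw)
        probs.append([r / s for r in raw])
        labels.append(y)
    return probs, labels

def evaluate(probs, labels, weights):
    """Scale each class probability by its weight, predict by argmax,
    and return (per-class accuracies, overall accuracy)."""
    k = len(weights)
    correct, total = [0] * k, [0] * k
    for p, y in zip(probs, labels):
        scores = [w * q for w, q in zip(weights, p)]
        pred = max(range(k), key=scores.__getitem__)
        total[y] += 1
        correct[y] += int(pred == y)
    accs = [c / t if t else 0.0 for c, t in zip(correct, total)]
    return accs, sum(correct) / len(labels)

def pairwise_disparity(accs):
    """Assumed COBias-style measure: mean |acc_i - acc_j| over all class pairs."""
    pairs = [(i, j) for i in range(len(accs)) for j in range(i + 1, len(accs))]
    if not pairs:
        return 0.0
    return sum(abs(accs[i] - accs[j]) for i, j in pairs) / len(pairs)

def objective(probs, labels, weights, beta=1.0):
    """Hypothetical combined objective: error rate plus beta times disparity."""
    accs, overall = evaluate(probs, labels, weights)
    return (1.0 - overall) + beta * pairwise_disparity(accs)

def search_weights(probs, labels, n_classes, steps=200, grid=10, beta=1.0, seed=0):
    """Random hill climb over integer grid indices (a stand-in search, not the paper's)."""
    rng = random.Random(seed)
    idx = [grid] * n_classes  # start from weight 1.0 for every class
    best = objective(probs, labels, [i / grid for i in idx], beta)
    for _ in range(steps):
        cand = list(idx)
        cand[rng.randrange(n_classes)] = rng.randint(1, grid)
        val = objective(probs, labels, [i / grid for i in cand], beta)
        if val <= best:  # accept only moves that do not worsen the objective
            idx, best = cand, val
    return [i / grid for i in idx], best

toy_probs, toy_labels = make_toy(random.Random(0))
baseline = objective(toy_probs, toy_labels, [1.0, 1.0, 1.0])
weights, tuned = search_weights(toy_probs, toy_labels, n_classes=3)
print("baseline objective:", baseline)
print("tuned weights:", weights, "objective:", tuned)
```

Because the objective depends on argmax predictions, it is piecewise constant in the weights and has no useful gradient, which is why the sketch searches a discrete grid rather than using gradient descent. The model itself is never modified: only its output probabilities are rescaled.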
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: Balanced Selection
