An Empirical Survey of Model Merging Algorithms for Social Bias Mitigation

Daiki Shirafuji; Tatsuhiko Saito; Yasutomo Kimura

arXiv:2512.02689·cs.CL·December 3, 2025

Open Access

TL;DR

This paper empirically compares seven model merging algorithms for social bias mitigation in large language models, revealing trade-offs between bias reduction and task performance, and identifying the most balanced methods.

Contribution

It provides the first comprehensive empirical evaluation of multiple model merging algorithms for bias mitigation across diverse LLMs and datasets.

Findings

01 SLERP at moderate weights offers the best bias-performance balance

02 Bias mitigation often reduces accuracy on reading and reasoning tasks

03 Linear, SLERP, and Nearswap are consistently effective in bias reduction

Abstract

Large language models (LLMs) are known to inherit and even amplify societal biases present in their pre-training corpora, threatening fairness and social trust. To address this issue, recent work has explored "editing" LLM parameters to mitigate social bias with model merging approaches; however, no empirical comparison of these approaches exists. In this work, we empirically survey seven algorithms: Linear, Karcher Mean, SLERP, NuSLERP, TIES, DELLA, and Nearswap, applying them to 13 open-weight models in the GPT, LLaMA, and Qwen families. We perform a comprehensive evaluation using three bias datasets (BBQ, BOLD, and HONEST) and measure the impact of these techniques on LLM performance in downstream tasks of the SuperGLUE benchmark. We find a trade-off between bias reduction and downstream performance: methods achieving greater bias mitigation degrade accuracy, particularly on tasks requiring reading…
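
To make two of the surveyed algorithms concrete, the following is a minimal sketch of Linear and SLERP merging applied to a single pair of weight tensors. It is an illustration under stated assumptions, not the authors' implementation: the function names `linear_merge` and `slerp_merge`, the interpolation weight `t`, and the toy vectors are all hypothetical, and the paper's actual tooling and hyperparameters are not given in this excerpt.

```python
import numpy as np


def linear_merge(w_a: np.ndarray, w_b: np.ndarray, t: float) -> np.ndarray:
    """Linear merge: element-wise weighted average of two weight tensors.

    t = 0 returns w_a, t = 1 returns w_b (hypothetical convention).
    """
    return (1.0 - t) * w_a + t * w_b


def slerp_merge(w_a: np.ndarray, w_b: np.ndarray, t: float,
                eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation (SLERP) between two weight tensors.

    The tensors are flattened and treated as vectors; the angle between
    them is measured on the unit sphere, and the interpolation follows
    the arc between them instead of the straight chord.
    """
    a = w_a.ravel()
    b = w_b.ravel()

    # Angle between the two directions, computed from unit vectors.
    a_unit = a / (np.linalg.norm(a) + eps)
    b_unit = b / (np.linalg.norm(b) + eps)
    cos_omega = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    sin_omega = np.sin(omega)

    # Nearly parallel vectors: the SLERP formula is numerically unstable,
    # so fall back to plain linear interpolation.
    if sin_omega < eps:
        return linear_merge(w_a, w_b, t)

    coeff_a = np.sin((1.0 - t) * omega) / sin_omega
    coeff_b = np.sin(t * omega) / sin_omega
    return (coeff_a * a + coeff_b * b).reshape(w_a.shape)


# Toy example with two orthogonal unit vectors (assumed data, not from the paper).
w_a = np.array([1.0, 0.0])
w_b = np.array([0.0, 1.0])
merged_linear = linear_merge(w_a, w_b, 0.5)
merged_slerp = slerp_merge(w_a, w_b, 0.5)
```

In this toy case the linear midpoint has a smaller norm than either input, while the SLERP midpoint stays on the unit circle. In a real merge the same operation would be applied tensor by tensor across two checkpoints with identical architectures.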

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)