Can Large Language Models Make Everyone Happy?

Usman Naseem; Gautam Siddharth Kashyap; Ebad Shabbir; Sushant Kumar Ray; Abdullah Mohammad; Rafiq Ali

arXiv:2602.11091 · cs.CL · February 12, 2026

Open Access

TL;DR

This paper introduces MisAlign-Profile, a comprehensive benchmark to measure and analyze the complex trade-offs in aligning large language models across safety, value, and cultural dimensions, addressing limitations of existing isolated benchmarks.

Contribution

It presents MISALIGNTRADE, a novel dataset and benchmark for systematically evaluating cross-dimensional misalignment trade-offs in LLMs, incorporating diverse domains and semantic types.

Findings

01. LLMs show 12%-34% misalignment trade-offs across dimensions.

02. Existing benchmarks lack cross-dimensional analysis.

03. MisAlign-Profile enables systematic evaluation of alignment trade-offs.

Abstract

Misalignment in Large Language Models (LLMs) refers to the failure to simultaneously satisfy safety, value, and cultural dimensions, leading to behaviors that diverge from human expectations in real-world settings where these dimensions must co-occur. Existing benchmarks, such as SAFETUNEBED (safety-centric), VALUEBENCH (value-centric), and WORLDVIEW-BENCH (culture-centric), primarily evaluate these dimensions in isolation and therefore provide limited insight into their interactions and trade-offs. More recent efforts, including MIB and INTERPRETABILITY BENCHMARK, which are based on mechanistic interpretability, offer valuable perspectives on model failures; however, they remain insufficient for systematically characterizing cross-dimensional trade-offs. To address these gaps, we introduce MisAlign-Profile, a unified benchmark for measuring misalignment trade-offs inspired by mechanistic…

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Explainable Artificial Intelligence (XAI)