Mitigating Cross-Lingual Cultural Inconsistencies in LLMs via Consensus-Driven Preference Optimisation
Lucas Resck, Isabelle Augenstein, Anna Korhonen

TL;DR
This paper introduces a new metric and a framework to reduce cross-lingual cultural inconsistencies in multilingual large language models, especially for lower-resource languages.
Contribution
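It proposes Singleton Fleiss's κ_S, a hallucination-resilient metric for measuring cross-lingual cultural inconsistency, and C-3PO (Cross-lingual Cultural Consistent Preference Optimisation), a consensus-driven preference optimisation framework that improves cultural consistency across languages.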
Findings
C-3PO raises cultural consistency by up to 0.10 absolute points in κ_S over unaligned models.
Lower-resource languages are more affected by cultural inconsistencies.
Early-layer representations reveal implicit cultural personalization in MLLMs.
Abstract
Despite their impressive capabilities, multilingual large language models (MLLMs) frequently exhibit inconsistent behaviour when the prompt's language changes. While such adaptation is generally desirable, it becomes a critical failure when a user's identity is explicitly defined. For instance, given a fixed British persona and an ambiguous everyday knowledge query about literature, the prompt's language frequently overwrites the system persona, yielding Shakespeare in English but Cervantes in Spanish. To robustly quantify this Cross-lingual Cultural Inconsistency, we introduce Singleton Fleiss's κ_S, a metric mathematically resilient to hallucinations. For mitigation, we propose Cross-lingual Cultural Consistent Preference Optimisation (C-3PO), a consensus-driven alignment framework. C-3PO achieves up to a 0.10-point absolute increase in κ_S over unaligned models,…
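For readers unfamiliar with the agreement statistic that κ_S builds on, the sketch below computes the standard Fleiss's κ. It is not the paper's Singleton variant, whose definition is not given in this summary. As an illustrative assumption, each query is treated as an item and each prompt language as a "rater" whose label is the model's answer in that language; the function name `fleiss_kappa` and the toy answers are hypothetical.

```python
from collections import Counter
from math import nan


def fleiss_kappa(ratings):
    """Standard Fleiss's kappa (NOT the paper's Singleton variant).

    ratings: list of items; each item is a list of category labels,
    one label per rater. Every item needs the same number of raters (>= 2).
    """
    n_items = len(ratings)
    if n_items == 0:
        raise ValueError("need at least one item")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of raters")

    category_totals = Counter()
    observed = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Fraction of rater pairs that agree on this item.
        observed += sum(c * (c - 1) for c in counts.values()) / (
            n_raters * (n_raters - 1)
        )
    p_bar = observed / n_items

    # Chance agreement from the overall category proportions.
    total = n_items * n_raters
    p_e = sum((c / total) ** 2 for c in category_totals.values())
    if p_e == 1.0:
        return nan  # only one category ever used: kappa is undefined
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data (assumed): two queries, answers in three prompt languages.
answers = [
    ["Shakespeare", "Shakespeare", "Cervantes"],  # language changes the answer
    ["Dickens", "Dickens", "Dickens"],            # consistent across languages
]
print(round(fleiss_kappa(answers), 4))  # 0.4545
```

Higher values mean the answers agree across languages more often than chance would predict. How κ_S modifies this statistic to be resilient to hallucinated answers is specified in the paper, not here.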
