Reasoning Shapes Alignment: Investigating Cultural Alignment in Large Reasoning Models with Cultural Norms
Yuhang Wang, Yanxu Zhu, Jitao Sang

TL;DR
This paper introduces the CNCA framework, which mines cultural norms from limited survey data and applies them so that large reasoning models better reflect diverse human cultures.
Contribution
The paper proposes three methods for automatically mining cultural norms from limited survey data and examines two paradigms for applying them to reasoning models: in-context alignment and fine-tuning.
Findings
Models with stronger reasoning abilities benefit more from cultural norm mining.
Explicitly integrating cultural norms into the user context improves cultural alignment.
Fine-tuning with cultural norms enhances model understanding.
Abstract
The advanced reasoning capabilities of Large Reasoning Models enable them to thoroughly understand and apply safety policies through deliberate thought processes, thereby improving the models' safety. Beyond safety, these models must also be able to reflect the diverse range of human values across various cultures. This paper presents the Cultural Norm-based Cultural Alignment (CNCA) framework, which enables models to leverage their powerful reasoning ability to align with cultural norms. Specifically, we propose three methods to automatically mine cultural norms from limited survey data and explore ways to effectively utilize these norms for improving cultural alignment. Two alignment paradigms are examined: an in-context alignment method, where cultural norms are explicitly integrated into the user context, and a fine-tuning-based method, which internalizes norms through enhanced…
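The in-context paradigm described in the abstract, where cultural norms are placed explicitly in the user context, might be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `build_in_context_prompt`, the prompt wording, and the placeholder norms are hypothetical and are not taken from the paper, whose actual prompt format is not given here.

```python
from typing import Sequence


def build_in_context_prompt(culture: str, norms: Sequence[str], question: str) -> str:
    """Prepend cultural norms to a user question.

    Hypothetical prompt layout for illustration; not the paper's template.
    """
    # Render each norm as one bullet line.
    norm_lines = "\n".join(f"- {norm}" for norm in norms)
    return (
        f"Cultural norms for {culture}:\n"
        f"{norm_lines}\n\n"
        "Considering these norms, answer the following question.\n"
        f"Question: {question}"
    )


# Placeholder inputs; real norms would come from a norm-mining step.
example = build_in_context_prompt(
    "Country A",
    ["Placeholder norm 1", "Placeholder norm 2"],
    "How important is family in your life?",
)
print(example)
```

The resulting string would be sent to the reasoning model as the user message, so the model can reason over the listed norms before answering.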
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Intelligent Tutoring Systems and Adaptive Learning · Topic Modeling
