ALIGN: Word Association Learning for Cultural Alignment in Large Language Models

Chunhua Liu; Kabir Manandhar Shrestha; Sukai Huang

arXiv:2508.13426·cs.CL·December 16, 2025

Open Access

TL;DR

This paper presents a cost-effective method to improve cultural alignment in large language models by fine-tuning them on native speakers' word-association norms, leading to significant cultural value shifts.

Contribution

Introduces a cognitively grounded fine-tuning approach that uses word-association data to improve cultural alignment in large language models.

Findings

01  Significant lexical alignment improvements (16-20% for English, 43-165% for Mandarin).

02  High-level cultural value shifts observed, especially on diverging questions.

03  Models with 7-8B parameters match or outperform larger baselines.

Abstract

Large language models (LLMs) exhibit cultural bias from overrepresented viewpoints in training data, yet cultural alignment remains a challenge due to limited cultural knowledge and a lack of exploration into effective learning approaches. We introduce a cost-efficient and cognitively grounded method: fine-tuning LLMs on native speakers' word-association norms, leveraging cognitive psychology findings that such associations capture cultural knowledge. Using word association datasets from native speakers in the US (English) and China (Mandarin), we train Llama-3.1-8B and Qwen-2.5-7B via supervised fine-tuning and preference optimization. We evaluate models' cultural alignment through a two-tier evaluation framework that spans lexical associations and cultural value alignment using the World Values Survey. Results show significant improvements in lexical alignment (16-20% English, 43-165%…
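To make the training setup more concrete, here is a minimal sketch of how word-association norms might be turned into supervised fine-tuning examples and preference pairs, plus a simple top-k overlap score for lexical alignment. Everything in it is an assumption for illustration: the toy norms, the prompt wording, the `prompt`/`completion`/`chosen`/`rejected` field names, the rule of using the rarest associates as the rejected answer, and the `precision_at_k` metric. It is not the authors' pipeline.

```python
from typing import Dict, List, Tuple

# Hypothetical toy norms: cue -> (associate, number of participants who gave it).
# These values are invented for illustration; they are not from the paper's datasets.
NORMS: Dict[str, List[Tuple[str, int]]] = {
    "rice": [("food", 40), ("white", 25), ("bowl", 10), ("field", 5)],
}

# Assumed prompt wording; the paper's actual prompts are not given in this summary.
PROMPT_TEMPLATE = "List {k} words you associate with the word '{cue}'."


def ranked_associates(cue: str) -> List[str]:
    """Return the associates for a cue, most frequent first."""
    pairs = sorted(NORMS[cue], key=lambda p: p[1], reverse=True)
    return [word for word, _ in pairs]


def sft_example(cue: str, k: int = 3) -> Dict[str, str]:
    """Build one supervised example: prompt -> top-k human associates."""
    top = ranked_associates(cue)[:k]
    return {
        "prompt": PROMPT_TEMPLATE.format(k=k, cue=cue),
        "completion": ", ".join(top),
    }


def preference_pair(cue: str, k: int = 2) -> Dict[str, str]:
    """Build one preference example: frequent associates are preferred over rare ones."""
    ranked = ranked_associates(cue)
    return {
        "prompt": PROMPT_TEMPLATE.format(k=k, cue=cue),
        "chosen": ", ".join(ranked[:k]),
        "rejected": ", ".join(ranked[-k:]),
    }


def precision_at_k(predicted: List[str], cue: str, k: int = 3) -> float:
    """Fraction of the model's first k associates found among the human top-k."""
    human = set(ranked_associates(cue)[:k])
    return sum(1 for word in predicted[:k] if word in human) / k
```

In this sketch, `sft_example("rice")` pairs the prompt with the three most frequent human associates, while `preference_pair("rice")` contrasts the two most frequent associates with the two least frequent. A model's generated associates could then be scored against the same norms with `precision_at_k`.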

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling