CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering

Yumeng Wang; Zhiyuan Fan; Qingyun Wang; May Fung; Heng Ji

arXiv:2501.18457·cs.CL·February 11, 2025

TL;DR

This paper introduces CALM, a method that improves cross-lingual consistency in language models by aligning knowledge across languages using self-selected responses and preference optimization, enhancing multilingual question answering.

Contribution

CALM is a novel approach that leverages self-alignment and preference optimization to improve cross-lingual knowledge consistency in language models.
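For readers unfamiliar with the preference-optimization step, the standard DPO loss for a single (chosen, rejected) pair can be sketched as below. This is a minimal illustration of the published DPO objective, not the authors' training code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given sequence log-probabilities.

    All names and the default beta are hypothetical choices for illustration.
    """
    # Implicit reward of each response: how much more likely the policy makes
    # it compared with the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # computed in a numerically stable way for either sign of the margin.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss equals log 2; the loss falls as the policy favors the chosen response more strongly than the reference does.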

Findings

01. CALM improves accuracy on multilingual QA tasks.

02. Increasing the number of languages used in training improves the model's cross-lingual consistency.

03. CALM outperforms baseline methods in zero-shot settings.

Abstract

Large Language Models (LLMs) are pretrained on extensive multilingual corpora to acquire both language-specific cultural knowledge and general knowledge. Ideally, while LLMs should provide consistent responses to culture-independent questions across languages, we observe significant performance disparities. To address this, we explore the Cross-Lingual Self-Aligning ability of Language Models (CALM) to align knowledge across languages. Specifically, for a given question, we sample multiple responses across different languages and select the most self-consistent response as the target, leaving the remaining responses as negative examples. We then employ direct preference optimization (DPO) to align the model's knowledge across different languages. Evaluations on the MEDQA and X-CSQA datasets demonstrate CALM's effectiveness in enhancing cross-lingual knowledge question answering, both in…
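The selection step described in the abstract can be sketched as a majority vote over answers sampled in different languages. This is a minimal sketch under several assumptions that the abstract does not confirm: each answer is a multiple-choice option label, there is one sampled answer per language, ties are skipped, and every language's prompt is paired with each rejected answer. The function name `build_preference_pairs` and the dictionary layout are hypothetical.

```python
from collections import Counter


def build_preference_pairs(question_by_lang, answer_by_lang):
    """Build DPO-style preference pairs from cross-lingual answers.

    question_by_lang: {language code: the question phrased in that language}
    answer_by_lang:   {language code: the option label the model answered}
    Both layouts are assumptions made for this illustration.
    """
    votes = Counter(answer_by_lang.values()).most_common()

    # Assumption: with no unique majority answer, the question is skipped.
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return []

    # The most self-consistent answer becomes the target (chosen) response;
    # every other sampled answer is treated as a negative (rejected) example.
    target = votes[0][0]
    rejected = sorted(set(answer_by_lang.values()) - {target})

    pairs = []
    for lang, prompt in question_by_lang.items():
        for wrong in rejected:
            pairs.append({
                "lang": lang,
                "prompt": prompt,
                "chosen": target,
                "rejected": wrong,
            })
    return pairs
```

If all languages already agree, the rejected set is empty and no pairs are produced, so only questions with cross-lingual disagreement contribute training signal in this sketch.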

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: ALIGN