CLM-Bench: Benchmarking and Analyzing Cross-lingual Misalignment of LLMs in Knowledge Editing

Yucheng Hu; Wei Zhou; Juesi Xiao

arXiv:2601.17397 · cs.CL · January 27, 2026


TL;DR

This paper introduces CLM-Bench, a culturally native Chinese-centric benchmark for evaluating multilingual knowledge editing in LLMs, revealing significant cross-lingual misalignment and limitations of current editing methods.

Contribution

We propose CLM-Bench, a culturally aware benchmark built with native Chinese data, and analyze cross-lingual misalignment in LLM knowledge editing through geometric layer-wise representation analysis.

Findings

01. Significant cross-lingual misalignment in knowledge edits.

02. Layer-wise representations for different languages are nearly orthogonal.

03. Mixed-lingual editing shows linear additivity of edit vectors.
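
The second and third findings are geometric claims, so a small numerical sketch may help make them concrete. The summary above does not describe the paper's actual measurement procedure, so everything below is an assumption for illustration: the vectors are synthetic stand-ins for per-layer edit vectors, and the names `cosine`, `additivity_error`, `delta_zh`, `delta_en`, and `delta_mixed` are hypothetical. Near-orthogonality is checked here with cosine similarity, and linear additivity with the relative residual between a mixed-lingual edit vector and the sum of the two monolingual ones.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(u, u))


def cosine(u, v):
    """Cosine similarity; values near 0 indicate near-orthogonal vectors."""
    return dot(u, v) / (norm(u) * norm(v))


def additivity_error(mixed, a, b):
    """Relative residual of approximating `mixed` by `a + b`.

    A value near 0 means the mixed edit behaves like the sum of the two edits.
    """
    residual = [m - (x + y) for m, x, y in zip(mixed, a, b)]
    return norm(residual) / norm(mixed)


# Synthetic stand-ins (assumption): in a real analysis these would be the
# representation shifts induced by a Chinese, an English, and a mixed edit.
rng = random.Random(0)
dim = 1024
delta_zh = [rng.gauss(0.0, 1.0) for _ in range(dim)]
delta_en = [rng.gauss(0.0, 1.0) for _ in range(dim)]
# Built to be almost exactly additive, plus a small amount of noise.
delta_mixed = [z + e + rng.gauss(0.0, 0.01) for z, e in zip(delta_zh, delta_en)]

cos_zh_en = cosine(delta_zh, delta_en)
add_err = additivity_error(delta_mixed, delta_zh, delta_en)
print(f"cosine(zh, en) = {cos_zh_en:.4f}")
print(f"relative additivity error = {add_err:.4f}")
```

Independent random vectors in high dimensions are nearly orthogonal by chance, so a measurement of this kind on real edit vectors would need a baseline (for example, two edits made in the same language) before a low cosine similarity can be read as language-specific separation.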

Abstract

Knowledge Editing (KE) has emerged as a promising paradigm for updating facts in Large Language Models (LLMs) without retraining. However, progress in Multilingual Knowledge Editing (MKE) is currently hindered by biased evaluation frameworks. We observe that existing MKE benchmarks are typically constructed by mechanically translating English-centric datasets into target languages (e.g., English-to-Chinese). This approach introduces translation artifacts and neglects culturally specific entities native to the target language, failing to reflect the true knowledge distribution of LLMs. To address this, we propose CLM-Bench, a culture-aware benchmark constructed using a native Chinese-first methodology. We curate 1,010 high-quality CounterFact pairs rooted in Chinese cultural contexts and align them with English counterparts. Using CLM-Bench, we conduct extensive experiments on…


Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Natural Language Processing Techniques