Language Specific Knowledge: Do Models Know Better in X than in English?

Ishika Agarwal; Nimet Beyza Bozdag; Nisval Patel; Dilek Hakkani-Tür

arXiv:2505.14990·cs.CL·April 27, 2026

TL;DR

This paper shows that choosing the right language for a query can significantly improve multilingual language models' question-answering performance, and that the way knowledge is distributed across languages is often non-intuitive.

Contribution

The paper introduces the concept of Language Specific Knowledge (LSK), formulates the problem of language selection for improved QA, and proposes baselines, including the LSKExtractor method.

Findings

01. Language selection can significantly improve model performance.

02. Models know some information better in languages other than English.

03. The effect of language choice on performance varies across datasets and models.

Abstract

Multilingual language models are often trained with the objective of mapping semantically similar content (in different languages) to the same latent space. In this paper, we show a nuance in this training objective and find that, by changing the language of the input query, we can improve the question-answering ability of language models. We make two main contributions. First, we introduce the term Language Specific Knowledge (LSK) to denote queries that are best answered in an "expert language" for a given LLM, thereby enhancing its question-answering ability. We introduce the problem of language selection: for some queries, language models can perform better when queried in languages other than English, sometimes even better in low-resource languages, and the goal is to select the optimal language for the query. Second, we introduce a variety of simple to strong baselines to…
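The language selection problem described in the abstract can be illustrated with a minimal sketch. This is not the paper's LSKExtractor method, which the truncated abstract does not describe. It is a naive selector that assumes held-out queries carry a topic label and a per-language correctness flag, and it picks the language with the highest held-out accuracy for each topic. The function names, the topic labels, and the toy records are all hypothetical and are not results from the paper.

```python
from collections import defaultdict


def best_language_per_topic(records):
    """Pick, per topic, the language with the highest held-out accuracy.

    `records` is an iterable of (topic, language, is_correct) tuples.
    This record format is an assumption made for illustration only.
    """
    # tally[topic][language] = [number correct, number seen]
    tally = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for topic, language, is_correct in records:
        counts = tally[topic][language]
        counts[0] += int(is_correct)
        counts[1] += 1

    best = {}
    for topic, by_lang in tally.items():
        # Highest accuracy wins; ties are broken alphabetically so the
        # result is deterministic.
        best[topic] = min(
            by_lang,
            key=lambda lang: (-by_lang[lang][0] / by_lang[lang][1], lang),
        )
    return best


def select_language(topic, best, default="en"):
    """Return the chosen query language for a topic, falling back to a default."""
    return best.get(topic, default)


# Invented toy records (not data from the paper).
records = [
    ("cuisine", "en", False),
    ("cuisine", "en", True),
    ("cuisine", "hi", True),
    ("cuisine", "hi", True),
    ("law", "en", True),
    ("law", "tr", False),
]
best = best_language_per_topic(records)
```

At query time, a caller would translate the query into `select_language(topic, best)` before sending it to the model. Topics that never appeared in the held-out set fall back to the default language.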

Code & Models

Repositories

agarwalishika/LSKExtractor (GitHub)
