Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models
Nora Kassner, Philipp Dufter, Hinrich Schütze

TL;DR
This paper explores the multilingual capabilities of mBERT as a knowledge base across 53 languages, revealing language biases and the benefits of pooling predictions to improve accuracy.
Contribution
It extends knowledge base probing to multiple languages, analyzes language dependence and bias in mBERT, and demonstrates that pooling predictions enhances multilingual knowledge retrieval.
Findings
Performance varies significantly across languages.
Pooling predictions across languages improves overall accuracy.
mBERT exhibits language bias: the language of the query skews its predictions, for example toward countries where that language is spoken.
Abstract
Recently, it has been found that monolingual English language models can be used as knowledge bases. Instead of structural knowledge base queries, masked sentences such as "Paris is the capital of [MASK]" are used as probes. We translate the established benchmarks TREx and GoogleRE into 53 languages. Working with mBERT, we investigate three questions. (i) Can mBERT be used as a multilingual knowledge base? Most prior work only considers English. Extending research to multiple languages is important for diversity and accessibility. (ii) Is mBERT's performance as knowledge base language-independent or does it vary from language to language? (iii) A multilingual model is trained on more text, e.g., mBERT is trained on 104 Wikipedias. Can mBERT leverage this for better performance? We find that using mBERT as a knowledge base yields varying performance across languages and pooling…
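The probing and pooling ideas described in the abstract can be illustrated with a minimal sketch. It assumes that each language has a cloze template with a subject slot, that a model has already returned one predicted entity per language, and that predictions are pooled by majority vote. The templates, the function names `build_probe` and `pool_by_majority`, the toy predictions, and the majority-vote rule are all illustrative assumptions, not the authors' code or necessarily their exact pooling method.

```python
from collections import Counter

# Hypothetical cloze templates: [X] is the subject slot, [MASK] is what the model fills in.
TEMPLATES = {
    "en": "[X] is the capital of [MASK]",
    "de": "[X] ist die Hauptstadt von [MASK]",
    "fr": "[X] est la capitale de [MASK]",
}


def build_probe(template: str, subject: str) -> str:
    """Insert the subject into a template, leaving [MASK] for the model."""
    return template.replace("[X]", subject)


def pool_by_majority(predictions: dict[str, str]) -> str:
    """Return the entity predicted by the most languages.

    `predictions` maps a language code to the predicted entity, assumed to be
    normalized to a shared, language-independent label. Ties go to the entity
    that was encountered first.
    """
    counts = Counter(predictions.values())
    return counts.most_common(1)[0][0]


# Toy, made-up predictions (not results from the paper).
toy_predictions = {"en": "France", "de": "France", "it": "Italy"}

probe = build_probe(TEMPLATES["en"], "Paris")
pooled = pool_by_majority(toy_predictions)
```

In this toy case a single language that disagrees is outvoted by the others, which is one intuition for why pooling can help when individual languages are biased.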
Taxonomy
Methods: mBERT
