The Roles of English in Evaluating Multilingual Language Models

Wessel Poelman; Miryam de Lhoneux

arXiv:2412.08392 · cs.CL · December 12, 2024

TL;DR

This position paper distinguishes two roles of English in evaluating multilingual language models, as an interface and as a natural language, and argues for moving away from English as an interface toward furthering language understanding.

Contribution

It clarifies the distinct roles of English in multilingual evaluation and advocates for a focus on language understanding over task performance.

Findings

01

Numerous works explicitly use English as an interface to boost task performance.

02

Current evaluation setups often conflate the two roles of English, interface and natural language, although their goals differ: task performance versus language understanding.

03

The authors recommend prioritizing language understanding over task performance in evaluations.

Abstract

Multilingual natural language processing is getting increased attention, with numerous models, benchmarks, and methods being released for many languages. English is often used in multilingual evaluation to prompt language models (LMs), mainly to overcome the lack of instruction tuning data in other languages. In this position paper, we lay out two roles of English in multilingual LM evaluations: as an interface and as a natural language. We argue that these roles have different goals: task performance versus language understanding. This discrepancy is highlighted with examples from datasets and evaluation setups. Numerous works explicitly use English as an interface to boost task performance. We recommend moving away from this imprecise method and instead focusing on furthering language understanding.
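The distinction drawn in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the paper: the `Prompt` class, the helper names, the sentiment task, and the Dutch template are all hypothetical. It builds two prompts for the same Dutch review, one with an English instruction wrapped around the Dutch text (English as an interface) and one written entirely in Dutch (the target language as a natural language), and flags when the instruction and content languages differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    """Hypothetical container for an evaluation prompt."""
    instruction_lang: str  # language of the instruction wrapper
    content_lang: str      # language of the text being evaluated
    text: str


def english_interface_prompt(review: str, content_lang: str) -> Prompt:
    """English instruction wrapped around target-language content."""
    text = (
        "Is the sentiment of this review positive or negative?\n"
        f"Review: {review}\n"
        "Answer:"
    )
    return Prompt("en", content_lang, text)


def native_prompt(review: str, content_lang: str, template: str) -> Prompt:
    """Instruction and content in the same target language.

    `template` is an assumed, hand-written instruction in the target
    language containing a `{review}` placeholder.
    """
    return Prompt(content_lang, content_lang, template.format(review=review))


def mixes_languages(prompt: Prompt) -> bool:
    """True when the instruction language differs from the content language."""
    return prompt.instruction_lang != prompt.content_lang


# Assumed Dutch instruction template (illustrative, not from any benchmark).
DUTCH_TEMPLATE = (
    "Is het sentiment van deze recensie positief of negatief?\n"
    "Recensie: {review}\n"
    "Antwoord:"
)

review = "Het eten was heerlijk."
interface = english_interface_prompt(review, "nl")
native = native_prompt(review, "nl", DUTCH_TEMPLATE)

print(mixes_languages(interface))  # English instruction, Dutch content
print(mixes_languages(native))     # Dutch instruction, Dutch content
```

Recording both languages on every prompt makes it explicit which setup a reported score comes from, so results obtained with an English wrapper are not read as evidence about understanding of the target language.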

Taxonomy

Topics: Second Language Learning and Teaching

Methods: Focus