This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish
Łukasz Augustyniak, Kamil Tagowski, Albert Sawczyn, Denis Janiak, Roman Bartusiak, Adrian Szymczak, Marcin Wątroba, Arkadiusz Janz, Piotr Szymański, Mikołaj Morzy, Tomasz Kajdanowicz, Maciej Piasecki

TL;DR
This paper introduces LEPISZCZE, a comprehensive and flexible benchmark for Polish NLP, addressing the gap in standardized evaluation tools for low-resource languages and providing a blueprint for similar efforts.
Contribution
The paper presents LEPISZCZE, a new extensive benchmark for Polish NLP, including design principles, dataset integration, and initial experimental results, serving as a model for other low-resource languages.
Findings
LEPISZCZE includes 13 experiments with recent Polish language models.
The benchmark incorporates five existing and eight novel datasets.
Insights from creating LEPISZCZE can guide similar benchmarks for other languages.
Abstract
The availability of compute and data to train larger and larger language models increases the demand for robust methods of benchmarking the true progress of LM training. Recent years witnessed significant progress in standardized benchmarking for English. Benchmarks such as GLUE, SuperGLUE, or KILT have become de facto standard tools to compare large language models. Following the trend to replicate GLUE for other languages, the KLEJ benchmark has been released for Polish. In this paper, we evaluate the progress in benchmarking for low-resourced languages. We note that only a handful of languages have such comprehensive benchmarks. We also note the gap in the number of tasks being evaluated by benchmarks for resource-rich English/Chinese and the rest of the world. In this paper, we introduce LEPISZCZE (the Polish word for glew, the Middle English predecessor of glue), a new,…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Test
