Danoliteracy of Generative Large Language Models

Søren Vejlgaard Holm; Lars Kai Hansen; Martin Carsten Nielsen

arXiv:2410.22839·cs.CL·March 5, 2025

TL;DR

This paper introduces a benchmark that evaluates Danish language and cultural understanding in large language models across eight scenarios. The resulting ranking correlates with human feedback at about 0.8, and a single underlying factor explains 95% of performance variance across scenarios.

Contribution

The paper presents the first Danish language benchmark for Generative Large Language Models (GLLMs), establishes a robust evaluation method, and analyzes how consistently models adapt to the language.

Findings

01 GPT-4 and Claude Opus achieve the highest rankings

02 The benchmark ranking correlates with human feedback at about 0.8

03 A strong underlying factor explains 95% of performance variance

Abstract

The language technology moonshot moment of Generative Large Language Models (GLLMs) was not limited to English: These models brought a surge of technological applications, investments, and hype to low-resource languages as well. However, the capabilities of these models in languages such as Danish were, until recently, difficult to verify beyond qualitative demonstrations due to a lack of applicable evaluation corpora. We present a GLLM benchmark to evaluate Danoliteracy, a measure of Danish language and cultural competency across eight diverse scenarios such as Danish citizenship tests and abstractive social media question answering. This limited-size benchmark was found to produce a robust ranking that correlates to human feedback at $\rho \sim 0.8$ with GPT-4 and Claude Opus models achieving the highest rankings. Analyzing these model results across scenarios, we find one…
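The two statistics quoted above, a rank correlation with human feedback and the share of variance explained by one factor, can be illustrated with a minimal sketch. It rests on assumptions not confirmed by this page: that ρ denotes a Spearman-style rank correlation, and that the single factor is the first principal component of a models-by-scenarios score matrix. The function names `spearman_rho` and `first_factor_share` and all data are hypothetical and synthetic; none of the numbers are results from the paper.

```python
import numpy as np


def spearman_rho(x, y):
    """Spearman rank correlation for two score lists without ties."""
    # argsort of argsort turns raw scores into 0-based ranks.
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    # Pearson correlation of the ranks is the Spearman coefficient.
    return float(np.corrcoef(rx, ry)[0, 1])


def first_factor_share(scores):
    """Share of total variance captured by the first principal component.

    scores: array of shape (n_models, n_scenarios).
    """
    scores = np.asarray(scores, dtype=float)
    centered = scores - scores.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return float(variances[0] / variances.sum())


# Synthetic illustration only: 12 hypothetical models, 8 hypothetical scenarios.
rng = np.random.default_rng(0)
ability = rng.normal(size=12)                 # one latent skill per model
loadings = rng.uniform(0.5, 1.5, size=8)      # how strongly each scenario reflects it
scores = np.outer(ability, loadings) + 0.1 * rng.normal(size=(12, 8))

benchmark_index = scores.mean(axis=1)         # aggregate benchmark score per model
human_feedback = ability + 0.5 * rng.normal(size=12)  # noisy stand-in for human judgments

print("rank correlation:", spearman_rho(benchmark_index, human_feedback))
print("first factor share:", first_factor_share(scores))
```

In this toy setup one latent ability generates all scenario scores, so the first component captures most of the variance by construction. On real benchmark results the same two calls would show whether a single factor dominates in the same way.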
