Language Ideologies in a Multilingual Society: An LLM-based Analysis of Luxembourgish News Comments
Emilia Milano, Alistair Plum, Yves Scherrer, Christoph Purschke

TL;DR
This paper investigates the use of large language models to detect language ideologies in Luxembourgish news comments, highlighting challenges and potential in multilingual, low-resource settings.
Contribution
The paper evaluates LLM performance on ideological annotation of Luxembourgish user comments and explores whether translation improves detection accuracy.
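One common way to quantify how closely model labels replicate human annotations is a chance-corrected agreement score. The sketch below computes Cohen's kappa in plain Python. It is an illustration only: the summary does not state which metric the authors report, and the function name `cohen_kappa` and the labels `"A"` and `"B"` are hypothetical placeholders, not the paper's ideological categories.

```python
from collections import Counter


def cohen_kappa(human_labels, model_labels):
    """Chance-corrected agreement between two label sequences.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(human_labels) != len(model_labels) or not human_labels:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(human_labels)

    # Observed agreement: share of comments where both annotators agree.
    observed = sum(h == m for h, m in zip(human_labels, model_labels)) / n

    # Expected agreement under independence, from each annotator's label frequencies.
    human_counts = Counter(human_labels)
    model_counts = Counter(model_labels)
    expected = sum(human_counts[c] * model_counts[c] for c in human_counts) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Placeholder labels; the real category names are not given in this summary.
human = ["A", "A", "B", "B"]
model = ["A", "A", "B", "A"]
kappa = cohen_kappa(human, model)
```

Running the same function once per prompt condition (or once on original and once on translated comments) would give directly comparable scores, assuming the human labels are held fixed.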
Findings
LLMs can identify language-ideological content, but their performance on this task is not yet optimal.
Translation to high-resource languages can improve LLM performance.
LLMs show promise as practical tools despite current limitations.
Abstract
Detecting language ideologies is a valuable yet complex task for understanding how identities are constructed through discourse. In Luxembourg's multicultural and multilingual society, language ideologies reflect more than simple preferences: they carry deep cultural and social meanings, shaping identities and social belonging. Following recent developments in applying Natural Language Processing tools to linguistics and social science, this paper explores the potential of large language models to assist in the detection of language ideologies. We manually annotate a corpus of user comments in Luxembourgish with predefined ideological categories and then evaluate the performance of large language models under varying prompt conditions to assess their ability to replicate these human annotations. Since Luxembourgish is a small language and poorly represented in the LLMs' training data,…
